Polymorphisms in ERAP1 and ERAP2 Genes Are Associated With Tuberculosis in the Han Chinese

Single nucleotide polymorphisms (SNPs) in the endoplasmic reticulum aminopeptidase (ERAP1 and ERAP2) genes are associated with the pathogenesis of bacterial and viral infections. To search for the variations in the ERAP1 and ERAP2 genes associated with tuberculosis (TB), 449 TB cases and 435 healthy individuals of the Han population in the Yunnan province of China were included in the present study. Eleven SNPs of ERAPs were genotyped using the SNaPshot SNP assay. Allelic, genotypic, and haplotypic association analyses were performed between the TB and control groups. Furthermore, stratification analyses among pulmonary TB (PTB), extrapulmonary TB (EPTB), and healthy controls; and initial treatment TB (ITTB), retreatment TB (RTB), and healthy controls were also performed. The TT genotype of rs26618 in ERAP1 exhibited a protective factor for TB, compared with the role of the CC/CT genotype (P = 0.003; OR = 1.490, 95% CI: 1.140–1.940). In ERAP2, the frequency of the G allele of rs2549782 was higher in the case group than in the control group (0.491 vs. 0.417, P = 0.002, OR = 1.350, 95% CI: 1.118–1.631), and the TT genotype exhibited a protective factor for TB, compared with the role of the GG/GT genotype (P = 0.001; OR = 1.650, 95% CI: 1.230–2.220). The frequency of the C allele of rs1056893 was higher in the case group than in the control group (0.468 vs. 0.394, P = 0.002, OR = 1.350, 95% CI: 1.118–1.631), and the genotype exhibited a difference in the log-additive model (P = 0.002; OR = 1.350, 95% CI: 1.120–1.630). The frequencies of the haplotype rs27037-rs27044-s30187-rs26618-rs26653-rs3734016-GCCCGC in ERAP1 (0.290 vs. 0.240, P-adj = 0.028, OR = 1.320, 95% CI: 1.063–1.638) and the haplotypes rs2549782-rs2248374-rs2287988-rs1056893-GTAGC in ERAP2 (0.446 vs. 0.348, P-adj = 4.80E-05, OR = 1.510, 95% CI: 1.246–1.829) was higher in the TB groups, while the frequencies of the haplotypes rs2549782-rs2248374-rs2287988-rs1056893-TAGAT (0.478 vs. 0.539, P-adj = 0.020, OR = 0.782, 95% CI: 0.649–0.943) were lower in the TB groups. The allelic and genotypic associations were also investigated in the subsequent stratification between the PTB, EPTB and control groups as well as between the ITTB, RTB, and control groups. In conclusion, variations in ERAP1 and ERAP2 genes were identified to be associated with TB in the Han Chinese population.

Single nucleotide polymorphisms (SNPs) in the endoplasmic reticulum aminopeptidase (ERAP1 and ERAP2) genes are associated with the pathogenesis of bacterial and viral infections. To search for the variations in the ERAP1 and ERAP2 genes associated with tuberculosis (TB), 449 TB cases and 435 healthy individuals of the Han population in the Yunnan province of China were included in the present study. Eleven SNPs of ERAPs were genotyped using the SNaPshot SNP assay. Allelic, genotypic, and haplotypic association analyses were performed between the TB and control groups. Furthermore, stratification analyses among pulmonary TB (PTB), extrapulmonary TB (EPTB), and healthy controls; and initial treatment TB (ITTB), retreatment TB (RTB), and healthy controls were also performed. The TT genotype of rs26618 in ERAP1 exhibited a protective factor for TB, compared with the role of the CC/CT genotype (P = 0.003; OR = 1.490, 95% CI: 1.140-1.940). In ERAP2, the frequency of the G allele of rs2549782 was higher in the case group than in the control group (0.491 vs. 0.417, P = 0.002, OR = 1.350, 95% CI: 1.118-1.631), and the TT genotype exhibited a protective factor for TB, compared with the role of the GG/GT genotype (P = 0.001; OR = 1.650, 95% CI: 1.230-2.220). The frequency of the C allele of rs1056893 was higher in the case group than in the control group (0.468 vs. 0.394, P = 0.002, OR = 1.350, 95% CI: 1.118-1.631), and the genotype exhibited a difference in the log-additive model (P = 0.002; OR = 1.350, 95% CI: 1.120-1.630). The frequencies of the haplotype rs27037-rs27044-s30187-rs26618-rs26653-rs3734016-GCCCGC in ERAP1 (0.290 vs. 0.240, P-adj = 0.028, OR = 1.320, 95% CI: 1.063-1.638) and the haplotypes rs2549782-rs2248374-rs2287988-rs1056893-GTAGC in ERAP2 (0.446 vs. 0.348, P-adj = 4.80E-05, OR = 1.510, 95% CI: 1.246-1.829) was higher in the TB groups, while the frequencies of the haplotypes rs2549782-rs2248374-INTRODUCTION Tuberculosis (TB) remains a serious public health problem, being one of the top 10 causes of mortality worldwide as well as the leading cause of mortality from a single infectious agent. In 2018, about 10 million people developed TB, with the disease claiming approximately 1.9 million lives, while India, China, and six other counties accounted for two thirds of the global TB cases (WHO, 2019). TB is caused due to infection by Mycobacterium tuberculosis (M. tb), which mostly affects the lungs and some extrapulmonary sites and is referred to as pulmonary TB (PTB) and extrapulmonary TB (EPTB), respectively (WHO, 2019). With a high level of exposure, about 80-90% of the population is expected to be infected by M. tb, with 5% of the infected individuals developing clinical TB within 2 years of infection, and the remainder developing a latent infection (Abel et al., 2014). Only a small fraction (approximately 5-10%) of the individuals with a latent infection will develop a clinical case of TB throughout their lifetime (Abel et al., 2014;Pai et al., 2016;Dallmann-Sauer et al., 2018).
The mechanism of TB pathogenesis is not clear; however, several studies have suggested that genetic changes, especially in the immune-related genes, may be associated with the development of TB (Fol et al., 2015;van Tong et al., 2017). Immediately after an M. tb infection, the host innate immune system initiates the first line of defense. Several genes are involved in this first stage, and polymorphisms in the genes have been reported to be associated with TB, including genes for the Toll-like receptors, the mannose-binding lectin, CD14, surfactant proteins, the dendritic cell-specific ICAM-3-grabbing non-integrin, and the mannose receptor (Fol et al., 2015;van Tong et al., 2017). Subsequently, the adaptive immune response is activated. The genes involved in antigen presentation, induction and function of cytokines/chemokines, and associated receptors have been suggested to play important roles in controlling an M. tb infection (Fol et al., 2015;Mao et al., 2015). In recent genome-wide association studies (GWAS) of TB risk in the Han Chinese population, new risk loci in ASAP1, ESRRB, and TGM6 genes have been identified (Chen et al., 2016a;Zheng et al., 2018).
During the process of antigen presentation by the major histocompatibility complex (MHC) class I, the endoplasmic reticulum aminopeptidases (ERAP) trim peptides to the optimal size for MHC-I binding, to ensure the correct assembly of the peptide-loading complex (PLC), which plays an important role in inducing the correct immune response against pathogens (Chang et al., 2005;Chen et al., 2016b). ERAPs, including ERAP1 and ERAP2, were initially identified as homologs of the human placental leucine aminopeptidase or the insulin-regulated aminopeptidase, and they belong to the oxytocinase subfamily of the M1 zinc metallopeptidase family (Serwold et al., 2002). The human ERAP genes are located on chromosome 5q15 in opposite orientation; ERAP1 is 47,379 bp in length and comprises 20 exons, and ERAP2 gene is 41,438 bp in length and comprises 19 exons (Hattori et al., 2001;Hattori and Tsujimoto, 2013). As the human leukocyte antigen (HLA) class I gene, ERAP also presents diverse polymorphisms, which have been shown to affect several infectious disease processes. For example, the rs2549782 of the ERAP2 gene is associated with the human immunodeficiency virus (HIV), and the TT genotype is over-represented in the HIV-1-exposed seronegative group (Cagliani et al., 2010). The rs2549782 was identified as a risk factor in a Spanish population exposed to HIV-1 infection by intravenous drug administration (Biasin et al., 2013). In our previous study, we found that the A allele of rs2248374 (located in the splice/intron region), compared with the G allele, increased the risk for developing a hepatitis C virus (HCV) chronic infection (Liu et al., 2017).
Some studies have shown that HLA alleles (HLA-A * 02:01, HLA-A * 11:01, and HLA-B * 57) increase the susceptibility or provide resistance to TB; however, the influence of ERAP alleles on TB has not been examined (Geluk et al., 2000;Liu et al., 2016;Petros et al., 2017). In the present study, we genotyped single nucleotide polymorphisms in the ERAP1 and ERAP2 genes in TB patients to determine if the ERAP gene variants are associated with M. tb infection and TB progression.

Subjects
The present study included a total of 449 cases from the Third Hospital of Kunming admitted between 2018 and 2019 with latent TB and 435 cases of unrelated healthy Chinese Han individuals as controls from the Yunnan province (southwest China). The cases were divided into pulmonary TB (PTB) and extrapulmonary TB (EPTB) groups, with EPTB defined as TB affecting the extrapulmonary sites, such as the lymph nodes, abdomen, urinary tract, skin, joints, bones, and meninges, exclusively or in combination with PTB. Additionally, the patients with TB were stratified into initial treated TB (ITTB) and re-treated TB (RTB) according to their treatment history. The ITTB were defined as patients who never received anti-TB drug therapy, or those who underwent anti-TB drug therapy but did not complete the treatment, or those who received unstandardized treatment for no more than 1 month, while the RTB were defined as patients who underwent unstandardized treatment within 1 month of diagnosis, or those who failed to respond to initial treatment and reported recurrent TB (WS 196-2017)  (1) Mycobacterium tuberculosis infection confirmed by bacteriological assessment such as the tuberculin skin test, an interferon-γ release assay, and a culture-positive sputum smear; and (2) clinical symptoms and chest X-ray consistent with TB. Patients with immunodeficiency, autoimmune diseases, or other acute or chronic infections were excluded from this study. All controls had a negative TB disease history and had no acute or chronic pulmonary disease, bacterial or viral infection, or other immune-mediated disorders.
Approximately 2-5 mL of the peripheral blood was drawn from each participant, and genomic DNA was extracted from the peripheral lymphocytes using the QIAamp Blood Mini Kit (Qiagen, Hilden, Germany). Genotyping of the total 11 SNPs was performed using the SNaPshot SNP assay (Thermo Fisher Scientific, Waltham, MA, United States) according to a previous study (Li et al., 2020). Briefly, multiplex polymerase chain reaction (PCR) of the SNPs was performed followed by a single-nucleotide primer extension assay test. The PCR application primers and the extension primers are shown in Supplementary Tables 1, 2. The 11 SNPs were divided into three groups and detected by capillary electrophoresis (Supplementary Figures 1-3), followed by analysis using GeneMapper TM 4.0 software (Applied Biosystems, Foster City, CA, United States). For quality control, 5% of the samples from the case and control groups were genotyped twice and denoted with unique serial numbers of analysis, and the reproducibility was found to be 100%. Furthermore, we sequenced each SNP in the pre-experiment, and selected all the ambiguous samples for sequencing to confirm the result (Supplementary Figure 4).

Statistical Analysis
The Hardy-Weinberg equilibrium (HWE) was evaluated to determine the representativeness of the study population. Age and sex distribution comparisons between the case and control groups were determined by the t-test and the Chi-squared (χ 2 ) test using SPSS (version 19.0; SPSS Inc., Chicago, IL, United States). Allelic and genotypic frequencies of these SNPs were compared between different groups using the Chi-squared test, and odds ratios (ORs) and associated 95% confidence intervals (CIs) were calculated using the SHEsis software (Shi and He, 2005;Li et al., 2009). Five inheritance models (codominant, dominant, recessive, overdominant, and log-additive) were analyzed. Simultaneously, the Akaike information criterion (AIC) and Bayesian information criterion (BIC) values were calculated to determine the inheritance model with the best fit, i.e., the model with the smallest AIC and BIC values (Sole et al., 2006). Additionally, linkage disequilibrium (LD) was calculated and a D' value greater than 0.80 was considered to indicate LD. The haplotypes among these SNPs were analyzed using SHEsis software (Shi and He, 2005;Li et al., 2009). P < 0.05 was considered statistically significant, with Bonferroni's corrections for multiple testing calculated as 0.05/n. Table 1 shows the clinical data of the subjects in the present study. There was no significant difference in age and sex between the TB and control groups (P > 0.05). The TB group was further divided into the PTB and EPTB groups, and the average age of the EPTB group was higher than that of the control group (39.75 ± 15.86 vs. 45.32 ± 8.99, P = 0.002). In the TB group, 278 and 171 cases belonged to the ITTB and RTB groups, respectively. The average age of the ITTB group was lower than that of the control group (43.12 ± 16.31 vs. 45.32 ± 8.99, P = 0.039), and the male to female ratio in the RTB group was 1.59:1, while that in the ITTB group was 1.07:1 (P = 0.047).

Comparison of Allelic and Genotypic Frequencies of ERAP1 and ERAP2 Between TB Patients and Controls
All 11 SNPs were in HWE in both the control and TB case groups (P > 0.05) (Supplementary Table 3). The allelic and genotypic frequencies of these SNPs are presented in Table 2.
The rs2549782 polymorphism in ERAP2 differed between the TB and healthy control groups at both the allelic and genotypic levels. The frequency of the G allele was higher in the case group than in the control group (0.491 vs. 0.417, P = 0.002, OR = 1.348, 95% CI: 1.117-1.626). The rs1056893 polymorphism in ERAP2 also differed between the TB and healthy control groups at the allelic level, but showed a trend difference only at the genotypic level. The frequency of the C allele was higher in the case group than in the control group (0.468 vs. 0.394, P = 0.002, OR = 1.350, 95% CI: 1.118-1.631). The other three SNPs in EARP2 including rs2548538, rs2248374, and rs2287988, showed a trend of having different allelic frequencies, but they were not significantly different after the Bonferroni correction. Similarly, the rs26618 and rs3734016 SNPs in ERAP1 showed different allelic and genotypic frequencies, but they were not significantly different after the Bonferroni correction.

Inheritance Model Analysis
To evaluate the genotypic association of the 11 SNPs with TB, an inheritance analysis was performed ( Table 3). The comparison between the TB and control groups indicated that rs26618 and rs2549782 exhibited different frequencies in the dominant model, the best-fit inheritance model. The TT genotype of rs26618 was a protective factor for TB, compared with the role of the CC/CT genotype (P = 0.003; OR = 1.490, 95% CI: 1.140-1.940). The TT genotype of rs2549782 was also a protective factor for TB, compared with the role of the GG/GT genotype (P = 0.001; OR = 1.650, 95% CI: 1.230-2.220). Four SNPs rs1056893, rs2548538, rs2248374, and rs2287988, exhibited different trends in the log-additive model, the best-fit inheritance model. However, the frequency difference was significant only in rs1056893 after the Bonferroni correction (P = 0.002; OR = 1.350, 95% CI: 1.120-1.630).

LD and Haplotype Analysis of SNPs in ERAP1 and ERAP2
The LD analysis showed that all six SNPs in ERAP1 and five SNPs in ERAP2 were in LD (D' > 0.80) (Figure 1). Subsequently, the haplotypes of rs27037-rs27044-s30187-rs26618-rs26653-rs3734016 and rs2549782-rs2248374-rs2287988-rs1056893 were constructed. The distribution of these haplotypes (with a frequency of more than 3%) was compared in a pairwise manner between the TB and control groups (

Stratification Analysis of the Association Between TB and ERAP Polymorphisms
The stratification analysis between the PTB, EPTB, and control groups was performed at first, and the P-value, OR, and 95% CI of the pair-wise comparison between the EPTB and the control group were calculated on the basis of the logistic regression model adjusted by age ( Table 5 and Supplementary Table 4). The rs2549782 polymorphism in ERAP2 differed between the PTB and control groups at both the allelic and genotypic levels; however, it did not differ between the EPTB and control groups. The frequency of the G allele was higher in the PTB than in the control group (0.494 vs. 0.417, P = 0.003, OR = 1.363, 95% CI: 1.110-1.672), and the GT/GG genotype was a resistant factor for PTB, compared with the role of the TT genotype (P = 4.00E-04; OR = 1.790, 95% CI: 1.290-2.480) in the dominant model. Genotypic differences were observed in the log-additive model of rs2548538 between the EPTB and control groups (P = 0.001, OR = 1.650, 95% CI: 1.220-2.220), but not between the PTB and control groups. Next, stratification analysis between the ITTB and RTB groups was performed ( Table 6 and Supplementary Table 5). The P-values, ORs, and 95% CIs of the pair-wise comparisons between ITTB and control, and between RTB and control were calculated on the basis of the logistic regression model adjusted by age and sex. Notably, all five SNPs in ERAP2 exhibited differences both at allelic and genotypic levels between the RTB and control groups; however, only the CT genotype of rs26618 in ERAP1 was a protective factor compared with the role of the CC/TT genotype in the Overdominant Model between the ITTB and control groups (0.489 vs. 0.379, P = 0.004, OR = 1.570, 95% CI: 1.160-2.130). The minor alleles of all five SNPs were found to be the risk factors in the RTB group compared with those in the control group, and the genotypic differences in the logadditive model of all five SNPs were significant. In addition, the frequencies of the haplotype rs2549782-rs2248374-rs2287988-rs1056893-GTAGC were higher in the RTB group than in the control group (0.500 vs. 0.348, P = 2.00E-06, OR = 1.871, 95% CI: 1.452-2.412), while the frequencies of the haplotype rs2549782-rs2248374-rs2287988-rs1056893-TAGAT were lower in the RTB group than in the control group (0.418 vs. 0.539, P = 3.00E-04, OR = 0.614, 95% CI: 0.477-0.791).
Four SNPs, namely rs2548538, rs2248374, rs2549782, and rs2287988, exhibited differences between the RTB and ITTB groups at either the allelic or genotype level. For rs2548538, the T allele frequencies were higher in the RTB group than in the ITTB group (0.547 vs. 0.473, P = 0.001, OR = 1.623, 95% CI: 1.234-2.135), and the genotypic differences were also observed in the log-additive model (P = 4.00E-04, OR = 1.650, 95% CI: 1.240-2.190). For rs2248374, the A allele frequencies were higher in the RTB group than in the ITTB group (0.529 vs. 0.433, P = 0.003, OR = 1.520, 95% CI: 1.157-1.997), and the genotypic difference was also observed in the log-additive model (P = 0.002, The statistical significant threshold was set at P < 0.0045 after Bonferroni correction. The significant association after Bonerroni correction were indicated in bold.
Frontiers in Genetics | www.frontiersin.org The statistical significant threshold was set at P < 0.0045 after Bonferroni correction. AIC, Akaike information criterion; BIC, Bayesian information criterion.

DISCUSSION
Endoplasmic reticulum aminopeptidases play an important role in shaping the HLA class I peptidome by trimming peptides to the optimal size for MHC-class I-mediated antigen presentation and for preparing the immune system to differentiate between self-derived and foreign antigens. Since the antigen presentation system plays a major role in the interaction between pathogens and the host, ERAPs have been investigated as potential targets and modulators of infectious diseases (Yao et al., 2019).
Our previous study indicated that ERAP1 gene variations are associated with chronic HCV (Liu et al., 2017). In the present study, we analyzed the association of 11 SNPs in ERAP1 and ERAP2 genes, and our results, to our knowledge, report for the first time that variations in rs26618, rs2549782, and rs1056893 are associated with TB risk. In addition, rs2549782, rs2248374, rs2287988, and rs1056893 were found to be associated with PTB, EPTB, or re-treated TB. The ERAP1 roles against virus infection or in autoimmune disease mechanisms through peptide trimming has been investigated in several studies. Aberrant peptide trimming by ERAP1 and a further impaired peptide presentation by HLA-B27 may be involved in the pathogenesis of ankylosing spondylitis (Evans et al., 2011). Comparison of the T cell response between the ERAP1-deficient and wild-type mice infected with the lymphocytic choriomeningitis virus (LCMV) revealed that the magnitude of T cell response to different LCMV epitopes followed an immunodominance hierarchy and was profoundly different between the two groups of mice (York et al., 2006). In chronic HCV infection, the C alleles of rs26618 and the haplotype rs27044-rs30187-rs26618-rs26653 C-C-C-G exhibited the risk association (Liu et al., 2017). In the present study, the C allele of rs26618 in ERAP1 showed a trend of risk association with TB before Bonferroni correction, and the CC/TT genotype showed a risk factor compared with the role of the CT genotype in the dominant model. In addition, the haplotype of rs27037-rs27044-s30187-rs26618-rs26653-rs3734016-GCCCGC exhibited a risk factor for TB. Several studies have demonstrated that SNPs located at critical structural positions in ERAPs may change the conformation of ERAP1 and ERAP2 (Nguyen et al., 2011;Evnouchidou et al., 2012;Hattori and Tsujimoto, 2013). Thus, we deduced that rs26618 may be one of the risk factors for TB infection.
Endoplasmic reticulum aminopeptidases 2 also plays a role in bacterial and viral infections. In 2010, Cagliani et al. studied the ERAPs susceptibility to HIV-1 infection and found that the polymorphisms in ERAP1 and ERAP2 genes may have been maintained through long-standing balancing selection, and that ERAP2 conferred resistance to an HIV infection likely via the presentation of a distinctive peptide repertoire to CD8 + T cells (Cagliani et al., 2010). The rs2549782, rs2548538, rs2248374, rs2287988, and rs1056893 in ERAP2 were in high LD, haplotype A (GTAGC) and haplotype B (TAGAT) have been shown to be prevalent in most populations worldwide (Vanhille et al., 2013). Haplotype B (TAGAT) has been associated with a splicesite variant caused by the A allele of rs2248374, resulting in nonsense-mediated RNA decay that precludes protein expression (Vanhille et al., 2013). Besides, the G allele of rs2549782 causes a neoconservative amino acid substitution, resulting in alterations in the ERAP2 enzyme activity and substrate specificity (Evnouchidou et al., 2012). In the present study, all five SNPs in ERAP2 showed an association with TB in total or in further stratification analysis. The G allele of rs2549782 and the C allele of rs1056893 showed a risk factor for TB infection, and the GT/GG genotype of rs2549782 showed a higher frequency than that of the TT genotype as well as the rs2549782 genotype in the logadditive model in TB patients compared to that in the control. In addition, the higher frequencies of the haplotype A (GTAGC) were investigated between the TB and control groups. In further stratification analysis, the genotypic association was observed in rs2549782 between the PTB and control groups, and in rs2548538 and rs1056893 between the EPTB and control groups. A similar association has also been observed with other infectious diseases such as AIDS or chronic HCV. We found that the A allele of rs2248374 increased the risk for chronic HCV compared with the G allele; Cagliani et al., found that the rs2549782 was associated with HIV in Italians and the TT genotype was over-represented in this HIV-1-exposed seronegative group; Biasin et al., found that rs2549782 may be a risk factor for intravenous drug users exposed to HIV-1 infection in the Spanish population (Cagliani et al., 2010;Biasin et al., 2013;Liu et al., 2017). All these studies suggest that single SNP or SNP haplotypes in ERAP2 may be associated with TB progression.
One of the major problems associated with TB is its drug resistance. In the present study, to provide insights into TB therapy, we divided the patients with TB into initially treated and re-treated groups according to their treatment classification (China National Health and Family Planning Commission of the People's Republic of China, 2018a). Interestingly, the minor alleles of all five SNPs in ERAP2 were found to be the risk factors in the RTB group compared with the control group, and the genotypic differences in the log-additive model of all five SNPs were significant. Furthermore, the differences between the RTB and ITTB groups were investigated in rs2548538 and rs2248374 at the allelic and genotype levels as well as in rs2549782 and rs2287988 at the genotypic level. In addition, the higher The statistical significant threshold was set at P < 0.0045 after Bonferroni correction. *The P-value, OR and 95% CIs of pairs comparison between PTB and control and EPTB and control were calculated on the basis of the logistic regression model adjusted by age. The statistical significant threshold was set at P < 0.0045 after Bonferroni correction. *The P-value, OR and 95% CIs of pairs comparison between ITTB and control were calculated on the basis of the logistic regression model adjusted by age. **The P-value, OR and 95% CIs of pairs comparison between ITTB and RTB were calculated on the basis of the logistic regression model adjusted by gender.