Functional Polymorphism in the NFE2L2 Gene Associated With Tuberculosis Susceptibility

Background Nuclear transcription factor erythroid 2 p45-related factor 2 (Nrf2), encoded by NFE2L2, functions as a key transcription factor and regulates expression of antioxidant genes. Our study aimed to investigate the association of single nucleotide polymorphisms of NFE2L2 with tuberculosis (TB) and latent tuberculosis infection (LTBI) and the underlying causal mechanisms. Methods A total of 1950 unrelated Chinese Han participants were included in our two independent study groups. Five tag polymorphisms were selected and genotyped. The functional effects of the rs13005431 polymorphism were confirmed by dual-luciferase reporter assays and mRNA level comparisons. Results rs13005431_C and rs2364723_G were associated with increased TB susceptibility (P = 0.010 and P = 0.041) after adjustment for confounding factors. rs6726395_A was associated with increased risk of active TB (P = 0.035) in a comparison with the LTBI group. The frequency of haplotype rs10497511-rs13005431 AC was higher in the TB group (P = 0.013), while the frequency of haplotype AT was higher in the healthy control group (P = 0.025). The luciferase activity of a plasmid with the rs13005431C-promoter was significantly lower than that of the rs13005431T-promoter. In addition, GM-CSF-activated neutrophils with the CC/TC genotypes showed a decreased level of NFE2L2 mRNA compared with those with the rs13005431 TT genotype. Conclusions Our study suggests that allele C of rs13005431 might increase the susceptibility to TB by down-regulating the transcriptional activity of NFE2L2.


INTRODUCTION
Tuberculosis (TB) is a disease caused by Mycobacterium tuberculosis (M.TB), which infects approximately one-third of the population worldwide, and remains one of the most important public health problems (1). In 2018, there were an estimated 10 million cases and 1.3 million deaths from tuberculosis worldwide (2). The aim of the End TB Strategy is to achieve a 95% reduction in TB deaths and a 90% reduction in the TB incidence rate by 2035 (2). Thus, great efforts are urgently needed to strengthen the prevention, diagnosis and treatment of TB.
As is well known, encountering M.TB leads to several possible outcomes, including M.TB clearance, primary TB, latent TB infection (LTBI) and active TB. It has been suggested that 5-10% of LTBI individuals will progress to active TB during their lifetime (3). Management of LTBI will be required to implement the End TB Strategy (4). Screening and preventive treatment of LTBI have been recommended in high-risk populations (5). There are two available methods for LTBI diagnosis: the tuberculin skin test (TST) and interferon-gamma release assays (IGRAs) (including the Quantiferon GIT Assay and T-SPOT.TB). Although guidelines suggest that either the TST or IGRAs may be used to diagnose LTBI (5), IGRAs have been shown to have higher specificity and sensitivity (6), with no interference from bacille Calmette-Guérin (BCG) vaccination and much less interference from environmental mycobacteria than the TST.
Previous studies have suggested that TB is associated with oxidative and antioxidant responses (7,8). Nuclear factor erythroid 2 (NF-E2)-related factor 2 (Nrf2), encoded by the NFE2L2 gene, functions as a critical transcription factor in the anti-oxidation process and usually is repressed by Keap1. Nrf2 enters the nucleus upon activation by oxidative stress (9,10), binds to the GCTGAGTCA site of the antioxidant response element in the promoter of antioxidant phase II genes and promotes the expression of these genes (11,12). Several studies have revealed a relationship between Nrf2 and M.TB infection. It was reported that Nrf2 participated in autophagy (13), the antioxidant response in M.TB-infected guinea pigs (14), and the reduction of granulomas in Nrf2-deficient mice when infected with M.TB (15).
Recently, oxidant/antioxidant related genes such as FMO2 and CYBA have been associated with TB susceptibility (16,17). However, there have been no reports of an association between NFE2L2 polymorphisms and risk of LTBI and TB. Therefore, we carried out a discovery study and a replication study to determine the relationship of NFE2L2 variants with susceptibility to TB and LTBI in the Chinese Han population.
We also performed dual-luciferase reporter assays and compared NFE2L2 mRNA levels of neutrophils with different genotypes to validate the association findings.

MATERIALS AND METHODS
Study Population
In total, 1950 unrelated Chinese Han participants, consisting of 636 TB patients and 608 healthy controls in the discovery study as well as 301 TB patients, 201 LTBI subjects and 204 uninfected healthy controls (HC) in the replication study, were consecutively recruited from the West China Hospital between July 2013 and August 2017. For the discovery study, the diagnosis of TB was based upon the following criteria: histopathological evidence of TB disease and/or culture positivity for M.TB and/or smear positivity for M.TB in at least two separate specimens and/or radiological and clinical findings consistent with TB, with positive clinical response to anti-TB therapy (18). TB cases were divided into two subgroups: (1) pulmonary TB patients (PTB, pathological changes limited to the lung) and (2) extra-pulmonary TB patients (EPTB, pathological changes involving other tissues or organs, either alone or in combination with the lungs). The healthy controls, who had no history or evidence of TB on the basis of their symptoms and radiographic examination results, were enrolled during their routine health examination in the West China Hospital. For the replication study, both the uninfected HC group and LTBI participants were close contacts of PTB patients. LTBI was defined as adults (age > 18) with a documented positive IGRA result and negative radiological and clinical findings. HC individuals were defined as adults with negative IGRA, radiological and clinical findings. The inclusion criteria for TB patients enrolled in the replication study were the same as in the discovery study. Another 60 healthy volunteers were recruited for comparison of mRNA levels of neutrophils. All participants diagnosed with HIV infection, diabetes mellitus, autoimmune disorders or tumors, or treated with immunosuppressive drugs, were excluded.

Selection/Genotyping of NFE2L2 Gene SNPs
NFE2L2 is located on chromosome 2 (2q31) and has five exons and four introns. TagSNPs were selected from the region 3,000 base pairs upstream to 300 base pairs downstream of the NFE2L2 gene based on the Chinese Han Beijing data of the HapMap database (http://hapmap.ncbi.nlm.nih.gov, HapMap Data Rel 27 Phase II + III, Feb09, on NCBI B36 assembly, dbSNP b126) by Haploview software 4.2. Using criteria of minor allele frequency (MAF) ≥ 5% and the Tagger pairwise method (r² ≥0.8), five tagSNPs of NFE2L2 (rs10497511, rs2364723, rs13005431, rs6726395 and rs1962142), representing eleven SNPs with MAF ≥ 0.05 in the covered gene region, were selected for genotyping.

Plasmid Constructs
Polymerase chain reaction (PCR) was performed to amplify a 651 bp sequence surrounding rs13005431 from genotyped genomic DNA with the TT genotype of rs13005431. Primers were designed using Primer-BLAST (www.ncbi.nlm.nih.gov/tools/primer-blast/) and sequences for restriction sites NheI and XhoI were introduced (Supplementary Table 2). Both the gel-purified PCR product and the pGL3-promoter reporter vector (Promega, Madison, WI, USA) were digested by NheI and XhoI restriction enzymes (Takara Bio Inc., Kusatsu, Japan) at 37°C for 4 hours and then ligated by T4 DNA ligase (Takara Bio Inc.) at 16°C for 8 hours. After transformation into competent E. coli DH5α cells and extraction with a plasmid miniprep kit (TianGen, Beijing, China), the recombinant plasmid vector was verified by direct sequencing and named PGL3-rs13005431T-promoter (Supplementary Figure 1). A site-directed mutagenesis strategy was applied to obtain a second recombinant plasmid vector, PGL3-rs13005431C-promoter (see Supplementary Table 2 for primer sequences), which was confirmed by DNA sequencing.

Cell Culture and Luciferase Assays
HEK293T cells were plated into 96-well culture plates at a density of 3.2×10⁴ cells/well and cultured for 48 hours to reach 90% confluence at the time of transfection. The PGL3-promoter, PGL3-rs13005431T-promoter and PGL3-rs13005431C-promoter were transfected into the HEK293T cells together with PRL-CMV plasmid vectors, which acted as an internal reference, using Lipofectamine® 2000 Reagent (Invitrogen (Thermo Fisher Scientific), Waltham, MA USA). Dual-luciferase reporter assays (Promega, USA) were carried out 30 h after transfection on a GloMax™ 96 microplate luminometer. The transfection experiment was performed in triplicate. Results are presented as relative luciferase activity, obtained by dividing the firefly luciferase activity of each well by the Renilla luciferase activity. The relative luciferase activity of the recombinant plasmids was further normalized to that of the PGL3-promoter group.
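The two-step normalization described above can be illustrated with a minimal Python sketch. The function names and the triplicate readings are hypothetical and are not study data; the sketch only assumes that each well yields one firefly and one Renilla reading.

```python
from statistics import mean

def relative_activity(firefly, renilla):
    """Relative luciferase activity: firefly signal / Renilla signal, per well."""
    return [f / r for f, r in zip(firefly, renilla)]

def normalize_to_control(relative, control_relative):
    """Express each relative activity as a ratio to the control group's mean."""
    control_mean = mean(control_relative)
    return [x / control_mean for x in relative]

# Hypothetical triplicate readings in arbitrary luminescence units (not study data).
control = relative_activity([1000.0, 1100.0, 900.0], [100.0, 100.0, 100.0])
c_allele = relative_activity([500.0, 550.0, 450.0], [100.0, 100.0, 100.0])

normalized_control = normalize_to_control(control, control)
normalized_c = normalize_to_control(c_allele, control)
```

By construction, the control group has a mean normalized activity of 1, so values below 1 for a construct indicate lower reporter activity than the empty PGL3-promoter vector.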

mRNA Levels of NFE2L2 in Individuals With Different Genotypes of rs13005431
Peripheral venous blood (10 ml) was collected in heparin anticoagulant tubes from 60 healthy volunteers. Neutrophils were isolated using Ficoll density gradient centrifugation and confirmed by Wright staining and flow cytometry (CD11b, CD16). The cells were then cultured in 24-well culture plates at a density of 2×10⁶ cells/well for 3 hours, with or without GM-CSF activation (final concentration 1 ng/ml). The Trizol method was then used to extract the total RNA of the neutrophils. Reverse transcription was conducted using the PrimeScript™ RT Reagent Kit with gDNA Eraser (Takara Bio Inc.). Real-time PCR for NFE2L2 was then performed on an ABI7300 Sequence Detection System (Applied Biosystems (Thermo Fisher Scientific), Waltham, MA USA) using SYBR Premix Ex Taq II (Takara Bio Inc.), with B2M as the housekeeping gene (19). Primers for real-time PCR are shown in Supplementary Table 2.
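The text does not state how relative mRNA levels were derived from the real-time PCR data. One widely used approach is the 2^(-ΔΔCt) method, sketched below as an assumption rather than as the procedure actually applied here. The function name, the Ct values and the choice of calibrator sample are hypothetical.

```python
def relative_expression(ct_target, ct_reference, ct_target_cal, ct_reference_cal):
    """Relative expression by the 2^(-ddCt) method (assumed, not confirmed by the text).

    ct_target / ct_reference: Ct of the gene of interest (e.g. NFE2L2) and of the
    housekeeping gene (e.g. B2M) in the sample of interest.
    ct_target_cal / ct_reference_cal: the same two Ct values in the calibrator sample.
    Assumes roughly 100% amplification efficiency for both primer pairs.
    """
    delta_sample = ct_target - ct_reference
    delta_calibrator = ct_target_cal - ct_reference_cal
    return 2.0 ** -(delta_sample - delta_calibrator)

# Hypothetical Ct values: the target crosses threshold one cycle earlier in the
# sample than in the calibrator, with an unchanged housekeeping gene.
fold_change = relative_expression(24.0, 18.0, 25.0, 18.0)
```

Under these assumptions, a one-cycle decrease in ΔCt corresponds to a two-fold increase in relative expression.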

Statistical Analyses
Experimental data were analyzed using the Statistical Package for Social Sciences version 17.0 (SPSS, Chicago, IL, USA). Distributions of clinical characteristics between the control group and TB patients were evaluated by the chi-squared test or Fisher's exact test for categorical variables and Student's t-test for continuous variables. P values, odds ratios (OR) and 95% confidence intervals (95% CI) for the association between SNPs and TB susceptibility were calculated by binary or multinomial logistic regression under four genetic models (allelic, additive, dominant and recessive). Linkage disequilibrium (LD) analysis and haplotype block construction were performed with Haploview software 4.2 (20), while P values, ORs, 95% CIs and the global test of haplotype analysis were calculated using the SHEsis program (21). Genotype and allele frequencies of SNPs in the control group were assessed to determine whether they conformed to Hardy-Weinberg equilibrium (HWE). Gene-environment interaction was detected using the multifactor dimensionality reduction (MDR) constructive induction algorithm. In the discovery study, SNP-by-sex additive interactions were assessed by the method of Andersson and multiplicative interactions by logistic regression (22). Normalized luciferase activity and relative mRNA levels were evaluated by Student's t-test. A statistically significant difference was indicated by a two-sided P-value < 0.05.

RESULTS
Demographic and Clinical Characteristics
Table 1 shows the characteristics of all the study subjects. In the discovery study, there was no significant difference in the distribution of gender and age between the case and control groups. However, smoking status was statistically different (P = 0.003). In the replication study, although gender distribution was similar among the three groups, age significantly differed between groups. Smoking status and TB types of the participants in the replication study were not available.
The locations of the five tagSNPs genotyped are presented in Figure 1. The genotype call rates varied from 99.7% to 100% and the genotype reproducibility was 100%. All SNPs in the control group of both studies were in HWE.
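As an illustration of what an HWE check involves, the following Python sketch applies a chi-squared goodness-of-fit test with one degree of freedom to genotype counts. The text does not specify which HWE test was used, so this is an assumed, generic version; the function name and the counts are hypothetical.

```python
import math

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-squared goodness-of-fit test for Hardy-Weinberg equilibrium (1 df).

    Takes the observed counts of the three genotypes at a biallelic SNP.
    Assumes the SNP is polymorphic, so that no expected count is zero.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of the first allele
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-squared distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical control-group genotype counts (not study data).
chi2, p_value = hwe_chi_square(25, 50, 25)
```

A P value above the chosen threshold (commonly 0.05) is taken as no evidence of departure from HWE in the control group.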

Association Between the Five tagSNPs of NFE2L2 and Susceptibility to TB and LTBI in the Discovery and Replication Study
In the discovery study, we analyzed the data between TB patients and healthy controls. As shown in Table 2, allele C of rs13005431 and allele G of rs2364723 were associated with increased TB risk (P = 0.010 and P = 0.041). In stratified analyses, the deleterious effect of rs13005431 C was still seen in the non-smoker, EPTB and female subgroups (P = 0.003, P = 0.015 and P < 0.001, respectively). In addition, rs2364723 G remained associated with increased susceptibility to TB in the non-smoker and female subgroups (P = 0.022 and P = 0.004). Allele A of rs6726395 was also a risk factor for TB in the female subgroup (P = 0.003). There was no significant association observed between the other two tagSNPs (rs10497511 and rs1962142) and TB susceptibility (Table 2).
In the replication study, we analyzed the data among TB patients, LTBI subjects and uninfected healthy controls using a multinomial logistic regression analysis model. When comparing the LTBI group with TB patients, only allele A of rs6726395 was associated with increased TB risk (P = 0.035) (Table 3). When the HC group was compared with the LTBI group, none of the tagSNPs was associated with susceptibility to TB infection. In addition, we combined LTBI subjects and uninfected healthy controls as a healthy control group. When comparing the healthy control group with TB patients, we still saw that rs6726395 A was associated with increased TB risk (P = 0.028) (Table 4).
Furthermore, we combined the data of the discovery and replication studies. As shown in Table 4, allele C of rs13005431, allele G of rs2364723, and allele A of rs6726395 were associated with increased TB risk (P = 0.002, P = 0.013 and P = 0.014, respectively). There was no significant association observed between the other two tagSNPs (rs10497511 and rs1962142) and TB susceptibility.
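For readers unfamiliar with the allelic model, the sketch below computes an unadjusted allelic odds ratio with a Wald (Woolf) 95% confidence interval from a 2×2 table of allele counts. This is a simplification: the associations reported here come from logistic regression with adjustment for confounders, which this sketch does not reproduce. The function name and the counts are hypothetical.

```python
import math

def allelic_odds_ratio(case_risk, case_other, control_risk, control_other, z=1.96):
    """Unadjusted allelic odds ratio with a Wald (Woolf) confidence interval.

    Inputs are allele counts (two alleles per participant), all assumed > 0.
    z = 1.96 corresponds to an approximate 95% confidence interval.
    """
    odds_ratio = (case_risk * control_other) / (case_other * control_risk)
    # Standard error of log(OR) from the four cell counts.
    se_log = math.sqrt(
        1.0 / case_risk + 1.0 / case_other + 1.0 / control_risk + 1.0 / control_other
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper

# Hypothetical allele counts (not study data).
odds_ratio, ci_lower, ci_upper = allelic_odds_ratio(20, 10, 10, 20)
```

An interval that excludes 1 corresponds to a nominally significant association at the two-sided 0.05 level under this unadjusted model.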

LD and Haplotype Analysis in the Discovery and Replication Study
The LD analysis (Supplementary Figure 2) demonstrated mild-to-moderate levels of LD between the five tagSNPs of NFE2L2. For the discovery study, haplotype analyses of the two blocks are shown in Supplementary Table 3. The frequency of haplotype rs10497511-rs13005431 AC was significantly higher in the TB group (P = 0.013), while the frequency of haplotype AT was higher in the control group (P = 0.025), and the global test was significant (P = 0.025). For the replication study, no significant haplotype effect was observed.
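The pairwise LD measures D' and r² can be sketched as follows. The sketch assumes that the haplotype frequency is already known; in practice, tools such as Haploview estimate it from unphased genotypes, a step omitted here. The function name and the frequencies are hypothetical.

```python
def ld_stats(p_ab, p_a, p_b):
    """Pairwise LD between two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A at SNP 1 and allele B at SNP 2.
    p_a, p_b: frequencies of allele A and allele B.
    Assumes both SNPs are polymorphic (0 < p_a < 1 and 0 < p_b < 1).
    Returns (D', r^2).
    """
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r_squared = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d_prime, r_squared

# Hypothetical frequencies: complete LD, then no LD.
complete = ld_stats(0.5, 0.5, 0.5)
independent = ld_stats(0.25, 0.5, 0.5)
```

The r² value is the quantity thresholded (r² ≥ 0.8) when selecting tagSNPs with the pairwise Tagger method.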

Association Between Gene-Environment/ SNP-by-Sex Interactions and TB Susceptibility
The results of gene-environment interactions after MDR analysis are shown in Supplementary Table 4. In the discovery study, smoking-rs13005431 formed the best interaction model with 55.69% testing balanced accuracy and 10/10 cross-validation consistency. Smoking-sex-rs13005431 formed an interaction model with 55.03% testing balanced accuracy and 8/10 cross-validation consistency. After 1000-fold permutation testing, both models were found to be significant (P = 0.001-0.002 and P = 0.023, respectively). As shown in Supplementary Figure 3, smoking status formed the high-risk models regardless of the rs13005431 genotype; non-smoking status and rs13005431 TT formed the low-risk model, while non-smoking status and rs13005431 CC/TC formed the high-risk model. As shown in Supplementary Table 5, the only positive additive interaction was observed between female sex and rs13005431 TC+CC/C genotypes. In addition, there were suggestive positive multiplicative interactions of three SNPs (rs13005431, rs2364723 and rs6726395) with female sex under two genetic models.
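Additive interaction between two binary exposures is commonly summarized by three indices: the relative excess risk due to interaction (RERI), the attributable proportion (AP) and the synergy index (S). The sketch below assumes these are the measures meant by the additive interaction analysis; the function name and the odds ratios are hypothetical.

```python
def additive_interaction(or_10, or_01, or_11):
    """Additive interaction indices from three odds ratios.

    or_10: exposure A only, or_01: exposure B only, or_11: both exposures,
    each relative to the doubly unexposed reference group.
    Returns (RERI, AP, S). No interaction corresponds to RERI = 0, AP = 0, S = 1.
    """
    reri = or_11 - or_10 - or_01 + 1.0
    ap = reri / or_11
    denominator = (or_10 - 1.0) + (or_01 - 1.0)
    # S is undefined when neither single exposure changes the odds.
    s = (or_11 - 1.0) / denominator if denominator != 0 else float("inf")
    return reri, ap, s

# Hypothetical odds ratios (not study data), e.g. female sex only,
# risk genotype only, and both together.
reri, ap, s = additive_interaction(1.5, 2.0, 4.0)
```

With these hypothetical values, the joint odds ratio exceeds what the two separate effects would add up to, which is the pattern described as a positive additive interaction.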

Bioinformatics Prediction
We predicted the potential functional significance of SNPs from three aspects: protein coding, splicing regulation and transcription regulation (Supplementary Table 6). The influence on transcription regulation was judged by three criteria: 1) predicted alteration of transcription factor binding sites using the TRANSFAC or JASPAR database; 2) other factors affecting transcription regulation, such as DNase I hypersensitive sites and histone modification, according to the comprehensive evaluation by GOLDEN path; 3) evolutionary conservation as assessed by PhastCons. According to the predicted results, rs13005431 was identified as a candidate SNP for further functional characterization.

Effect of Allele T/C of rs13005431 on Promoter Activity
rs13005431, located in the first intron of NFE2L2, is outside of the promoter region. However, this SNP might disrupt an enhancer or silencer sequence and thus influence promoter activity indirectly. As previously reported, variants in the first intron might influence gene activity (23). Moreover, rs13005431 was reported to be a cis-eQTL (expression quantitative trait locus) of NFE2L2 in whole blood (P = 1.978×10⁻⁴) (24). Therefore, the dual-luciferase reporter assay was used to investigate the regulatory role of rs13005431. As shown in Figure 2, the normalized luciferase activity of PGL3-rs13005431C-promoter was significantly lower than that of the PGL3-rs13005431T-promoter and PGL3-promoter constructs (P < 0.001 and P < 0.001, respectively), indicating that the mutant allele C of rs13005431 may decrease promoter activity. In addition, no significant difference was observed between the PGL3-rs13005431T-promoter group and the PGL3-promoter group (P = 0.077), suggesting that the wild type allele T of rs13005431 had no significant effect on promoter activity.

Expression Levels of NFE2L2 in Neutrophils With Different Genotypes of rs13005431
To further confirm whether allele C of rs13005431 influences the promoter activity of the NFE2L2 gene, 60 healthy subjects were enrolled. The relative expression levels of NFE2L2 in neutrophils were analyzed according to genotype under basal and GM-CSF activated conditions. As shown in Figure 3A, the mRNA expression levels of NFE2L2 in GM-CSF activated neutrophils were significantly increased compared with the unstimulated group (P < 0.001). As shown in Figure 3B, subjects with genotype TC/CC had a lower relative expression level than subjects with genotype TT in the GM-CSF activated group (P = 0.017), while no difference was found when the neutrophils were unstimulated (P = 0.123).

DISCUSSION
TB is a complex inflammatory process caused by Mycobacterium tuberculosis (M. TB). Active TB in adults is largely due to reactivation of primary infection (25), thus management of LTBI has been promoted for achieving the goals of the End-TB strategy (26). Several gene polymorphisms have shown significant associations with TB and LTBI susceptibility (27), which may help to reveal new aspects of the pathogenesis of TB and discover potential diagnostic genetic markers. It has been suggested that oxidant/antioxidant imbalance is related to TB (7,8). Nrf2, a critical regulator of antioxidant defenses, has been shown to play a protective role in many lung diseases such as COPD and bleomycin-induced pulmonary fibrosis (28,29). Palanisamy et al. have shown that deficiency of Nrf2 was associated with progressive oxidative stress in guinea pigs infected with TB and that antioxidant drugs could be a beneficial adjunct to anti-TB treatment (14). One study suggested that an Nrf2-mediated 17-gene signature can be used to distinguish TB patients from healthy controls and from individuals with LTBI, pneumonia, or lung cancer, as well as to serve as an indicator of the anti-TB response (30). Here, we performed the first study of the relationship of NFE2L2 SNPs with TB and LTBI susceptibility in the Chinese Han population. Our results showed that tagSNPs of NFE2L2 affect the susceptibility to TB and LTBI. The rs13005431_C and rs2364723_G alleles of NFE2L2 were both risk factors for TB. The haplotype analysis results also showed the association between rs13005431 and TB. The replication study revealed that rs6726395_A was associated with increased risk of active TB when compared with the LTBI group, while none of the SNPs were associated with TB infection. Different genetic associations between TB and LTBI have been observed for SNPs of genes such as IRGM (27), FOXO3 (31), and TLR9 (32). These data indicate that the underlying genetic mechanisms might differ between susceptibility to LTBI and active TB. It is interesting that NFE2L2 SNPs are also associated with diabetes mellitus (33,34), which is one of the most common comorbidities of TB. Therefore, NFE2L2 dysfunction might serve as a common mechanism leading to TB and other diseases such as diabetes mellitus.

FIGURE 2 | Normalized luciferase activity of three transfection vectors. The relative luciferase activity is presented as the firefly luciferase activity of each well divided by the Renilla luciferase activity. The normalized luciferase activity is expressed as the ratio of each relative luciferase activity to the mean relative luciferase activity of the PGL3-promoter group.

Additionally, we performed subgroup analysis according to TB location, gender and smoking status. Our results revealed that rs13005431_C was associated with increased risk of TB in the EPTB subgroup rather than in PTB subgroup. It has been reported that genetic susceptibility to TB differs between PTB and EPTB (31). Moreover, different mechanisms of immunity in different locations were found in mouse models (35). Our stratified results strengthen the hypothesis that polymorphisms involved in immunological mechanisms and the pathology of different TB types are different. Complicating factors such as gender, smoking and years after primary infection might influence the reactivation site (36). In another subgroup analysis by gender, rs13005431_C, rs2364723_G and rs6726395_A were found to be significant risk alleles in females but not in males. Multiplicative and additive interactions between female sex and rs13005431 genotypes were also observed. Previous studies have revealed that there were gender-specific associations between NFE2L2 gene variants and several diseases (37,38). Socio-behavioral, cultural factors and gonadal steroids may relate to the gender differences in TB (39,40).
In subgroup analysis by smoking status, the rs13005431_C and rs2364723_G alleles showed relationships with increased TB risk in the non-smoking subgroup. It is well known that smoking confers a higher risk of M. TB infection, TB development and progression (41-43). The MDR results also revealed that smoking-rs13005431 interactions influence TB susceptibility, with smoking contributing to high risk of TB. Animal models and human studies have shown that Nrf2 and downstream activated genes play a critical role in defending against the oxidative stress of cigarette smoke (44,45). Nrf2 was activated and numerous antioxidant enzymes were increased in healthy smokers (44,46,47). The relatively lower antioxidant responses of non-smoking individuals may account for susceptibility to TB in the non-smoking subgroup.
Introns have been shown to influence gene transcription in some reports (23,48). Intronic polymorphisms may regulate the transcriptional activity or mRNA splicing after transcription. Based on our association results, rs13005431 was investigated using dual-luciferase reporter assays to determine the influence of the SNP on transcriptional activity. The normalized luciferase activity of plasmid PGL3-rs13005431C-promoter was lower than that of PGL3-rs13005431T-promoter, suggesting that the C allele of rs13005431 decreases promoter activity. Neutrophils are one of the main phagocytes involved in M. TB infection. In the sputum and bronchoalveolar lavage fluid of active TB patients, neutrophils are the most prevalent cell type. Therefore, neutrophils are representative of immune response cells associated with M. TB infection. A previous study showed that in a chronic infectious granulomatous disease similar to the pathology of TB, the expression and nuclear translocation of NFE2L2 in neutrophils increased in early granulomatous lesions, but this phenomenon was not found in macrophages (49). Therefore, Nrf2 may be involved in protecting neutrophils from oxidative stress and controlling inflammation to a certain extent (49). Since neutrophils play an important role in the oxidative stress of the microbicidal response to TB, we speculate that the role of Nrf2 in M. TB infection may be mainly reflected in protecting neutrophils from oxidative stress in TB granuloma (50). Therefore, we compared the mRNA level of NFE2L2 in neutrophils with different genotypes to confirm the effect of rs13005431 on NFE2L2 transcriptional activity. After stimulation by GM-CSF, neutrophils with the TC/CC genotype of rs13005431 expressed reduced NFE2L2 mRNA level compared with neutrophils with the TT genotype. From all of the above results, we can speculate that the C allele of rs13005431 may increase the risk of TB by lowering the promoter activity of NFE2L2 and reducing its mRNA level.
Our study has several potential limitations. Firstly, we did not investigate which specific transcription factors bind to the sequence around rs13005431, and thus could not clarify the mechanism underlying the increased TB susceptibility. Secondly, we did not provide direct evidence that rs13005431 plays an important role, or acts as an expression quantitative trait locus (eQTL), during M.TB infection. Finally, since our research was limited to the Chinese Han population, follow-up studies in different populations are needed to validate our results.

CONCLUSION
In conclusion, we have observed significant associations between NFE2L2 variants and TB susceptibility. Further experiments suggested the potential mechanism: allele C of rs13005431 decreased the transcriptional level of NFE2L2. This study may thus pave the way for new treatment modalities targeting antioxidative mechanisms in TB.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation, to any qualified researcher. All data excel files are available from the Figshare database (https://figshare.com/s/917cdb79772f89aee2f9).

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the ethics committee of the West China Hospital of Sichuan University in China. The patients/participants provided their written informed consent to participate in this study.