CCR5 Promoter Polymorphisms Associated With Pulmonary Tuberculosis in a Chinese Han Population

Background Tuberculosis (TB), an infectious disease caused by Mycobacterium tuberculosis, is a major public health concern. Chemokines and their receptors, such as RANTES, CXCR3, and CCR5, have been reported to play important roles in cell activation and migration in immune responses against TB infection. Methods To understand the correlations involving CCR5 gene variations, M. tuberculosis infection, and TB disease progression, a case-control study comprising 450 patients with TB and 306 healthy controls from a Chinese Han population was conducted, along with the detection of polymorphisms in the CCR5 promoter using a sequencing method. Results After adjustment for age and gender, the results of logistic analysis indicated that the frequency of rs2734648-G was significantly higher in the TB patient group (P = 0.002, OR = 1.38, 95% CI: 1.123–1.696); meanwhile, rs2734648-GG showed notable susceptibility to TB (P = 6.32E-06, OR = 2.173, 95% CI: 1.546–3.056 in a recessive model). The genotypic frequency of rs1799987 also varied between the TB and control groups (P = 0.008). In stratified analysis, rs2734648-GG significantly increased susceptibility to pulmonary TB in a recessive model (P < 0.0001, OR = 2.382, 95% CI: 1.663–3.413), and the rs2734648-G allele significantly increased susceptibility to TB recurrence in a dominant model (P = 0.0032, OR = 1.936, 95% CI: 1.221–3.068), whereas rs1799987-AA was associated with susceptibility to pulmonary TB (P = 0.0078, OR = 1.678, 95% CI: 1.141–2.495 in a recessive model) but not with extra-pulmonary TB and TB recurrence. A haplotype constructed with the major alleles of the eight SNPs in the CCR5 promoter (rs2227010-rs2856758-rs2734648-rs1799987-rs1799988-rs41469351-rs1800023-rs1800024: A-A-G-G-T-C-G-C) exhibited extraordinarily increased risk of susceptibility to TB and pulmonary TB (P = 6.33E-11, OR = 24.887, 95% CI: 6.081–101.841). 
Conclusion CCR5 promoter polymorphisms were found to be associated with pulmonary TB and TB progression in the Chinese Han population.


Tuberculosis (TB) is an infectious disease caused by Mycobacterium tuberculosis (M. tuberculosis). To date, TB remains a major public health concern since approximately one third of the world's population has been infected with M. tuberculosis (1). Approximately 5% of the individuals who become infected with M. tuberculosis will develop clinical TB disease within 2 years of infection (2). TB is classified as primary TB or latent TB infection (2). Approximately 5-10% of the individuals with latent TB will develop clinical TB disease during their lifetime (2)(3)(4).
C-C motif chemokine receptor type 5 (CCR5), a transmembrane G protein-coupled cell-surface chemokine receptor, binds to several CC-chemokines, including human macrophage inflammatory protein-1α (MIP-1α), MIP-1β, RANTES (regulated on activation normal T cell expressed and secreted factor), monocyte chemotactic protein 1 (MCP-1), MCP-2, MCP-3, and MCP-4 (8). Previous studies report that CCR5 is highly expressed during activation of T helper 1 (Th1) cells, which play a critical role in host immune responses against TB (9). Moreover, CCR5 expression is substantially increased during M. tuberculosis infection (10)(11)(12). Pokkali et al. found that CCR5 expression is significantly higher in pulmonary TB patients than in healthy controls (10), and Qiu et al. reported that CCR5 expression was remarkably up-regulated in lymphocytes from the lungs, bronchial lymph nodes, and spleen of rhesus monkeys with severe TB disease (11). Additionally, CCR5 and its ligands play important roles in T-cell activation and migration during immune responses against TB infection. Galkina et al. observed that CCR5-/- CD8+ T cells exhibit an approximately 50% reduction in effector CD8+ T cell transmigration from pulmonary vascular compartments into interstitial compartments, as compared with CCR5 wild-type CD8+ T cells (13). CCR5 may also regulate effector CD8+ T cell contraction and memory generation after M. tuberculosis infection (14). Furthermore, several studies have suggested that CXCR3, CCR5, and CXCR6 potentially mediate the migration of M. tuberculosis-specific CD4+ T cells out of the vascular endothelium and their entry into the lungs during M. tuberculosis infection, which is a critical step in host immune responses against M. tuberculosis infection (15). Therefore, CCR5 seems to play an important role in the immune response against M. tuberculosis infection.
Although CCR5 is reported to be involved in resistance against M. tuberculosis infection, a number of studies have reported the association of CCR5 gene variants with TB infection and progression. In 2014, Carpenter et al. (16) performed an analysis regarding the possible associations between rs1799987 of CCR5 and clinically active TB phenotypes in three different populations (Peru, Xhosa, and Colored), but found no significant associations. In this study, we examined genetic polymorphisms in the CCR5 promoter in the Chinese Han population to investigate the association between CCR5 promoter polymorphisms and M. tuberculosis infection, and TB progression.

METHODS AND MATERIALS

Subjects
A total of 450 patients with TB, including 325 cases of pulmonary TB (PTB) and 125 cases of extrapulmonary TB (EPTB, defined as TB affecting extrapulmonary sites such as lymph nodes, abdomen, urinary tract, skin, joints, bones, and meninges, exclusively or in combination with PTB), who were enrolled at the Third People's Hospital of Kunming between 2018 and 2019, were selected as the TB patient group for this study. All subjects were genetically unrelated and belonged to the Chinese Han population from Yunnan province (southwest China).
Diagnoses of TB were based on clinical case definition guidelines for TB issued by the World Health Organization (WHO) (17), Diagnosis for Pulmonary Tuberculosis (WS 288-2017) (18) and Classification of Tuberculosis (WS 196-2017) (19) from the Health Industry Standard of the People's Republic of China. The diagnostic criteria were as follows: (1) M. tuberculosis positively confirmed by sputum smear culture bacteriological assessment; (2) clinical symptoms such as cough, fever, and weight loss lasting over two weeks, and chest X-ray consistent with TB disease. The tuberculin skin test (TST) and interferon-γ release assay (IGRA) were usually also positive. Patients with immunodeficiency, autoimmune diseases, or other acute or chronic infections were excluded from this study.
Over the same period, 306 healthy individuals were recruited as a control group. All the controls had no history of TB disease and were free of any acute or chronic pulmonary disorder, bacterial or viral infection, or other immune-mediated disorder. All the controls were self-reported Han Chinese.

DNA Extraction and Sequencing
Two to 5 ml of peripheral blood was drawn from each participant, and genomic DNA was extracted from peripheral lymphocytes using the QIAamp Blood Mini Kit (Qiagen, Hilden, Germany) in a Biosafety Level 2 Laboratory of the Third People's Hospital of Kunming. The CCR5 promoter region was PCR amplified using primers from a previously published study (20): CCR5P_F: 5'-gacgagaaagctgagggtaaga-3' and CCR5P_R: 5'-taaccgtctgaaactcattcca-3'. The amplified PCR fragment was 1388 bp in length. PCR for each sample was carried out using the TAKARA PrimeSTAR Max DNA polymerase kit (TAKARA, Dalian, China) in 50 µl reaction volumes containing 10 ng genomic DNA, 10 pmol of each primer, and 25 µl 2 × PrimeSTAR Max Premix (TAKARA). Amplification consisted of an initial denaturation step of 5 min at 98°C, followed by 30 cycles of denaturation for 10 s at 98°C, annealing for 5 s at 55°C, and extension for 90 s at 72°C, and a final extension for 5 min at 72°C. Purified PCR fragments were subjected to Sanger DNA sequencing to detect the sites of polymorphism using the Big Dye Terminator Reaction Mix (Applied Biosystems, Foster City, CA, USA), along with the same primers used for PCR amplification. Sequencing reaction products were purified using the Big Dye Terminator Purification Kit (Applied Biosystems) and run on an ABI 3730XL sequencer. Sequencing data were analyzed using the DNASTAR Lasergene v.7.1 package. All the CCR5 promoter region sequencing data in this study have been deposited in the Figshare database under the title "CCR5 promoter sequences of TB patients and controls" (DOI: 10.6084/m9.figshare.12015624).
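As a quick sanity check on the primer pair quoted above, the following minimal Python sketch computes the length and GC fraction of each primer and the reverse complement of the reverse primer. The helper names (gc_fraction, reverse_complement) are illustrative only and are not part of the original protocol; only the two primer sequences are taken from the text.

```python
# Primer sequences as quoted in the text (5' -> 3').
PRIMERS = {
    "CCR5P_F": "gacgagaaagctgagggtaaga",
    "CCR5P_R": "taaccgtctgaaactcattcca",
}

# Translation table for complementing lowercase or uppercase DNA.
_COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in a DNA sequence."""
    s = seq.upper()
    return (s.count("G") + s.count("C")) / len(s)


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, len(seq), "nt", f"GC = {gc_fraction(seq):.2f}")
    # Strand of the template that the reverse primer anneals to.
    print(reverse_complement(PRIMERS["CCR5P_R"]))
```

Both primers are 22 nt long, which is in the usual range for standard PCR primers.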

Polymorphic Loci in CCR5 Promoter
In our previous study (20), we identified nine single nucleotide polymorphism (SNP) loci in the CCR5 gene promoter. Six SNPs, rs2227010 (A>G), rs2734648 (T>G), rs1799987 (G>A), rs1799988 (T>G), rs1800023 (G>A), and rs1800024 (C>T), were found to be polymorphic in the Chinese Han population sample, whereas the three remaining sites, rs2856758 (AA), rs41469351 (CC), and rs41355345 (CC), were monomorphic in this sample. Thus, we analyzed the association of the alleles and genotypes of these six SNPs with TB.

Statistical Analysis
The distributions of age and sex in the case and control groups were compared using Student's t-test and χ² tests in SPSS (v.19.0; SPSS Inc., Chicago, IL, USA). Basic statistical analyses of the allelic and genotypic frequencies of the six SNPs were carried out using PLINK v.1.9 (http://zzz.bwh.harvard.edu/plink/data.shtml) (21), and risks were estimated using odds ratios (OR) with 95% confidence intervals (95% CI). Hardy-Weinberg equilibrium (HWE) was tested for each SNP in the control group with a goodness-of-fit χ² test in PLINK, using a threshold of 0.05.
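The HWE goodness-of-fit test described above can be illustrated with a minimal Python sketch. It is a plain one-degree-of-freedom χ² test written for illustration, not PLINK's implementation, and the function name and the genotype counts in the usage lines are hypothetical rather than taken from this study.

```python
import math


def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int) -> tuple[float, float]:
    """Chi-square goodness-of-fit test for Hardy-Weinberg equilibrium.

    Takes the observed counts of the three genotypes at a biallelic SNP
    and returns (chi2, p_value) with 1 degree of freedom.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    if p == 0.0 or q == 0.0:
        raise ValueError("monomorphic site: HWE test is undefined")
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 df:
    # P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical counts, not data from this study.
    print(hwe_chi_square(25, 50, 25))  # genotype counts in exact HWE
    print(hwe_chi_square(40, 20, 40))  # strong heterozygote deficit
```

A SNP would be flagged as deviating from HWE when the returned P-value falls below the 0.05 threshold; monomorphic sites are rejected up front because the expected counts would be zero.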
Linkage disequilibrium (LD) and haplotype frequencies (deduced from the phenotype) among eight SNPs (rs2227010-rs2856758-rs2734648-rs1799987-rs1799988-rs41469351-rs1800023-rs1800024) were calculated from the genotype results in Haploview v.4.2 (22), using a standard expectation-maximization algorithm (ignoring missing data) with a partition-ligation approach for blocks. Haploview calculates the LD coefficient D', LOD, and r² between each pair of genetic markers. D' values were defined in the range [-1, 1], with a value of 1 representing perfect disequilibrium. A D' value over 0.8 indicated strong linkage disequilibrium among SNPs. The lowest frequency threshold for haplotype analysis was 0.01, and haplotypes with frequencies below this threshold were not considered in the analysis. The differences in the haplotypes (with frequencies over 0.01) between the TB and control groups, between the PTB and control groups, and between the EPTB and control groups were determined by χ² test, and risks were estimated using ORs and 95% CIs. ORs and 95% CIs were used to estimate associations between SNPs and TB disease, adjusting for age and gender using multivariate logistic regression models. The threshold for statistical significance was P < 0.05. Bonferroni correction was applied for multiple comparisons among alleles and genotypes, and the P-value threshold was adjusted to 0.05/n. Power analysis was performed using Power and Sample Size Calculations (v.3.1.2) (23).

RESULTS
Table 1 presents participant demographic characteristic data, including gender, age, and clinical type of TB. The mean age of the TB group was 43.76 ± 16.01 years, with a sex ratio (male/female) of 249/201, while the mean age of the control group was 44.68 ± 9.26 years, with a sex ratio (male/female) of 154/152. The distributions of age and gender between the TB and control groups showed no statistical differences (P > 0.05).
The mean ages were 45.31 ± 15.82 years in the PTB group and 39.75 ± 15.86 years in the EPTB group; whereas the sex ratio (male/female) was 190/135 in the PTB group and 59/66 in the EPTB group. For the initial treatment (IT) and retreatment (RT) groups, the mean ages were 43.20 ± 16.35 years and 44.68 ± 15.46 years, respectively, and sex ratios (male/female) were 144/135 in the IT and 105/66 in the RT group (P = 0.043).

Comparisons of Allelic and Genotypic Frequencies of CCR5 Promoter SNPs Between TB Patients and Controls
All six polymorphic CCR5 promoter SNPs were in HWE in the control group (P > 0.05). However, in the TB patient group, rs2734648, rs1799987, and rs1800023 were not in HWE. The allelic and genotypic frequencies of the six CCR5 promoter SNPs were compared between the TB patient and control groups, after adjusting for age and gender based on the logistic regression model (Table 2). The results showed that the frequency of rs2734648-G was significantly higher in the TB patient group compared to that in the control group (P = 0.002, OR = 1.380, 95% CI: 1.123-1.696); the genotypic distribution of rs2734648 was significantly different between the TB and control groups (P = 1.07E-05). Furthermore, we performed inheritance model analysis and found that rs2734648-GG was significantly associated with TB disease, conferring an approximately 2-fold increase in susceptibility in a recessive inheritance model (P = 6.32E-06, OR = 2.173, 95% CI: 1.546-3.056). The genotypic distribution of rs1799987 also showed a significant difference between the TB and control groups (P = 0.008). This study had over 80% power to detect ORs of 1.021 for rs2227010, 0.725 for rs2734648, and 0.975 for rs1799988; power was 53.8% to detect an OR of 1.1 for rs1799987, 60.6% to detect an OR of 1.056 for rs1800023, and 72.6% to detect an OR of 0.955 for rs1800024, in 450 TB patients compared with 306 controls.
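The relationship between a reported OR, its 95% CI, and the P-value can be checked with a short Python sketch. It assumes the interval is a Wald interval that is symmetric on the log-odds scale, which is the usual output of logistic regression but is not stated explicitly in the text; the function name is illustrative.

```python
import math

Z_95 = 1.959964  # two-sided 95% critical value of the standard normal


def wald_from_ci(or_est: float, ci_low: float, ci_high: float):
    """Recover the log-OR standard error, z statistic, and two-sided
    P-value from an odds ratio and its 95% confidence interval,
    assuming a Wald interval on the log scale."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * Z_95)
    z = math.log(or_est) / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return se, z, p_value


if __name__ == "__main__":
    # Allelic comparison for rs2734648-G reported above:
    # OR = 1.380, 95% CI: 1.123-1.696.
    se, z, p = wald_from_ci(1.380, 1.123, 1.696)
    print(f"SE = {se:.3f}, z = {z:.2f}, P = {p:.4f}")
```

Under this assumption, the allelic result for rs2734648-G gives a z statistic of about 3.06, in line with the reported P = 0.002. Small deviations are expected because the published values are rounded.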

Stratification Analysis of the Association Between TB and CCR5 Promoter Polymorphisms
We performed stratification analysis to investigate the association of TB susceptibility with CCR5 promoter SNPs. We stratified the TB patients into PTB and EPTB patients and compared the distribution of allelic and genotypic frequencies of the six SNPs among the stratification groups and the control group. The associations between SNPs and PTB or EPTB groups were adjusted for age and gender using multivariate logistic regression models. Table 3 shows comparative results of rs2734648 and rs1799987. The results showed that the frequency of rs2734648-G was significantly higher in PTB patients as compared to controls (P = 0.0013, OR = 1.488, 95% CI: 1.192-1.858). Carriers of rs2734648-GG showed a notable increase in susceptibility to PTB in the recessive inheritance model (P < 0.0001, OR = 2.382, 95% CI: 1.663-3.413). Carriers of rs1799987-AA also showed a significant association with PTB in the recessive inheritance model (P = 0.0078, OR = 1.687, 95% CI: 1.141-2.495). However, significant associations of these two SNPs with EPTB in the recessive inheritance model were not detected (Table 3). For other SNPs, no significant differences were found between the PTB and control groups. It should be noted that the results from PTB (Table 3) simply reinforce the TB associations observed in Table 2, which is not surprising, since PTB represents 72.2% of the TB sample. Finally, no significant difference was found between the EPTB and control groups, or between the EPTB and PTB groups (Supplementary Table 1).
Additionally, we stratified the TB patients into IT and RT subgroups according to disease stage at the time of treatment, and analyzed allelic and genotypic distributions of the six SNPs. Associations between SNPs and disease recurrence were adjusted for age and gender using multivariate logistic regression models. We found rs2734648-G and rs2734648-GG to be significantly associated with TB recurrence (Table 4). After comparison between IT and control, RT and control, and RT and IT groups, we found that rs2734648-G was significantly associated with TB recurrence in a dominant inheritance model (P = 0.0032, OR = 1.936, 95% CI: 1.221-3.068), while rs1799987 showed no significant association with disease recurrence (Table 4 and Supplementary Table 2).
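The recessive and dominant inheritance models used in this stratified analysis differ only in how the three genotype classes are collapsed into two groups before the OR is computed. The Python sketch below illustrates this with crude (unadjusted) ORs. The genotype counts are hypothetical and the function names are illustrative; the ORs reported in this study were additionally adjusted for age and gender by logistic regression, which this sketch does not do.

```python
def collapse(counts, model):
    """Collapse genotype counts into (exposed, unexposed) for a risk allele.

    counts is (risk homozygote, heterozygote, other homozygote),
    e.g. (GG, GT, TT) when G is the risk allele.
    """
    hom_risk, het, hom_other = counts
    if model == "recessive":   # GG versus GT + TT
        return hom_risk, het + hom_other
    if model == "dominant":    # GG + GT versus TT
        return hom_risk + het, hom_other
    raise ValueError(f"unknown model: {model}")


def crude_odds_ratio(case_counts, control_counts, model):
    """Unadjusted odds ratio under the chosen inheritance model."""
    a, b = collapse(case_counts, model)     # cases: exposed, unexposed
    c, d = collapse(control_counts, model)  # controls: exposed, unexposed
    return (a * d) / (b * c)


if __name__ == "__main__":
    # Hypothetical (GG, GT, TT) counts, not data from this study.
    cases = (30, 50, 20)
    controls = (10, 50, 40)
    for model in ("recessive", "dominant"):
        print(model, round(crude_odds_ratio(cases, controls, model), 3))
```

With these hypothetical counts, the same data yield different ORs under the two models, which is why the model is stated alongside each association.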

CCR5 Promoter SNP Combination Analysis and Association With TB
LD among eight SNPs (rs2227010, rs2856758, rs2734648, rs1799987, rs1799988, rs41469351, rs1800023, rs1800024) in the CCR5 promoter was estimated, and the LD coefficient D' was calculated. The D' value of these eight SNPs was >0.8, indicating that these CCR5 promoter SNPs were in LD (Supplementary Figure 1). Next, we constructed haplotypes of the eight SNPs (rs2227010-rs2856758-rs2734648-rs1799987-rs1799988-rs41469351-rs1800023-rs1800024) and compared haplotypes with frequencies over 0.01 between case and control groups, as listed in Table 5. The results revealed that haplotype H1 (A-A-T-G-T-C-G-C) was the most prevalent, both in control (50.3%) as well as TB patient groups (41.1%), and showed a significant resistance to TB disease (P = 1. [...] cohort, and frequency differences between PTB and control groups were remarkable after Bonferroni correction (P = 7.63E-05 and P = 0.003, respectively). However, there was no difference between PTB and EPTB groups (Supplementary Table 3).

Table 2 note: Co, control; TB, tuberculosis; OR, odds ratio; CI, confidence interval. The P-value, OR, and 95% CIs of pair-wise comparisons between TB and control groups were calculated based on the logistic regression model adjusted for age and gender. Bonferroni correction was applied, and the P-value was adjusted to 0.008 (0.05/6). P-values lower than 0.008 were marked in bold.
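For reference, the pairwise LD measures used in the haplotype analysis (D' and r²) can be computed from two-locus haplotype frequencies as in the following Python sketch. This is the textbook definition and not Haploview's code; the function name and the frequencies in the usage lines are hypothetical.

```python
def pairwise_ld(p_ab: float, p_a: float, p_b: float):
    """Return (D, D', r2) for two biallelic loci.

    p_ab is the frequency of the haplotype carrying allele A at locus 1
    and allele B at locus 2; p_a and p_b are the frequencies of A and B.
    """
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("LD is undefined when either locus is monomorphic")
    d = p_ab - p_a * p_b
    # D' scales D by its maximum possible magnitude given the allele
    # frequencies, so that D' lies in [-1, 1].
    if d >= 0:
        d_max = min(p_a * (1.0 - p_b), (1.0 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1.0 - p_a) * (1.0 - p_b))
    d_prime = d / d_max
    r2 = d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))
    return d, d_prime, r2


if __name__ == "__main__":
    # Hypothetical frequencies, not data from this study.
    print(pairwise_ld(0.5, 0.5, 0.5))  # complete LD
    print(pairwise_ld(0.4, 0.5, 0.5))  # partial LD
```

The guard against monomorphic loci matters for this data set, since rs2856758 and rs41469351 were reported as monomorphic in the sample and LD with such sites cannot be estimated from allele-frequency products.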

DISCUSSION
TB is a serious infectious disease caused by M. tuberculosis; however, only 5-10% of infected individuals actually develop the active form of disease with clinical symptoms, prompting researchers to identify the factors influencing susceptibility to TB. Understanding host immune responses to M. tuberculosis infection is critical in identifying the reasons behind varying outcomes after M. tuberculosis exposure (latent or active TB disease), and for the development of effective TB vaccines and immune therapeutics. There is substantial evidence to suggest that the onset of TB is influenced by host genetic factors (2,9,15,24). CCR5 has been reported to play important roles in immune responses against M. tuberculosis infection by regulating and activating the recruitment of macrophages, and by further activation of T-cells. In the present study, we investigated associations between CCR5 promoter polymorphisms and TB and discovered, for the first time, that CCR5 promoter polymorphisms were significantly associated with PTB and TB progression in the Chinese Han population.

Table 3 note: Co, control; PTB, pulmonary tuberculosis; EPTB, extra pulmonary tuberculosis; OR, odds ratio; CI, confidence interval. The P-value, OR, and 95% CIs of pair-wise comparisons between PTB and controls, and EPTB and controls, were calculated based on the logistic regression model adjusted for age and gender. Bonferroni correction was applied, and the P-value was adjusted to 0.008 (0.05/6). P-values lower than 0.008 were marked in bold.
In this study, we found that the allelic frequency of rs2734648-G was significantly higher in the TB patient group, especially in the PTB group, as compared to the control group, and also that rs2734648-GG carriers had a 2.382-fold increased risk of susceptibility to PTB in a recessive inheritance model. Additionally, rs2734648 was found to be significantly associated with TB recurrence. It has been reported that CCR5 variants may alter the response of CCR5-chemokines, including altered ligand-binding properties (28), and CCR5 promoter polymorphisms could differentially affect CCR5 gene transcription. We constructed a predictive model involving potential binding sites of transcription factors in the CCR5 promoter and discovered that SNPs in the CCR5 promoter might differentially influence transcription factor binding, based on the nucleotide substitution(s) involved (20). Mummidi et al. demonstrated that G to T substitution in rs2734648 (-2554G>T) is associated with differences in binding avidity of the NF-κB family of transcription factors (the binding avidity of rs2734648-G is greater than that of rs2734648-T), which might affect the transcriptional activity of CCR5. rs2856758-G can bind to novel nuclear factor 1 (NF1), which can repress transcription of certain genes (29), whereas rs2856758-A cannot bind to NF1 (26).

Table 4 note: Co, control; IT, initial treatment; RT, retreatment; OR, odds ratio; CI, confidence interval. The P-value, OR, and 95% CIs of pair-wise comparisons between IT and control, RT and control, and RT and IT groups were calculated on the basis of the logistic regression model adjusted for age and gender. Bonferroni correction was applied, and the P-value was adjusted to 0.008 (0.05/6). P-values lower than 0.008 were marked in bold.

Table 5 note: HHA, HHB, HHC, and HHF were previously reported (25)(26)(27). (c) Bonferroni correction was applied, and the P-value was adjusted to 0.004 (0.05/11). P-values lower than 0.004 were marked in bold.
For another SNP in the CCR5 promoter, rs1799987, we found that the AA genotype conferred an increased risk of susceptibility to PTB in a recessive inheritance model. McDermott et al. reported that rs1799987 (-2459A>G) influences the expression of CCR5, and that rs1799987-G has 45% lower promoter activity than rs1799987-A in vitro (30). The same authors also observed that rs1799987-A/rs1799988-C in combination stimulates CCR5 promoter activity by 45% more than other rs1799987/rs1799988 allelic combinations (30). Li et al. showed that rs1799988 C to T substitution results in reduced expression of CCR5, which consequently correlated with slower AIDS progression. Furthermore, rs1799988-CC carriers display increased CCR5 expression on the surface of peripheral blood mononuclear cells (PBMCs), CD4+ cells, and CD4+ monocytes, as compared to the two other CCR5-rs1799988 genotypes (31). Therefore, we deduced that SNPs in the CCR5 promoter could influence CCR5 gene and cell surface CCR5 protein expression by altering the binding of transcription factors, thereby affecting the function of CCR5. Hence, the significant associations of rs2734648-GG and rs1799987-AA with PTB and TB progression may be explained by increased expression of CCR5.
We also analyzed the effects of combinations of the eight CCR5 promoter SNPs (rs2227010, rs2856758, rs2734648, rs1799987, rs1799988, rs41469351, rs1800023, rs1800024) on TB susceptibility and found that haplotype H1 (A-A-T-G-T-C-G-C), constructed using the major alleles of the eight SNPs, was significantly associated with resistance to PTB, whereas haplotype H5 (A-A-G-G-T-C-G-C) increased susceptibility to PTB more than 20-fold. CCR2-CCR5 haplogroups constructed using CCR2 (V64I), rs2856758 (-2733A>G), rs2734648 (-2554G>T), rs1799987 (-2459G>A), rs1799988 (-2135T>C), rs41469351 (-2132C>T), rs1800023 (-2086A>G), rs1800024 (-1835C>T), and rs333 (CCR5Δ32) have been characterized and described in earlier studies, and the haplogroups are termed HHA, HHB, HHC, HHD, HHE, HHF*1, HHF*2, HHG*1, and HHG*2, respectively (25,26). CCR2-CCR5 haplogroups are correlated with differences in CCR5 expression and transcriptional activity. HHA is associated with lower CCR5 expression, whereas HHF and HHG are associated with higher CCR5 expression (32). Similarly, in K562 cells, HHA and HHC exhibit lower transcriptional activity, whereas in Jurkat T-cells, HHB and HHD show higher transcriptional activity than HHA (33,34). In 2011, Mamtani et al. found that the CCR5 promoter haplogroup HHD was associated with susceptibility to TB, by increasing CCR5 expression in either activated PBMCs, or surface expression on activated (HLA-DR+) CD4+ T cells (34). In our study, we constructed haplotypes with eight SNPs in the CCR5 promoter, and only rs2227010 was not included in the defined CCR2-CCR5 haplogroups. Our results showed that H1 was similar to HHC (which exhibited lower transcriptional activity and lower CCR5 expression), H2 was similar to HHF*1 (related to higher CCR5 expression), H3 was similar to HHE, and H4 was similar to HHA.
Therefore, the noticeable protection against PTB by the H1 haplotype in this study was likely due to the lower transcriptional activity of H1 (similar to HHC, which was associated with lower CCR5 expression). However, haplotype H5 differed from haplotype H1 at only one locus, rs2734648, which was T in haplotype H1 and G in H5. Note that H5 is a novel haplotype not previously detected in any other population. Based on the significant susceptibility of rs2734648-G to TB, and the extraordinarily high frequency of rs2734648-G in the Han population, rs2734648-G as well as haplotype H5 could provide crucial insights into immune responses to TB in the Chinese Han population. In this study, we also found one haplotype, rs2856758A-rs2734648T-rs1799987G-rs1799988T-rs41469351C-rs1800023A-rs1800024C, that was most similar to HHD. However, its frequency was very low (0.5% in both the control and TB groups; data not shown) and did not differ between the TB and control groups. In addition, haplotype H1, which differed from HHD at only two loci (rs41469351-C and rs1800023-G in H1 versus rs41469351-T and rs1800023-A in HHD), showed a significant frequency difference between the TB and control groups (P = 2.25E-04, OR = 0.674, 95% CI: 0.547-0.832). Nevertheless, haplotypes HHD and H1 showed entirely opposite effects on susceptibility to TB. The reason for this discrepancy could be the different frequencies of rs41469351 and rs1800023. According to the 1000 Genomes database, rs41469351-T is detected only in African and American populations, with frequencies of 26 and 2%, respectively, and the site is monomorphic (rs41469351-CC) in Chinese Han and European people (http://asia.ensembl.org/Homo_sapiens/Variation/Population?db=core;r=3:46370271-46371271;v=rs41469351;vdb=variation;vf=5860779).
For rs1800023, the A-allele frequency is 43% in Chinese Han people, compared with 91% in African and 63% in European populations (http://asia.ensembl.org/Homo_sapiens/Variation/Population?db=core;r=3:46370317-46371317;v=rs1800023;vdb=variation;vf=816197). Thus, rs41469351 and rs1800023 may be important loci affecting susceptibility to TB. Haplotypes H7, H10, and H11, which were unlike any reported haplogroups, also showed significantly higher frequencies in the TB group, as well as the PTB cohort. Hence, the combined functions of CCR5 promoter SNPs might play important roles in CCR5-mediated immune responses to TB.
Associations of SNPs with diseases have always been inconsistent among different populations. Previous studies indicate that CCR5 allelic frequencies are remarkably different among different populations (16,35), which might influence the results of correlative studies. Among most populations in the world (including Africans, Americans, Europeans, Japanese, and South Asians), rs2734648-G is the major allele; however, in Chinese Han populations, rs2734648-T is the predominant allele (Han Chinese in Beijing, and Southern Han Chinese from the 1000 Genomes database, and Chinese Han in Yunnan in this study), indicating that the frequency differences in rs2734648 could account for the specific association with TB in Han Chinese people. However, the function of rs2734648 substitution is still unclear; therefore, functional studies regarding the role of rs2734648 in M. tuberculosis infection and TB progression need to be conducted in the future.
Hence, more studies involving more individuals from different populations are needed. Additionally, as discussed above, SNP combinations also play an important role in susceptibility to TB and in its progression. Hence, further studies regarding haplotype structure, especially in terms of combinations of rs2734648 with other SNPs in the CCR5 promoter, are required.

CONCLUSIONS
SNP rs2734648-G of the CCR5 promoter, as well as haplotype H5, which carries rs2734648-G, are significantly associated with susceptibility to PTB and with TB recurrence, possibly by affecting the transcriptional activity and expression of CCR5.

DATA AVAILABILITY STATEMENT
The data sets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found below: https://figshare.com/, 10.6084/m9.figshare.12015624.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the institutional review board of the Third Hospital of Kunming (approval number: 2018030720). The patients/participants provided their written informed consent to participate in this study.

AUTHOR CONTRIBUTIONS
SL and LS conceived and designed the research. SL and NL mainly performed experiments and data analysis. HW and SZ did the clinical diagnosis. HW and XZ prepared samples for experiments and performed part of the experiments. SL wrote the manuscript. YY, SZ, and LS wrote parts of the manuscript and reviewed the manuscript. All authors contributed to the article and approved the submitted version.