- 1National Tuberculosis Reference Laboratory, National Scientific Center of Phthisiopulmonology of the Republic of Kazakhstan, Almaty, Kazakhstan
- 2Faculty of Biology and Biotechnology, Al-Farabi Kazakh National University, Almaty, Kazakhstan
- 3Institute of Genetics and Physiology CS MSHE RK, Almaty, Kazakhstan
Tuberculosis, particularly multidrug-resistant TB (MDR-TB), remains a major public health concern in Kazakhstan, where 26% of new TB cases are MDR, far exceeding the global average. To better understand the genetic diversity, drug resistance, and transmission dynamics of Mycobacterium tuberculosis in Kazakhstan, we conducted a retrospective study at the National Scientific Center of Phthisiopulmonology in Almaty from 2023 to 2024. Whole-genome sequencing (WGS) was performed on 272 culture-confirmed TB isolates collected from patients across the country. Phylogenetic analysis revealed the predominance of Lineage 2 (East Asian genotype, 72.4%) and Lineage 4 (Euro-American genotype, 26.8%). Drug resistance profiling classified 29.0% of isolates as MDR-TB, 3.3% as pre-XDR, and 0.7% as XDR. The most frequently observed resistance-associated mutations were katG S315T (99.2% of isoniazid-resistant isolates) and rpoB S450L (91.1% of rifampicin-resistant isolates). Cluster analysis using a ≤ 12 SNP threshold identified 22 genomic clusters involving 80 isolates (29.4%), indicating recent and possibly ongoing transmission. Spatial mapping showed that nearly 60% of clusters spanned multiple regions, while others were highly localized, suggesting household or close-contact transmission. A Mantel correlogram test revealed a statistically significant correlation between geographic and genomic SNP distances in Almaty and Almaty Region (r = 0.0634, p = 0.041) within the first distance class (average 5 km, range 0–8 km). These findings suggest that patients living in close proximity are more likely to carry genetically similar strains. As distance increases, geographic proximity becomes less predictive of transmission, with other factors, such as mobility, shared environments, or healthcare contact, likely playing a greater role.
Our findings underscore the need to integrate WGS into national TB control programs to guide targeted interventions, enhance surveillance, and curb the spread of drug-resistant TB strains across Kazakhstan.
Introduction
Tuberculosis (TB) is a preventable and curable infectious disease that affects millions of people every year. TB is caused by the bacillus Mycobacterium tuberculosis, which spreads through the air. The disease typically affects the lungs (pulmonary TB) but can also manifest at other sites in the body. According to the World Health Organization (WHO), an estimated 10.8 million people developed TB (incident cases) in 2023, and 400,000 people developed multidrug-resistant TB (MDR-TB) (World Health Organization, 2024).
Despite the gradual year-on-year decline in TB incidence, Kazakhstan is facing an epidemic of MDR-TB (World Health Organization, n.d.). Whereas MDR-TB accounts for 3.2% of new TB cases globally, in Kazakhstan 26% of newly diagnosed TB cases are attributed to MDR-TB (World Health Organization, 2024; Authors, n.d.-a). Moreover, Kazakhstan remains on the WHO’s list of the top 30 high MDR-TB burden countries worldwide (World Health Organization, 2023a; World Health Organization, 2024). Hence, understanding the factors driving MDR-TB prevalence in Kazakhstan and developing targeted strategies to mitigate this epidemic are crucial.
Previous studies have shown that the transmission of MDR-TB strains is a driver of MDR-TB epidemics in the countries of the former Soviet Union, including Kazakhstan (Auganova et al., 2023; Merker et al., 2018). Understanding the transmission of MDR-TB and the risk factors for its acquisition in the country is therefore essential for formulating anti-TB policies. Whole-genome sequencing (WGS) is a novel technique that could provide new insights into TB and help address this problem. WGS of M. tuberculosis is widely used to predict drug resistance, perform phylogenetic classification, investigate transmission chains, identify mixed infections, and reveal the evolution of the pathogen (Satta et al., 2018; Liu et al., 2022). Only a few studies have used WGS to investigate and characterize TB and MDR-TB strains in the country, primarily focusing on the largest cities, Astana and Almaty (Auganova et al., 2023; Tarlykov et al., 2020; Daniyarov et al., 2021; Auganova et al., 2023; Daniyarov et al., 2020; Daniyarov et al., 2023; Kairov et al., 2014). However, comprehensive nationwide studies utilizing WGS to assess the genetic diversity, transmission dynamics, and drug resistance of M. tuberculosis strains across different regions of Kazakhstan remain limited (Auganova et al., 2023). To provide a scientific basis for DR-TB control and prevention, we conducted a retrospective cohort study on TB from 2023 to 2024, using WGS to better understand the drug resistance, genetic diversity, and transmission dynamics of M. tuberculosis.
Methods
Study setting
The study was conducted at the National Scientific Center of Phthisiopulmonology (NSCP) under the Ministry of Health of the Republic of Kazakhstan, located in Almaty. The NSCP is a leading institution for TB prevention, diagnosis, and treatment, admitting patients from across the country. This retrospective study was based on data collected from January 2023 to December 2024 at the National Tuberculosis Reference Laboratory.
For bacteriological confirmation, biological specimens from symptomatic patients were submitted to the laboratory before initiation of treatment. TB diagnosis followed the official protocol outlined in Order of the Minister of Healthcare of the Republic of Kazakhstan dated November 30, 2020, № RK DSM-214/2020 “On Approval of the Rules for Conducting Tuberculosis Prevention Activities” (Authors, n.d.-b). Patients were eligible for the study if they tested positive for both the Xpert MTB/RIF assay and culture using the Mycobacteria Growth Indicator Tube (MGIT) system. Informed consent was obtained from patients or legal guardians after providing detailed information about the study objectives. Patient data were extracted from the Damumed Integrated Medical Information System and the National Tuberculosis Patient Registry. These included demographic variables (sex, age, place of residence, and nationality) and clinical characteristics (history of TB treatment, concomitant disease, BMI, and HIV status).
Sample processing and culture procedures
During routine diagnostic work-up, each sputum specimen was subjected to both molecular and culture-based testing. The Xpert MTB/RIF assay (Cepheid, Sunnyvale, CA, USA) was performed according to the manufacturer’s instructions for rapid detection of M. tuberculosis complex and rifampicin resistance. In parallel, sputum samples were liquefied and decontaminated using the N-acetyl-L-cysteine–sodium hydroxide method (Ratnam et al., 1987). Following decontamination, 0.5 ml of the processed sputum was inoculated into MGIT culture tubes and incubated at 37°C in the BACTEC MGIT 960 system (Becton Dickinson, Franklin Lakes, NJ, USA) to promote the growth and detection of M. tuberculosis. From the resulting positive MGIT cultures, a 0.5 ml aliquot of broth with sediment containing clinical strains was transferred into a cryovial using a pipette and stored at −80°C until further use for WGS. To obtain pure M. tuberculosis biomass for DNA extraction, sediment from MGIT cultures was re-cultured on Löwenstein–Jensen (LJ) solid medium and incubated at 37°C for up to 4 weeks.
WGS and data analysis
Genomic DNA was extracted and purified using the PureLink Genomic DNA Mini Kit (Invitrogen, Waltham, USA). DNA concentration was measured using both a NanoDrop spectrophotometer and the Qubit dsDNA HS Assay Kit (Thermo Fisher Scientific, Waltham, USA). A DNA concentration range of 0.2–0.4 ng/μL was used for sequencing library preparation with the Illumina Nextera XT Library Preparation Kit (Illumina, San Diego, USA). Whole-genome sequencing was performed on the Illumina MiSeq platform, generating paired-end FASTQ files.
Raw sequencing reads were processed using TBProfiler (v6.6.3), a tool that identifies M. tuberculosis lineages, drug resistance-associated mutations, and other genomic features (Phelan et al., 2019). Lineage assignment was carried out within TBProfiler using a validated SNP-based barcoding scheme, with alignment to the M. tuberculosis H37Rv reference genome (GenBank accession number NC_000962.3) under default parameters. For additional validation and phylogenetic tree construction, MTBseq (v.1.0.3) was employed (Kohl et al., 2018). A phylogenetic tree was constructed using the maximum likelihood method implemented in IQ-TREE2 (Minh et al., 2020) and visualized using iTOL.
The drug resistance spectrum of each strain was predicted for 16 anti-tuberculosis drugs using the latest WHO-recommended mutation catalogue (World Health Organization, 2023b). Only group 1 (associated with resistance) and group 2 (associated with resistance, interim) mutations were used for resistance prediction. Drug resistance classes were defined according to the updated 2021 WHO definitions. HR-TB was defined as resistance to isoniazid. MDR-TB was defined as resistance to at least isoniazid and rifampicin. Pre-extensively drug-resistant tuberculosis (pre-XDR-TB) was defined as MDR-TB with additional resistance to any fluoroquinolone. Extensively drug-resistant tuberculosis (XDR-TB) was defined as MDR-TB with additional resistance to any fluoroquinolone and at least one additional Group A drug (bedaquiline or linezolid; World Health Organization, 2021). Pairwise genetic distances were calculated to analyze TB transmission characteristics in the study areas, and genomic transmission clusters were defined using a cutoff of ≤12 SNPs.
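The class definitions above can be written as a small decision rule. The following Python sketch is illustrative only and is not the authors' pipeline: the function name, the drug-name strings, and the "Other resistance" label are hypothetical. It also assumes that HR-TB means isoniazid resistance without rifampicin resistance (the text states only "resistance to isoniazid"), and it takes levofloxacin and moxifloxacin as the fluoroquinolones.

```python
# Illustrative sketch of the resistance classes described in the text.
# All names below are hypothetical; they are not TBProfiler output fields.
FLUOROQUINOLONES = {"levofloxacin", "moxifloxacin"}   # assumed FQ members
OTHER_GROUP_A = {"bedaquiline", "linezolid"}          # as listed in the text


def classify_resistance(resistant_drugs):
    """Return a resistance class from the drugs an isolate is predicted resistant to."""
    drugs = {d.lower() for d in resistant_drugs}
    inh = "isoniazid" in drugs
    rif = "rifampicin" in drugs
    fq = bool(drugs & FLUOROQUINOLONES)
    group_a = bool(drugs & OTHER_GROUP_A)

    if inh and rif:
        if fq and group_a:
            return "XDR-TB"       # MDR + any FQ + bedaquiline or linezolid
        if fq:
            return "pre-XDR-TB"   # MDR + any FQ
        return "MDR-TB"
    if inh:
        return "HR-TB"            # assumption: isoniazid-resistant, rifampicin-susceptible
    if drugs:
        return "Other resistance" # e.g. streptomycin only; label is an assumption
    return "Susceptible"
```

For example, an isolate predicted resistant to isoniazid, rifampicin, and moxifloxacin would be labeled pre-XDR-TB under these assumptions, and the classes are mutually exclusive because the most advanced matching class is returned first.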
Statistical analysis
SPSS version 27 software (SPSS Inc., Chicago, Illinois) and R version 4.3.1 (RStudio, PBC, Boston, MA, USA) were used for the statistical analysis. Mantel and Mantel correlogram tests were used to assess correlations between continuous variables (geographic distances and SNP differences) across sample pairs. The test employed Pearson’s correlation coefficient to assess the association between genetic and spatial distances. Results with a p-value less than 0.05 were considered statistically significant.
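To make the Mantel test concrete, the sketch below shows the general technique in Python: the Pearson correlation between the upper triangles of two distance matrices, with a one-sided p-value obtained by permuting the isolate labels of one matrix. It is not the R code used in the study; the function names, the default of 999 permutations, and the fixed seed are assumptions made for illustration.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def upper_triangle(matrix, order):
    """Flatten the upper triangle after relabeling rows and columns by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def mantel_test(geo, snp, permutations=999, seed=0):
    """Return (r, p) for two square, symmetric distance matrices (lists of lists)."""
    identity = list(range(len(geo)))
    snp_vec = upper_triangle(snp, identity)
    r_obs = pearson(upper_triangle(geo, identity), snp_vec)

    rng = random.Random(seed)
    hits = 0
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # permute isolate labels of the geographic matrix
        if pearson(upper_triangle(geo, order), snp_vec) >= r_obs:
            hits += 1
    # One-sided p-value; the observed statistic is counted as one permutation.
    return r_obs, (hits + 1) / (permutations + 1)
```

Permuting whole rows and columns together, rather than individual matrix cells, is what keeps the test valid, because pairwise distances that share an isolate are not independent of one another.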
Results
Demographic and clinical characteristics
A total of 446 TB-positive cases, confirmed by Xpert MTB/RIF and MGIT, were collected between January 2023 and December 2024 at the NSCP in Almaty, Kazakhstan. After excluding 14 duplicate strains collected from the same patients, 432 representative M. tuberculosis strains were selected for WGS. Of these, WGS was performed on 295 strains, while the remaining 137 were not sequenced due to the lack of culturable M. tuberculosis, probably caused by sample degradation, poor bacterial growth, or contamination. A further 23 strains had low genome coverage (<50×) and were excluded from the study. Overall, 272 of the 432 selected isolates (63.0%) were successfully sequenced and included in the final analysis (Figure 1). Most patients (57.35%, 156/272) were male. Adolescents accounted for 10.29% (28/272) of the study population. The mean age was 40.56 years (range, 1–87 years), and 72.79% (198/272) of patients were of Kazakh nationality. Most patients were residents of Almaty-city and Almaty region (52.94%, 144/272), and 141 patients (51.84%) had no comorbidities. Pulmonary TB was diagnosed in 229 (84.19%) patients, and extrapulmonary TB in the other 43 (15.81%). Most patients (85.66%, 233/272) were newly diagnosed, whereas 39 (14.34%) had previously received anti-TB treatment. HIV-positive patients accounted for 8.82% (24/272). The mean BMI was 21 (range, 11.8–42.2). The detailed demographic information and clinical characteristics of the study population are presented in Table 1.

Figure 1. Classification of TB cases based on treatment history and genomic analysis. WGS, whole-genome sequencing.
Lineage distribution
A phylogenetic tree was constructed based on whole-genome sequences of 272 M. tuberculosis clinical isolates from NSCP in Almaty. Two major lineages were identified: 72.43% (197/272) of isolates were assigned to lineage 2 (East Asian genotype), and 26.84% (73/272) were assigned to lineage 4 (Euro-American genotype). Additionally, one isolate (0.37%, 1/272) was assigned to lineage 1 (Indo-Oceanic genotype), and one was identified as Mycobacterium bovis BCG, recovered from a 1-year-old infant with immunodeficiency who developed TB following BCG vaccination. Regarding sublineages, the majority (72.43%, 197/272) belonged to lineage 2.2.1. Within lineage 4, 27 isolates belonged to lineage 4.8 (9.93%, 27/272), 16 isolates to lineage 4.3.3 (5.88%, 16/272), and eight isolates to lineage 4.5 (2.94%, 8/272). A maximum likelihood phylogeny of the 272 M. tuberculosis isolates, calculated using a concatenated list of 13,844 SNPs (Figure 2), fully confirmed the phylogenetic classification of the strains investigated.

Figure 2. Phylogenetic tree of 272 TB strains isolated in NSCP, Almaty. Note the different colors on the branches indicate different lineages. The first outer circle indicates the presence or absence of katG S315T. The second outer circle indicates the presence or absence of rpoB S450L. The small circles of different colors on the outer middle ring indicate drug resistance. The third outer circle indicates genomic-clustered strains differing by ≤12 SNPs. The outer-most circle indicates genomic-clustered strains differing by ≤5 SNPs.
Resistance profile
Resistance to 16 anti-TB drugs was detected, with the overall rates (including both new and recurrent TB cases) ranked from highest to lowest as follows: streptomycin (SM, 136/272, 50%), isoniazid (INH, 127/272, 46.69%), ethambutol (EMB, 107/272, 39.34%), rifampicin (RIF, 90/272, 33.09%), pyrazinamide (PZA, 54/272, 19.85%), kanamycin (KM, 22/272, 8.09%), ethionamide (ETO, 18/272, 6.62%), fluoroquinolones (FQs, 13/272, 4.78%), amikacin (AM, 6/272, 2.21%), capreomycin (CM, 6/272, 2.21%), para-aminosalicylic acid (PAS, 6/272, 2.21%), linezolid (LZD, 4/272, 1.47%), bedaquiline (BDQ, 3/272, 1.1%), clofazimine (CFZ, 3/272, 1.1%), delamanid (DEL, 3/272, 1.1%), and pretomanid (PTO, 3/272, 1.1%). Strains resistant to cycloserine (Cs) were not detected. The rates of HR-TB, MDR-TB, pre-XDR, and XDR strains were 13.97% (38/272), 29.04% (79/272), 3.31% (9/272), and 0.74% (2/272), respectively. The remaining 130 M. tuberculosis isolates (130/272, 47.8%) showed no resistance-mediating mutations and were classified as drug susceptible. The distribution of resistance classes across major M. tuberculosis lineages is summarized in Table 2. A follow-up genetic analysis of drug resistance-associated mutations revealed that the most common mutation in RIF-resistant strains was rpoB S450L (82/90, 91.11%), while INH-resistant isolates most frequently carried the katG S315T mutation (126/127, 99.21%; Figure 3). Of the BDQ- and CFZ-resistant strains, two harbored the mutation mmpR5 c.139dupG, and one carried mmpR5 c.466dupC. Each of the three DEL-resistant strains exhibited a unique mutation: ddn c.223delG, ddn p.Glu118*, and fbiC c.2565_*56delCNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN.

Figure 3. Classification of drug resistance and mutation patterns in the rpoB and katG genes. (A) The frequency of resistance to each of the 16 drugs. (B) Pie chart showing the distribution of MDR or rifampicin-resistant, pre-XDR and XDR isolates. (C) Mutation patterns in the rpoB and katG genes. The number of strains is indicated in parentheses. MDR, multidrug resistant. Pre-XDR, pre-extensively drug resistant.
Clustering analysis
Overall, 80 M. tuberculosis strains were grouped into 22 clusters, ranging in size from two to 18 strains, suggesting recent transmission. The cluster rate, defined as the proportion of isolates within SNP-defined transmission clusters (≤12 SNPs), was 29.41% (80/272). The median SNP distance among clustered strains was 8.5; 90% (72/80) of them were Lineage 2, and all drug-resistant strains among them were also Lineage 2 (Table 3). Moreover, most clusters contained two strains, accounting for 32.5% of the clustered cases (26/80). A previous study showed that the presence of TB among new cases suggests the transmission of TB strains (Yang et al., 2017). Therefore, cases of newly diagnosed TB were combined with those in the genomic clusters, indicating that 87.87% (239/272) of the cases were likely caused by transmission in our study (Figure 1).
The maximum-likelihood tree of clustered strains showed 22 clusters, including the largest cluster (C18), whose strains spanned the study period (2023–2024). In total, 22.73% of the clusters (5/22) included strains with different drug-resistance spectra. In two clusters, drug-resistance profiles broadened over time: C7 gained resistance to RIF, and C20 gained resistance to PZA and FQs. By contrast, the ancestral strains from clusters C6 and C8 showed broader drug-resistance profiles than their descendants. In the largest cluster, C18, all strains shared an identical drug-resistance profile (RIF, INH, PZA, EMB, and SM), with only two harboring additional mutations conferring resistance to FQs.
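The cluster definition and cluster rate used above can be illustrated with a short Python sketch. The text does not state how isolates were linked into clusters, so single linkage is assumed here: two isolates join the same cluster whenever a chain of pairwise distances, each ≤12 SNPs, connects them. The function names and the toy inputs are hypothetical.

```python
def snp_clusters(names, dist, threshold=12):
    """Group isolates by single linkage on a pairwise SNP distance matrix.

    Assumption: an isolate joins a cluster if it is within `threshold` SNPs
    of at least one member. Only groups of two or more isolates are returned.
    """
    parent = list(range(len(names)))

    def find(i):
        # Union-find root lookup with path compression.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if dist[i][j] <= threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i, name in enumerate(names):
        groups.setdefault(find(i), []).append(name)
    return [sorted(members) for members in groups.values() if len(members) >= 2]


def clustering_rate(clusters, total_isolates):
    """Proportion of all isolates that fall inside any cluster."""
    return sum(len(c) for c in clusters) / total_isolates
```

With the figures reported in the text, 80 clustered isolates out of 272 give 80/272 ≈ 0.2941, which matches the stated 29.41%. Note that under single linkage two members of the same cluster may differ by more than 12 SNPs if an intermediate isolate bridges them.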
Spatial distribution of clustered patient samples
We analyzed the geographic distribution of patient residential addresses across all 22 identified clusters. Four clusters (18.2%) consisted of patients residing in the same town, five clusters (22.7%) included patients from different towns within the same region, and 13 clusters (59.1%) spanned different regions entirely. In the largest cluster, C18, half of the patients (9 out of 18) were from Almaty, with five of them residing within a 5 km radius of each other (Figure 4, red dots). Notably, in cluster C3, all patients lived in the same residential building, differing only by apartment number, which strongly suggests close-contact transmission.

Figure 4. Spatial distribution of unique and clustered M. tuberculosis strains in Almaty region (A) and Almaty-city (B). Black dots illustrate the spatial location of each strain and red dots illustrate the location of clustered strains.
To investigate whether geographic proximity was associated with genetic similarity of M. tuberculosis strains, we performed a geospatial correlation analysis using the Mantel test based on pairwise SNP distances. This analysis was restricted to residential addresses in Almaty and Almaty Region, which accounted for over 50% of the total dataset (Figures 4, 5). Limiting the analysis to this subset ensured sufficient data density and minimized confounding due to regional variability in transmission dynamics and healthcare infrastructure.
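A Mantel test needs a geographic distance matrix alongside the SNP matrix. The text does not say how distances between residential addresses were computed, so the Python sketch below assumes geocoded coordinates and great-circle (haversine) distances. The function names and the mean Earth radius constant are choices made for this illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; an assumption of this sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def geographic_distance_matrix(coords):
    """Symmetric matrix of pairwise distances for a list of (lat, lon) tuples."""
    n = len(coords)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = haversine_km(coords[i][0], coords[i][1], coords[j][0], coords[j][1])
            matrix[i][j] = matrix[j][i] = d
    return matrix
```

Patients sharing an address, as in cluster C3, would have a pairwise distance of zero in such a matrix, so they fall into the first distance class of any correlogram built on it.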

Figure 5. Distribution of M. tuberculosis isolates in Kazakhstan used in this study. Pie charts show the proportion of isolates among different lineages for each location. The size of pie charts corresponds with the number of isolates.
The global Mantel test revealed no statistically significant correlation between geographic distance and genetic distance among M. tuberculosis strains (Mantel statistic r = 0.06173, p = 0.1628), suggesting that, overall, closer residential proximity did not correspond to greater genetic similarity.
However, a more detailed spatial autocorrelation analysis using the Mantel correlogram uncovered significant correlations at specific distance classes. Significant positive correlation was observed in the first distance class (D.cl.1; Mantel r = 0.0634, p = 0.041), indicating that patients living very close to each other (within ~5 km; range 0–8 km) were more likely to carry genetically similar strains.
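The per-class statistic in a Mantel correlogram can be sketched as the correlation between SNP distances and a binary design matrix that marks which isolate pairs fall inside a given distance class. The Python example below is a simplified illustration, not the implementation used in the study: the 0/1 coding (0 for pairs inside the class, 1 for pairs outside, so that a positive r means pairs inside the class are genetically closer) and the function names are assumptions, and the permutation step needed for a p-value is omitted.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def distance_class_r(geo_km, snp, lower_km, upper_km):
    """Mantel r for one distance class of a correlogram.

    Design value is 0 for pairs whose geographic distance lies in
    [lower_km, upper_km] and 1 otherwise (assumed coding). A positive result
    means pairs inside the class have smaller SNP distances than other pairs.
    """
    design, genetic = [], []
    n = len(geo_km)
    for i in range(n):
        for j in range(i + 1, n):
            inside = lower_km <= geo_km[i][j] <= upper_km
            design.append(0.0 if inside else 1.0)
            genetic.append(float(snp[i][j]))
    return pearson(design, genetic)
```

Calling `distance_class_r(geo_km, snp, 0, 8)` would correspond to the first distance class reported above (range 0–8 km). Significance would then be assessed by permuting isolate labels, as in an ordinary Mantel test.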
Discussion
To the best of our knowledge, this is the first comprehensive molecular epidemiological study using WGS to characterize M. tuberculosis genetic diversity, drug resistance, and transmission dynamics over a two-year period in Kazakhstan. Among the 272 successfully sequenced M. tuberculosis isolates, the dominant lineage was lineage 2.2.1 (East Asian genotype), accounting for 72.43% of cases, while the clustering rate—indicating potential recent transmission—was 29.41%. Notably, 92.65% (252/272) of TB cases were associated with recent transmission, as inferred by genomic clustering and case status, highlighting ongoing community-level spread.
Overall, 52.2% (142/272) of isolates harbored resistance to at least one anti-TB drug, and 29.04% (79/272) met the definition of MDR-TB. Importantly, 13.97% (38/272) of cases were HR-TB, underscoring the clinical significance of detecting isoniazid resistance early, especially as this resistance is not identified by the widely used GeneXpert MTB/RIF assay. In such cases, treatment regimens may be suboptimal, facilitating ongoing transmission and further resistance acquisition. These findings support the urgent need for broader implementation of diagnostics such as the cobas assay (Roche), the BD MAX (Becton Dickinson), and the Xpert MTB/XDR (Cepheid), which detect specific canonical isoniazid resistance-conferring mutations and thus improve HR-TB case detection (Klopper et al., 2024).
Consistent with global patterns, the most frequent mutations observed were katG S315T (99.21%, 126/127) for INH resistance and rpoB S450L (91.11%, 82/90) for RIF resistance. The rpoB S450L mutation causes the least fitness loss and is often accompanied by compensatory mutations in rpoA or rpoC, which can fully restore fitness (Lempens et al., 2023). The strikingly high prevalence of the katG S315T mutation in Kazakhstan compared to other regions (Australia: 65.4%, India: 71%, China: 63%, Iran: 53.3%) suggests a dominant, actively transmitted INH-resistant lineage (Sailo et al., 2022). While data from neighboring Central Asian countries are sparse, our findings likely reflect regional trends and emphasize the necessity of expanded molecular surveillance across the region (Auganova et al., 2023; Daniyarov et al., 2021).
Globally, most studies have shown that mutations within the 81-bp rifampicin-resistance-determining region (RRDR) of the rpoB gene account for over 95% of RIF resistance (Sailo et al., 2022; Nono et al., 2025). In this study, all RIF-resistant strains carried mutations within the 81-bp RRDR of the rpoB gene, with the S450L substitution dominating.
Importantly, pre-XDR-TB and XDR-TB were identified in 3.31% and 0.74% of isolates, respectively, rates notably higher than the global averages reported in the 2024 WHO TB report. Mutations associated with resistance to bedaquiline and clofazimine were found in mmpR5 (c.139dupG and c.466dupC), both of which are predicted to cause frameshifts leading to loss of repressor function and overexpression of the MmpL5 efflux pump (Sonnenkalb et al., n.d.; Snobre et al., 2024). Delamanid resistance–associated mutations were detected in ddn (c.223delG and p.Glu118*, both likely resulting in truncated, nonfunctional proteins) and fbiC (c.2565_*56del…), which is involved in cofactor F420 biosynthesis (Liu et al., 2022). These genetic alterations highlight multiple pathways to resistance, some of which may arise under selective pressure from prior drug exposure or cross-resistance mechanisms. Together, these findings indicate an emerging threat of advanced drug resistance in the region, underscoring the urgent need for ongoing genomic surveillance and the implementation of individualized treatment regimens informed by whole-genome sequencing or comprehensive drug susceptibility testing.
The clustering rate of 29.41% observed in this study is consistent with reports from other Central Asian countries, where active transmission is the primary driver of MDR-TB (Auganova et al., 2023; Engström et al., 2019). In contrast, Western European countries tend to report lower clustering rates, suggesting reactivation plays a larger role there (Guthmann and Haas, 2019; Walls and Shingadia, 2007). In Kazakhstan, especially in densely populated areas like Almaty, the high clustering rate and identification of multiple geographically and temporally linked clusters support the hypothesis of sustained community transmission.
Cluster C18, the largest identified, included strains spanning both years; half of its patients (9/18) were from Almaty, five of whom lived within a 5 km radius of each other. Intra-cluster evolution was also observed, such as the progressive acquisition of resistance mutations in clusters C7 and C20. This indicates that some strains are not only being transmitted but are also evolving resistance in situ, possibly due to incomplete or ineffective treatment.
Spatial analysis revealed that 59.1% of clusters included patients from different regions, highlighting the potential role of inter-regional migration or mobility in the dissemination of M. tuberculosis strains. In some clusters, patients shared identical or nearly identical addresses, emphasizing the need to strengthen TB control in high-density housing or communal living environments.
To further explore the relationship between geographic proximity and genetic similarity of M. tuberculosis strains, we conducted a Mantel test using pairwise SNP distance matrices and geographic coordinates of patient residences. The global analysis did not reveal a statistically significant correlation (Mantel r = 0.06173, p = 0.1628). However, a more detailed spatial autocorrelation analysis using the Mantel correlogram uncovered a significant positive correlation in the first distance class (D.cl.1; Mantel r = 0.0634, p = 0.041), indicating that patients living very close to each other (within ~5 km) were more likely to carry genetically similar strains. This suggests spatial structuring of genetic diversity that may reflect complex patterns of TB transmission or mobility within and between communities. Thus, while a broad correlation between geographic and genetic distance was not evident, spatial structure does exist at finer scales, supporting the hypothesis of localized transmission within the Almaty area.
The predominance of transmission-driven MDR-TB, the high proportion of isoniazid-monoresistant strains, and the emergence of pre-XDR/XDR-TB point to substantial gaps in current TB control measures in Kazakhstan. Strengthening contact tracing, enhancing community-based case finding, and introducing WGS-based surveillance in routine diagnostic workflows are crucial next steps (Burke et al., 2021; MacPherson et al., 2024). Special attention should be given to newly diagnosed cases and their close contacts, as the majority of clustered strains were found among new cases, indicating undetected chains of transmission.
Furthermore, molecular tools that can detect INH resistance should be prioritized to capture the full resistance spectrum. Incorporating WGS into national TB programs can enhance outbreak detection, guide individualized therapy, and provide real-time epidemiological insights.
Several limitations must be acknowledged. First, our analysis was restricted to culture-positive cases, potentially underestimating the burden of transmission from culture-negative patients. Second, we lacked detailed sociodemographic data (e.g., incarceration, homelessness, drug use) that may contribute to TB transmission risk. Lastly, as isolates were collected from a single national reference center, regional disparities in strain diversity and resistance patterns may have been missed.
This study provides a detailed genomic snapshot of the M. tuberculosis epidemic in Kazakhstan, revealing high levels of drug resistance, ongoing community transmission, and the dominance of a highly resistant lineage 2.2.1 strain. These findings highlight the urgent need for comprehensive TB control strategies incorporating WGS, rapid molecular diagnostics, and targeted interventions. Detecting isoniazid resistance—particularly given its high monoresistance rate (14%)—should be prioritized in clinical practice. As XDR-TB strains begin to emerge, national preparedness must include enhanced diagnostics, optimized treatment regimens, and regional collaboration to prevent the further spread of drug-resistant TB.
Data availability statement
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found here: https://www.ncbi.nlm.nih.gov/, PRJNA1277026.
Ethics statement
The studies involving humans were approved by the bio-ethical review board of National Scientific Center of Phthisiopulmonology of Ministry of Health of Republic of Kazakhstan (EPM no 2021-63), approval 2021-07-07. The studies were conducted in accordance with the local legislation and institutional requirements. Informed consent was obtained from all subjects and/or their legal guardian(s).
Author contributions
NT: Writing – original draft, Writing – review & editing. AK: Writing – original draft, Writing – review & editing. AM: Writing – original draft, Writing – review & editing. LC: Writing – original draft, Writing – review & editing. BT: Writing – original draft, Writing – review & editing. VB: Writing – original draft, Writing – review & editing. MA: Writing – original draft, Writing – review & editing. LE: Writing – original draft, Writing – review & editing. NN: Writing – original draft, Writing – review & editing. GZ: Writing – original draft, Writing – review & editing.
Funding
The author(s) declare that no financial support was received for the research and/or publication of this article.
Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Generative AI statement
The author(s) declare that no Gen AI was used in the creation of this manuscript.
Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.
Publisher’s note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
References
Auganova, D., Atavliyeva, S., Akisheva, A., Tsepke, A., and Tarlykov, P. (2023). Draft genome sequence of a multidrug-resistant Mycobacterium tuberculosis clinical isolate, 3184-KZ. Microbiol. Resour. Announc. 12:e0086323. doi: 10.1128/MRA.00863-23
Auganova, D., Atavliyeva, S., Amirgazin, A., Akisheva, A., Tsepke, A., and Tarlykov, P. (2023). Genomic characterization of drug-resistant Mycobacterium tuberculosis L2/Beijing isolates from Astana, Kazakhstan. Antibiotics (Basel) 12:1523. doi: 10.3390/antibiotics12101523
Authors (n.d.-a). Available online at: https://endtb.org/kazakhstan (Accessed March 3, 2025).
Authors (n.d.-b). Available at: https://adilet.zan.kz/rus/docs/V2000021695 (Accessed 30 May, 2025).
Burke, R. M., Nliwasa, M., Feasey, H. R. A., Chaisson, L. H., Golub, J. E., Naufal, F., et al. (2021). Community-based active case-finding interventions for tuberculosis: a systematic review. Lancet Public Health 6, e283–e299. doi: 10.1016/S2468-2667(21)00033-5
Daniyarov, A., Akhmetova, A., Rakhimova, S., Abilova, Z., Yerezhepov, D., Chingissova, L., et al. (2023). Whole-genome sequence-based characterization of pre-XDR M. tuberculosis clinical isolates collected in Kazakhstan. Diagnostics (Basel) 13:2005. doi: 10.3390/diagnostics13122005
Daniyarov, A., Molkenov, A., Rakhimova, S., Akhmetova, A., Nurkina, Z., Yerezhepov, D., et al. (2020). Whole genome sequence data of Mycobacterium tuberculosis XDR strain, isolated from patient in Kazakhstan. Data Brief 33:106416. doi: 10.1016/j.dib.2020.106416
Daniyarov, A., Molkenov, A., Rakhimova, S., Akhmetova, A., Yerezhepov, D., Chingissova, L., et al. (2021). Genomic analysis of multidrug-resistant Mycobacterium tuberculosis strains from patients in Kazakhstan. Front. Genet. 12:683515. doi: 10.3389/fgene.2021.683515
Engström, A., Antonenka, U., Kadyrov, A., Kalmambetova, G., Kranzer, K., Merker, M., et al. (2019). Population structure of drug-resistant Mycobacterium tuberculosis in Central Asia. BMC Infect. Dis. 19:908. doi: 10.1186/s12879-019-4480-7
Guthmann, J. P., and Haas, W. (2019). Tuberculosis in the European Union/European economic area: much progress, still many challenges. Euro Surveill. 24:1900174. doi: 10.2807/1560-7917.ES.2019.24.12.1900174
Kairov, U., Kozhamkulov, U., Rakhimova, S., Askapuli, A., Zhabagin, M., Bismilda, V., et al. (2014). Whole genome sequencing of M.Tuberculosis in Kazakhstan: preliminary data. Cent. Asian J. Glob. Health 2:121. doi: 10.5195/cajgh.2013.121
Klopper, M., van der Merwe, C. J., van der Heijden, Y. F., Folkerts, M., Loubser, J., Streicher, E. M., et al. (2024). The hidden epidemic of isoniazid-resistant tuberculosis in South Africa. Ann. Am. Thorac. Soc. 21, 1391–1397. doi: 10.1513/AnnalsATS.202312-1076OC
Kohl, T. A., Utpatel, C., Schleusener, V., De Filippo, M. R., Beckert, P., Cirillo, D. M., et al. (2018). MTBseq: a comprehensive pipeline for whole genome sequence analysis of Mycobacterium tuberculosis complex isolates. PeerJ 6:e5895. doi: 10.7717/peerj.5895
Lempens, P., Van Deun, A., Aung, K. J. M., Hossain, M. A., Behruznia, M., Decroo, T., et al. (2023). Borderline rpoB mutations transmit at the same rate as common rpoB mutations in a tuberculosis cohort in Bangladesh. Microb. Genom. 9:001109. doi: 10.1099/mgen.0.001109
Liu, D., Huang, F., Zhang, G., He, W., Ou, X., He, P., et al. (2022). Whole-genome sequencing for surveillance of tuberculosis drug resistance and determination of resistance level in China. Clin. Microbiol. Infect. 28, 731.e9–731.e15. doi: 10.1016/j.cmi.2021.09.014
Liu, Y., Shi, J., Li, L., Wu, T., Chu, P., Pang, Y., et al. (2022). Spontaneous mutational patterns and novel mutations for Delamanid resistance in Mycobacterium tuberculosis. Antimicrob. Agents Chemother. 66:e0053122. doi: 10.1128/aac.00531-22
MacPherson, P., Shanaube, K., Phiri, M. D., Rickman, H. M., Horton, K. C., Feasey, H. R. A., et al. (2024). Community-based active-case finding for tuberculosis: navigating a complex minefield. BMC Glob. Public Health 2:9. doi: 10.1186/s44263-024-00042-9
Merker, M., Barbier, M., Cox, H., Rasigade, J. P., Feuerriegel, S., Kohl, T. A., et al. (2018). Compensatory evolution drives multidrug-resistant tuberculosis in Central Asia. eLife 7:e38200. doi: 10.7554/eLife.38200
Minh, B. Q., Schmidt, H. A., Chernomor, O., Schrempf, D., Woodhams, M. D., von Haeseler, A., et al. (2020). IQ-TREE 2: new models and efficient methods for phylogenetic inference in the genomic era. Mol. Biol. Evol. 37, 1530–1534. doi: 10.1093/molbev/msaa015
Nono, V. N., Nantia, E. A., Mutshembele, A., Teagho, S. N., Simo, Y. W. K., Takong, B. S., et al. (2025). Prevalence of katG and inhA mutations associated with isoniazid resistance in Mycobacterium tuberculosis clinical isolates in Cameroon. BMC Microbiol. 25:127. doi: 10.1186/s12866-025-03816-9
Phelan, J. E., O'Sullivan, D. M., Machado, D., Ramos, J., Oppong, Y. E. A., Campino, S., et al. (2019). Integrating informatics tools and portable sequencing technology for rapid detection of resistance to anti-tuberculous drugs. Genome Med. 11:41. doi: 10.1186/s13073-019-0650-x
Ratnam, S., Stead, F. A., and Howes, M. (1987). Simplified acetylcysteine-alkali digestion-decontamination procedure for isolation of mycobacteria from clinical specimens. J. Clin. Microbiol. 25, 1428–1432. doi: 10.1128/jcm.25.8.1428-1432.1987
Sailo, C. V., Lalremruata, R., Sanga, Z., Fela, V., Kharkongor, F., Chhakchhuak, Z., et al. (2022). Distribution and frequency of common mutations in rpoB gene of Mycobacterium tuberculosis detected by Xpert MTB/RIF and identification of residential areas of rifampicin resistant-TB cases: a first retrospective study from Mizoram, Northeast India. J. Clin. Tuberc. Other Mycobact. Dis. 29:100342. doi: 10.1016/j.jctube.2022.100342
Satta, G., Lipman, M., Smith, G. P., Arnold, C., Kon, O. M., and McHugh, T. D. (2018). Mycobacterium tuberculosis and whole-genome sequencing: how close are we to unleashing its full potential? Clin. Microbiol. Infect. 24, 604–609. doi: 10.1016/j.cmi.2017.10.030
Snobre, J., Meehan, C. J., Mulders, W., Rigouts, L., Buyl, R., de Jong, B. C., et al. (2024). Frameshift mutations in the mmpR5 gene can have a bedaquiline-susceptible phenotype by retaining a protein structure and function similar to wild-type Mycobacterium tuberculosis. Antimicrob. Agents Chemother. 68:e0085424. doi: 10.1128/aac.00854-24
Sonnenkalb, L., Carter, JJ., Spitaleri, A., Iqbal, Z., Hunt, M., Malone, KM., et al. (n.d.). Bedaquiline and clofazimine resistance in Mycobacterium tuberculosis: an in-vitro and in-silico data analysis. Lancet Microbe 4, e358–e368. doi: 10.1016/S2666-5247(23)00002-2
Tarlykov, P., Atavliyeva, S., Alenova, A., and Ramankulov, Y. (2020). Genomic analysis of Latin American-Mediterranean family of Mycobacterium tuberculosis clinical strains from Kazakhstan. Mem. Inst. Oswaldo Cruz 115:e200215. doi: 10.1590/0074-02760200215
Walls, T., and Shingadia, D. (2007). The epidemiology of tuberculosis in Europe. Arch. Dis. Child. 92, 726–729. doi: 10.1136/adc.2006.102889
World Health Organization (2021). Meeting report of the WHO expert consultation on the definition of extensively drug-resistant tuberculosis, 27–29 October 2020. Geneva: World Health Organization. CC BY-NC-SA 3.0 IGO.
World Health Organization (2023a). Global tuberculosis report 2023. Geneva: World Health Organization. Licence: CC BY-NC-SA 3.0 IGO.
World Health Organization (2023b). Catalogue of mutations in Mycobacterium tuberculosis complex and their association with drug resistance, second edition. Geneva: World Health Organization. Licence: CC BY-NC-SA 3.0 IGO.
World Health Organization (2024). Global tuberculosis report 2024. Geneva: World Health Organization. Licence: CC BY-NC-SA 3.0 IGO.
World Health Organization. (n.d.). Tuberculosis Profile: Kazakhstan Available online at: https://worldhealthorg.shinyapps.io/tb_profiles/?_inputs_&entity_type=%22country%22&lan=%22EN%22&iso2=%22KZ%22 (Accessed May 30, 2025).
Yang, C., Luo, T., Shen, X., Wu, J., Gan, M., Xu, P., et al. (2017). Transmission of multidrug-resistant Mycobacterium tuberculosis in Shanghai, China: a retrospective observational study using whole-genome sequencing and epidemiological investigation. Lancet Infect. Dis. 17, 275–284. doi: 10.1016/S1473-3099(16)30418-2
Keywords: tuberculosis, whole-genome sequencing (WGS), drug resistance, transmission, Mycobacterium tuberculosis, phylogenetic diversity, Kazakhstan
Citation: Takenov N, Kaziyev A, Mukhamadi A, Chingissova L, Toxanbayeva B, Bismilda V, Adenov M, Eralieva L, Nakisbekov N and Zhunussova G (2025) Genetic diversity, drug resistance, and transmission patterns of tuberculosis based on whole-genome sequencing in Almaty, Kazakhstan. Front. Microbiol. 16:1649137. doi: 10.3389/fmicb.2025.1649137
Edited by:
Vijay Srinivasan, Texas A&M University, United States
Reviewed by:
Arup Ghosh, Norwegian Institute of Public Health (NIPH), Norway
Richa Dwivedi, Meharry Medical College, United States
Copyright © 2025 Takenov, Kaziyev, Mukhamadi, Chingissova, Toxanbayeva, Bismilda, Adenov, Eralieva, Nakisbekov and Zhunussova. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Nurlan Takenov, takenovnur@gmail.com