
ORIGINAL RESEARCH article

Front. Microbiol., 26 November 2025

Sec. Systems Microbiology

Volume 16 - 2025 | https://doi.org/10.3389/fmicb.2025.1636322

Comparative metagenomic analysis on COPD and health control samples reveals taxonomic and functional motifs

Guangyi Chen1,2,3, Chantal Wiegand4, Andreas Willett4, Christian Herr4, Rolf Müller5,6, Robert Bals4,7*, Olga V. Kalinina2,3,8*
  • 1Graduate School of Computer Science, Saarland University, Saarbrücken, Saarland, Germany
  • 2Center for Bioinformatics, Saarland University, Saarbrücken, Germany
  • 3Research Group Drug Bioinformatics, Helmholtz Institute for Pharmaceutical Research Saarland (HIPS), Helmholtz Centre for Infection Research (HZI), Saarbrücken, Germany
  • 4Department of Internal Medicine V-Pulmonology, Allergology, Infectious Diseases and Critical Care Medicine, Saarland University, Homburg, Germany
  • 5Department of Microbial Natural Products, Helmholtz Institute for Pharmaceutical Research Saarland (HIPS), Helmholtz Centre for Infection Research (HZI), Saarbrücken, Germany
  • 6PharmaScienceHub, Saarbrücken, Germany
  • 7Department of Molecular Therapies for Lung Diseases, Helmholtz Institute for Pharmaceutical Research Saarland (HIPS), Helmholtz Centre for Infection Research (HZI), Saarbrücken, Germany
  • 8Medical Faculty, Saarland University, Homburg, Germany

Chronic obstructive pulmonary disease (COPD) is a progressive lung condition marked by persistent respiratory symptoms and airflow limitation and significantly affects global health. The intricate relationship between COPD and the lung microbiome has garnered attention, with metagenomic analyses revealing critical insights into microbial community dynamics and their functional roles. In this study, we conducted a comprehensive metagenomic analysis comparing throat samples from COPD patients (n = 26) and healthy controls (n = 32) derived from a large cohort analyzed at the Saarland University Hospital. Taxonomic profiling and differential abundance analysis indicated a significant reduction of the microbial diversity in COPD patients, with notable overrepresentation of pathogenic bacteria, such as Veillonella parvula (NCBI:txid29466), Streptococcus gordonii (NCBI:txid1302), Scardovia wiggsiae (NCBI:txid230143), as well as a less stable microbiome composition than in healthy individuals. Functional profiling identified alterations in metabolic pathways implicating microbial dysbiosis in disease progression. The study also highlighted enrichment of inflammation-related pathways in COPD samples, emphasizing the microbiome’s role in inflammatory processes. Comparative analysis of bronchoalveolar lavage (BAL) and throat samples collected from the same 11 individuals further underscored distinct microbial compositions across respiratory tract regions, suggesting spatial variability in microbial communities. Metagenomic approaches including analysis of metabolic pathways showed significant alteration of the microbiome of the lung in COPD.

1 Introduction

Chronic obstructive pulmonary disease (COPD) is a progressive inflammatory lung disease characterized by persistent airflow limitation and chronic bronchitis or emphysema. It is a leading cause of morbidity and mortality worldwide (3.5 million deaths in 2021, making it the fourth leading cause of death), significantly impacting the quality of life and placing a considerable burden on healthcare systems (Mayo Clinic, 2020; World Health Organization, 2023). COPD results from long-term exposure to harmful particles or gases, most commonly from smoking, which leads to abnormal inflammatory responses in the lungs (Cleveland Clinic, 2022). The chronic exposure to smoke in COPD causes an influx of myeloid cells (macrophages, neutrophils), activation of lymphoid cells, epithelial inflammation and remodeling, and an interaction between inflammatory processes and alterations of the microbiome (O’Donnell et al., 2006). In samples from COPD patients, changes in the composition and function of the microbiome have been observed. Studies using sputum and bronchoalveolar lavage (BAL) samples have shown distinct microbial communities in the upper and lower respiratory tracts of COPD patients (Zakharkina et al., 2013). Recent 16S rRNA gene sequencing and shotgun metagenomic studies suggest that these variations are associated with disease status, severity, and exacerbation risk, and may influence disease progression and exacerbation frequency (Ramirez et al., 2021; Pathak et al., 2020; Tangedal et al., 2024).

Metagenomic sequencing offers a culture-independent approach that enables comprehensive profiling of the microbial communities and their functional potentials directly from clinical samples (Pérez-Cobas et al., 2020). Metagenomic profiling involves the extraction and sequencing of microbial DNA from clinical samples, followed by bioinformatics analysis to identify microbial taxa (taxonomic profiling) and their functional genes and pathways (functional profiling) (Aguiar-Pulido et al., 2016). This approach allows for high-resolution analysis of the compositional microbiome, providing insights into the potential roles of specific microbes and their metabolic pathways in COPD, and further uncovers alterations in metabolic pathways related to lipid metabolism, oxidative stress, and immune responses in COPD patients (Bowerman et al., 2020; Dora et al., 2024).

Differential abundance analysis (DAA) is a critical component of metagenomic studies, as it identifies microbial taxa and functional genes/pathways that are significantly associated with disease states (Yang and Chen, 2022). To provide a more robust perspective, these findings are typically complemented by multivariate community-level analyses (e.g., ordination and PERMANOVA), which demonstrate overall differences in microbial composition between groups and thereby strengthen the evidence for disease-associated shifts (Kleine Bardenhorst et al., 2021; Xia and Sun, 2017). In the context of COPD, such analyses have highlighted specific bacterial species and functional pathways that are differentially abundant in patients compared to healthy controls. For instance, the increased presence of Proteobacteria and the depletion of beneficial commensals like Firmicutes have been linked to disease severity and exacerbations. In healthy individuals, the predominant phyla in the lungs are Firmicutes and Bacteroidetes, followed by Proteobacteria and Actinobacteria (Hou et al., 2022). Altered abundance of Pseudomonas, Moraxella, Lactobacillus, and Haemophilus has been identified during COPD exacerbations (Millares et al., 2014). Studies based on 16S rRNA gene amplification typically characterize the airway microbiome of COPD patients by a reduction in microbial diversity and an overrepresentation of potentially pathogenic bacteria in genera such as Streptococcus, Pseudomonas, Moraxella, and Haemophilus (Ramsheh et al., 2021; Millares et al., 2014). These alterations can disrupt the homeostasis of the respiratory tract, leading to increased inflammation and exacerbations (Po et al., 2011; De Matteis et al., 2019). Another type of analysis focuses on differentially represented genes and pathways. Specifically, COPD patients exhibited an enrichment of genes related to virulence, antibiotic resistance, and inflammation (Kayongo et al., 2022).

The aim of this study was to perform a detailed comparison of the microbiomes from upper respiratory tract samples from COPD patients and healthy controls from the IMAGINE study (Schmartz et al., 2024), as well as BAL samples from the University Hospital Saarland, applying metagenomic analysis of taxonomic and functional profiling. We demonstrate significant differences in the diversity and composition of the microbiome between COPD patients and controls already in the throat samples, alleviating the need to obtain sputum samples. We highlight inflammation-related genes and pathways that are enriched in the samples from the COPD patients.

2 Materials and methods

2.1 Sample collection and study design

This study capitalizes on the data collected by the IMAGINE consortium (Schmartz et al., 2024). The whole IMAGINE cohort consists of 3,483 samples from 657 individuals spanning different body sites, including saliva, interdental plaque, conjunctival swabs, throat swabs, stool, and skin swabs, among others. The disease information of these 657 individuals was also documented. To focus on the respiratory system, we selected the available throat samples from 32 healthy control individuals (without any disease) and 26 COPD patients, forming the two groups for this study (Supplementary Table 1). Additionally, to compare bronchoalveolar lavage (BAL) and throat samples, we selected the 11 individuals for whom both BAL and throat samples were available. Of these, 3 are COPD patients who also belong to the COPD group in comparison 1, and the remaining 8 are non-healthy individuals with other conditions (Supplementary Table 1). All 11 individuals are participants of the IMAGINE study (Schmartz et al., 2024). The BAL samples were collected internally at the University Hospital Saarland and are not part of the IMAGINE study.

We designed two comparisons (Figure 1A): comparison 1 focuses on testing the taxonomical and functional differences between the throat samples in COPD and control groups; comparison 2 is designed to test if there are significant differences in terms of the microbiome compositions between the BAL and throat samples (Figure 1B).

Figure 1
Diagram consisting of three parts: (A) depicts the cohort split into two groups—Throat Control (n=32) and Throat COPD (n=26), and BAL and Throat, both with n=11. (B) shows a Venn diagram comparing Throat Control and COPD (Comparison 1) with BAL and Throat (Comparison 2). (C) outlines a data processing workflow detailing steps from sequencing to analysis, including tools like KneadData, MetaPhlAn, HUMAnN, and MaAsLin2, focusing on raw data to differential analysis.

Figure 1. Design and Bioinformatics workflow of this study. (A) Study design. A total of 1931 high-quality metagenomic samples were obtained from IMAGINE cohort. Two sets of samples were compared in this study: Comparison 1 is used to compare the taxonomic and functional profiles of the COPD and control groups. Comparison 2 focused on comparing the taxonomic profiles between BAL (acquired additionally from the University Hospital Saarland) and throat samples collected from the same individuals. (B) Venn diagram of selected individuals in this study. (C) Workflow of this study.

2.2 Metagenomics analysis

The computational pipeline of this study is shown in Figure 1C.

For all the throat samples, we downloaded the preprocessed reads directly from the IMAGINE study in the Sequence Read Archive (SRA) under the accession code PRJNA1057503. For all the BAL samples, we collected the processed reads internally and uploaded them to SRA under the accession code PRJNA1327646. All reads were processed uniformly with the following pipeline from the IMAGINE study: the raw paired-end reads were first processed with KneadData (v0.7.4) to remove human read contamination (Beghini et al., 2021). The clean reads were then passed to fastp (v0.20.1) to trim low-quality reads (Chen, 2023). MultiQC (v1.11) was used to visualize the results (Ewels et al., 2016). The remaining filtered reads were used for further analysis.

To perform the taxonomic profiling, MetaPhlAn4 (4.1.0) was run on the filtered reads of each sample using the reference database mpa_vJun23_CHOCOPhlAnSGB_202307 to get the profiling report for each sample (Blanco-Míguez et al., 2023). The relative counts were normalized to 100%. The individual samples were merged into an aggregated text file. These profiling reports were used for further calculation of alpha and beta diversity using the auxiliary utilities from the same tool. Species and genus-level abundances were extracted for visualization (using hclust2 v1.0.0; SegataLab, 2020) and further differential abundance analysis.
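As a hedged illustration of the alpha diversity indices derived from the taxonomic profiles, the sketch below evaluates the Shannon and Simpson indices for a single sample. It is not the MetaPhlAn utility itself: the function names, the natural logarithm, the Gini-Simpson form (1 minus the sum of squared proportions), and the toy abundances are all assumptions made for this example.

```python
import math


def relative_abundances(counts):
    """Rescale non-negative abundances so they sum to 1 (zeros are dropped)."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("sample has no detected taxa")
    return [c / total for c in counts if c > 0]


def shannon_index(counts):
    """Shannon index H = -sum(p_i * ln p_i), using the natural logarithm."""
    return -sum(p * math.log(p) for p in relative_abundances(counts))


def simpson_index(counts):
    """Gini-Simpson index 1 - sum(p_i^2); an assumed convention for this sketch."""
    return 1.0 - sum(p * p for p in relative_abundances(counts))


# Hypothetical relative abundances (percent) for two samples.
even_sample = [25.0, 25.0, 25.0, 25.0]
skewed_sample = [85.0, 5.0, 5.0, 5.0]
print(shannon_index(even_sample), simpson_index(even_sample))
print(shannon_index(skewed_sample), simpson_index(skewed_sample))
```

In this toy case the evenly distributed sample scores higher on both indices than the sample dominated by one taxon, which is the pattern a loss of diversity would produce.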

For functional profiling, HUMAnN3 (3.6.1) with nucleotide database full_chocophlan_v201901_v31, translation database UniRef90, and the taxonomic profile from the previous step for each sample was employed (Beghini et al., 2021; Suzek et al., 2015). The output from this tool, namely identified MetaCyc pathway abundances with contributions from each specific species (stratified outputs), were then normalized to relative abundances, and individual samples were merged into an aggregated text file. In MetaCyc, microbial pathways are defined as metabolic pathways or biochemical reaction networks that are found in microbes (e.g., bacteria, archaea, fungi). MetaCyc provides detailed information about these pathways, describing how specific sequences of enzymatic reactions transform substrates into products (Caspi et al., 2020). We extracted the total abundance (unstratified) for each pathway from the aggregated profiles for further differential abundance analysis. In order to investigate the dynamics of pathways that are involved in inflammation, we searched the pathways that are relevant to inflammation in the MetaCyc database through a literature review and mapped them back to the pathway abundance results. Pathways were visualized using the ‘Pathway Collages’ tool from the MetaCyc website.

Welch’s t-test and the Chi-square test were performed using the Python package scipy (v1.16.2). PERMANOVA and Principal Coordinate Analysis (PCoA) were performed using the Python package scikit-bio (v0.7.0). We used the R package MaAsLin2 (version 1.14.1) to perform differential abundance analysis, fitting generalized linear models to identify microbial features significantly associated with the primary grouping factor (Comparison 1: COPD vs. Control; Comparison 2: Throat vs. BAL). For the comparison of throat samples between the COPD and control groups (Comparison 1), we tested the microbial features of the taxonomic profiles, functional profiles, and alpha diversity. For the taxonomic and functional profiles, we applied the following test settings: fixed effects: group (COPD and control, main interest for testing), age, sex, and BMI (covariates); analysis method: linear model; minimal required prevalence: 10%; Benjamini-Hochberg correction; Total Sum Scaling (TSS) normalization; log transformation (Mallick et al., 2021). For taxonomic profiles, we focused on testing the species- and genus-level relative abundances. For functional profiles, we focused on testing the unstratified profile (community-level abundance) to reduce the number of features. For the comparison between BAL and throat samples (Comparison 2), we tested the alpha diversity by applying the following test settings: fixed effects: group (BAL and throat, main interest for testing), age, sex, and BMI (covariates); analysis method: linear model; minimal required prevalence: 10%. We tested the Shannon and Simpson alpha diversity indices (Shannon, 1948; Simpson, 1949). Venn diagrams used in this study were created using DeepVenn (Hulsen, 2022).
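Two of the settings named above, Total Sum Scaling and the Benjamini-Hochberg correction, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not MaAsLin2's own code: the function names and the toy numbers are hypothetical, and the log transformation and model fitting are deliberately left out.

```python
def tss_normalize(abundances):
    """Total Sum Scaling: divide each feature by the sample total."""
    total = sum(abundances)
    if total == 0:
        return [0.0 for _ in abundances]
    return [a / total for a in abundances]


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values) in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that q-values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values


# Hypothetical abundances for one sample and p-values for four features.
sample = [120.0, 60.0, 20.0]
p_values = [0.01, 0.04, 0.03, 0.20]
print(tss_normalize(sample))
print(benjamini_hochberg(p_values))
```

The adjusted values are what a q-value threshold is applied to; the raw p-values are never compared to that threshold directly.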

3 Results

3.1 Summary of study participants

In the IMAGINE cohort, each individual is associated with metadata including, for example, disease status, age, and sex. We identified 38 COPD patients and 46 healthy individuals (participants without any known disease) as the healthy controls in this cohort. To focus on COPD-relevant samples, we selected throat samples, resulting in 26 and 32 available throat samples for the COPD and control groups, respectively (Table 1). Age and BMI differ significantly between the COPD and control groups (Welch’s t-test, age: p < 0.0001; BMI: p = 0.0164). Additionally, to compare the microbiome composition between BAL and throat samples, we collected data for 11 individuals (among which 3 are COPD patients, Figure 1B), whose BAL and throat samples are both available (Table 1). See Supplementary Table 1 for the complete metadata.
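For readers unfamiliar with Welch's t-test, the following sketch computes its test statistic and the Welch-Satterthwaite degrees of freedom with the Python standard library only. The function name and the ages are hypothetical and do not come from the cohort; the p-value itself (which the study obtained with scipy) is not computed here.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    # Sample variances (n - 1 in the denominator), not pooled across groups.
    term_a = statistics.variance(sample_a) / len(sample_a)
    term_b = statistics.variance(sample_b) / len(sample_b)
    t_stat = (mean_a - mean_b) / math.sqrt(term_a + term_b)
    df = (term_a + term_b) ** 2 / (
        term_a ** 2 / (len(sample_a) - 1) + term_b ** 2 / (len(sample_b) - 1)
    )
    return t_stat, df


# Hypothetical ages for two small groups.
copd_ages = [61, 65, 70, 72, 68]
control_ages = [30, 45, 38, 52, 41]
print(welch_t(copd_ages, control_ages))
```

Unlike Student's t-test, this form does not assume equal variances in the two groups, which suits groups of different size and spread.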

Table 1

Table 1. Baseline summary of the individuals in this study.

3.2 Both taxonomic and functional profiles show a higher diversity in the control group over the COPD group

PERMANOVA analysis indicated a statistically significant difference in throat microbiome composition at the species level between COPD and control cohorts (Bray–Curtis dissimilarity; pseudo-F = 3.16, p = 0.002, 999 permutations). This result was corroborated by the principal coordinates analysis (PCoA), which revealed clear group separation along the first two principal coordinates based on Bray–Curtis dissimilarities (Figure 2A).
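The PERMANOVA statistic reported above can be illustrated with a small standard-library sketch: Bray-Curtis dissimilarities, the pseudo-F ratio of among-group to within-group variation, and a permutation p-value. This is not the scikit-bio implementation used in the study; the function names, the seed, and the toy abundance profiles are assumptions for illustration only.

```python
import random


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    denom = sum(u) + sum(v)
    if denom == 0:
        return 0.0
    return sum(abs(a - b) for a, b in zip(u, v)) / denom


def distance_matrix(samples):
    n = len(samples)
    return [[bray_curtis(samples[i], samples[j]) for j in range(n)] for i in range(n)]


def pseudo_f(dist, labels):
    """Pseudo-F computed from squared pairwise distances."""
    n = len(labels)
    groups = sorted(set(labels))
    ss_total = sum(dist[i][j] ** 2 for i in range(n) for j in range(i + 1, n)) / n
    ss_within = 0.0
    for g in groups:
        idx = [i for i, lab in enumerate(labels) if lab == g]
        ss_within += sum(
            dist[i][j] ** 2 for k, i in enumerate(idx) for j in idx[k + 1:]
        ) / len(idx)
    ss_among = ss_total - ss_within
    return (ss_among / (len(groups) - 1)) / (ss_within / (n - len(groups)))


def permanova(dist, labels, permutations=999, seed=0):
    """Observed pseudo-F and permutation p-value from shuffled group labels."""
    rng = random.Random(seed)
    observed = pseudo_f(dist, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        if pseudo_f(dist, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Hypothetical species profiles: three control and three COPD samples.
profiles = [[50, 30, 20], [45, 35, 20], [55, 25, 20],
            [10, 20, 70], [15, 15, 70], [5, 25, 70]]
labels = ["control"] * 3 + ["COPD"] * 3
f_stat, p_value = permanova(distance_matrix(profiles), labels)
print(f_stat, p_value)
```

With so few samples the permutation p-value cannot become very small, because only a handful of distinct label assignments exist; larger groups give the test its resolution.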

Figure 2
(A) Scatter plot showing Principal Coordinates Analysis (PCoA) of data from Control and COPD groups, with red and blue dots representing each group, encircled by dashed lines. (B) Heatmap illustrating microbial species abundance between Control and COPD groups, with a color gradient indicating relative abundance. Labels on the right detail specific species. (C) Similar heatmap comparing microbial genera between the two groups, with labels indicating different genera. Both heatmaps use a color scale bar for reference above.

Figure 2. COPD and control species profiling comparison. (A) Principal Coordinates Analysis (PCoA) based on Bray–Curtis dissimilarity of throat microbiome samples from COPD patients and healthy controls. Each point represents a sample, colored by group (blue: Control; red: COPD). Dashed ellipses indicate the 95% confidence interval of each group, showing partial separation along the first two principal coordinates (PC1: 20.69% variance explained; PC2: 15.30% variance explained). (B) The top 20 variable species in both the COPD and control groups; (C) The top 20 variable genera in both the COPD and control groups.

Taxonomic profiling showed that more distinct bacterial species and genera were detected in the control group than in the COPD samples (control: 230, COPD: 179, shared: 151). For the control group, the most abundant species include Neisseria subflava (NCBI:txid28449), Rothia mucilaginosa (NCBI:txid43675), Veillonella dispar (NCBI:txid39778), Veillonella atypica (NCBI:txid39777), and Schaalia species (NCBI:txid2529408), and the most abundant genera are Neisseria (NCBI:txid482), Veillonella (NCBI:txid29465), Schaalia, Rothia (NCBI:txid32207), and Actinomyces (NCBI:txid1654). These results are largely consistent with the findings reported in a previous study (Natalini et al., 2023). For the COPD group, we detected similar species and genera as most abundant, but their distribution is skewed compared to the control group, with a more dominant abundance of Rothia mucilaginosa at the species level and Veillonella at the genus level (Supplementary Figures 1A,B).

Among the top 20 species and genera with the largest abundance variation across all COPD and control samples (Figures 2B,C), we observed that the most variable taxa for healthy controls agree well with those observed in the sample-wise profiles, while the COPD samples show more variable abundances for these taxa. Further, differential abundance analysis revealed 73 species and 40 genera significantly enriched in the control group, and 43 species and 15 genera significantly enriched in the COPD group (Supplementary Tables 2, 3). The results align closely with the findings of the previous study (Natalini et al., 2023) (Figure 3).

Figure 3
Two Volcano plots labeled A and B compare sample groups based on effect size and -log10(q-value). In plot A, blue dots indicate control-enriched data, with top controls marked in blue stars, and red dots show COPD-enriched data, with top COPD markers in red stars. Notable species include Veillonella rogosae and Veillonella parvula. Plot B similarly categorizes data, noting species like Candidatus Absconditabacteria and Lactobacillus. Gray dots represent insignificant data points, with dashed lines marking the significance threshold at a q-value of 0.25. Each plot has a legend explaining dot colors and symbols.

Figure 3. Volcano plots of differential abundant taxa between COPD and control groups in throat samples using MaAsLin2. Each point represents one taxon, plotted by effect size (x-axis) and –log10-transformed q-value (y-axis). Gray points indicate non-significant species, while red and blue points denote COPD- and control-enriched species, respectively. Stars highlight the top three significantly enriched taxa per group. The horizontal dashed line marks the default significance threshold at q-value = 0.25 by MaAsLin2. (A) Results for species-level. (B) Results for genus-level.
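The way a volcano plot of this kind sorts taxa into groups can be sketched as follows. The sketch assumes that a positive effect size means enrichment in COPD (that is, that the control group is the reference level) and that a q-value at or below the 0.25 threshold counts as significant; both are assumptions of this example, as are the function names and the placeholder taxa.

```python
import math


def classify_taxon(coef, qval, threshold=0.25):
    """Label a taxon by significance and direction of its effect size."""
    if qval > threshold:
        return "not significant"
    return "COPD-enriched" if coef > 0 else "control-enriched"


def volcano_point(coef, qval):
    """Coordinates of one taxon: effect size against -log10(q-value)."""
    return coef, -math.log10(qval)


# Hypothetical (name, effect size, q-value) triples.
results = [("taxon_A", 1.8, 0.01), ("taxon_B", -2.1, 0.10), ("taxon_C", 0.4, 0.60)]
for name, coef, qval in results:
    x, y = volcano_point(coef, qval)
    print(name, classify_taxon(coef, qval), round(x, 2), round(y, 2))
```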

Interestingly, the control samples have statistically significantly higher alpha diversities (Shannon and Simpson) than the COPD samples (Figures 4A,B and Supplementary Figure 2). Beta diversity (Bray-Curtis) analysis indicated that the control group showed slightly smaller intra-group diversity (0.658 ± 0.193) compared to the COPD group (0.726 ± 0.197) (Figure 4C). These results suggest that the lung microbiome of COPD patients tends, on the one hand, to comprise fewer different bacteria, but on the other hand, to have a more variable composition between patients, as compared to the healthy controls.
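Grouping pairwise beta diversities into intra-group and inter-group comparisons amounts to partitioning the upper triangle of a distance matrix by the labels of each sample pair. A minimal sketch, assuming a precomputed symmetric dissimilarity matrix; the function names and the small matrix are hypothetical and not taken from the study data.

```python
import statistics


def grouped_pairwise(dist, labels):
    """Collect pairwise distances keyed by the (sorted) pair of group labels."""
    grouped = {}
    n = len(labels)
    for i in range(n):
        for j in range(i + 1, n):
            key = tuple(sorted((labels[i], labels[j])))
            grouped.setdefault(key, []).append(dist[i][j])
    return grouped


def summarize(values):
    """Mean and sample standard deviation (0.0 when only one value exists)."""
    sd = statistics.stdev(values) if len(values) > 1 else 0.0
    return statistics.mean(values), sd


# Hypothetical Bray-Curtis matrix for two control and two COPD samples.
dist = [
    [0.0, 0.2, 0.7, 0.8],
    [0.2, 0.0, 0.6, 0.9],
    [0.7, 0.6, 0.0, 0.5],
    [0.8, 0.9, 0.5, 0.0],
]
labels = ["control", "control", "COPD", "COPD"]
for key, values in grouped_pairwise(dist, labels).items():
    print(key, summarize(values))
```

Keys with two identical labels hold the intra-group distances, and the mixed key holds the inter-group distances.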

Figure 4
Box plots (A) and (B) compare Shannon and Simpson indices between control and COPD groups, indicating differences in microbial diversity. A heatmap (C) shows beta-diversities for control vs. control, COPD vs. COPD, and control vs. COPD, with a color gradient indicating the level of similarity.

Figure 4. Alpha and beta diversity of samples in the COPD and control groups. (A) Box plot of Shannon alpha diversity, showing a statistically significant difference between the COPD and control groups. (B) Box plot of Simpson alpha diversity, showing a statistically significant difference between the COPD and control groups. (C) Grouped pairwise beta diversities (Bray-Curtis) in intra-group comparisons (control vs. control, COPD vs. COPD) and inter-group comparison (COPD vs. control).

Functional profiling identified a total of 430 microbial pathways (metabolic pathways or biochemical reaction networks that are found in microbes, e.g., bacteria, archaea, fungi) across all the samples, where 382 pathways are shared between the COPD and control groups. The control group contains a higher number of pathways than the COPD group. The abundance of each pathway was determined by summing the abundances of its constituent reactions, inferred from gene family abundances mapped to enzymatic functions, and adjusted for pathway completeness and sequencing depth. By analyzing the eight most abundant pathways per sample (Figure 5), we found that the control group exhibits a greater number of distinctive abundant pathways in total than the COPD group (control: 38; COPD: 32). Differential abundance analysis suggests that 21 pathways are significantly enriched in the control group, and 55 pathways are significantly enriched in the COPD group (Supplementary Table 4).
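The count of distinct abundant pathways per group can be read as a simple procedure: take the top-ranked pathways in each sample, then count the union across the samples of a group. The sketch below shows that reading under stated assumptions; the function names, the tie-breaking by pathway name, the placeholder pathway identifiers, and the use of the top two rather than the top eight pathways in the toy data are all hypothetical.

```python
def top_pathways(sample, k=8):
    """Names of the k most abundant pathways in one sample (name -> abundance)."""
    ranked = sorted(sample.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, abundance in ranked[:k] if abundance > 0]


def distinct_top_pathways(samples, k=8):
    """Union of the per-sample top-k pathways across a group of samples."""
    distinct = set()
    for sample in samples:
        distinct.update(top_pathways(sample, k))
    return distinct


# Hypothetical pathway abundances; k=2 keeps the toy example short.
control_samples = [
    {"PWY-A": 5.0, "PWY-B": 3.0, "PWY-C": 1.0},
    {"PWY-A": 2.0, "PWY-C": 4.0, "PWY-D": 3.0},
]
copd_samples = [
    {"PWY-A": 6.0, "PWY-B": 2.0, "PWY-C": 1.0},
    {"PWY-A": 5.0, "PWY-B": 4.0, "PWY-D": 1.0},
]
print(len(distinct_top_pathways(control_samples, k=2)))
print(len(distinct_top_pathways(copd_samples, k=2)))
```

A group whose samples share the same few dominant pathways yields a small union, while a group with more heterogeneous functional profiles yields a larger one.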

Figure 5
Dot plot comparing the top eight abundant pathways between COPD and control groups. Each pathway is represented by colored dots, with abundances shown vertically. A legend on the right identifies the pathways by name and color.

Figure 5. Comparison of detected pathways between the COPD and control group. The top 8 abundant pathways in each sample of the COPD and control groups. Circle size is proportional to the relative abundance.

3.3 Inflammation-related pathway enrichment analysis

Previous studies have demonstrated that isoprenoids, in particular farnesyl pyrophosphate (FPP), geranylgeranyl diphosphate (GGPP), and farnesol, play a key role in the inflammatory response (Marcuzzi et al., 2008; Santoro et al., 2018) (Supplementary Figure 3). In our comparison, the COPD samples are enriched in three pathways supporting isoprenoid production: isoprene biosynthesis I (via MEP) (BioCyc Id: PWY-6270), superpathway of geranylgeranyl diphosphate biosynthesis II (via MEP) (BioCyc Id: PWY-5121), and all-trans-farnesol biosynthesis (BioCyc Id: PWY-6859). By examining the stratified contributors to each pathway, we could not identify a single major contributing species; rather, we observe a community effort from various bacteria across different samples, possibly caused by the infection stimulating the joint proliferation of bacteria harboring these pathways (Figure 6).
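Checking whether one species dominates a pathway comes down to summing the stratified rows of that pathway per contributor and comparing the resulting shares. The sketch below assumes HUMAnN-style feature names of the form "PATHWAY-ID: name|contributor", with rows lacking "|" being the unstratified totals. The function name, the placeholder contributors, and the abundances are hypothetical.

```python
from collections import defaultdict


def species_shares(rows, pathway_id):
    """Fraction of a pathway's stratified abundance attributed to each contributor."""
    totals = defaultdict(float)
    for feature, abundance in rows:
        if "|" not in feature:
            continue  # unstratified community total, not a per-species row
        pathway, contributor = feature.split("|", 1)
        if pathway.split(":", 1)[0].strip() == pathway_id:
            totals[contributor] += abundance
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {name: value / grand_total for name, value in totals.items()}


# Hypothetical stratified rows for one sample.
rows = [
    ("PWY-6270: isoprene biosynthesis I (via MEP)", 10.0),
    ("PWY-6270: isoprene biosynthesis I (via MEP)|g__GenusA.s__Species_a", 4.0),
    ("PWY-6270: isoprene biosynthesis I (via MEP)|g__GenusB.s__Species_b", 3.0),
    ("PWY-6270: isoprene biosynthesis I (via MEP)|unclassified", 3.0),
    ("PWY-6859: all-trans-farnesol biosynthesis|g__GenusA.s__Species_a", 5.0),
]
shares = species_shares(rows, "PWY-6270")
print(max(shares.items(), key=lambda item: item[1]))
```

When the largest share stays well below one across samples, as in this toy case, no single contributor accounts for the pathway.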

Figure 6
Three stacked bar charts labeled (A), (B), and (C) show biosynthesis pathways across 58 samples. Each chart presents abundance on the y-axis, illustrating contributions from various bacteria. The pathways are isoprene biosynthesis I, superpathway of geranylgeranyl diphosphate biosynthesis II, and all-trans-farnesol biosynthesis. Samples are categorized as COPD or control, indicated by blue and orange bars at the bottom. Each chart includes a key for stratified contributions from different bacterial species.

Figure 6. Three inflammation-related pathways (A) isoprene biosynthesis I (via MEP) (BioCyc Id: PWY-6270), (B) superpathway of geranylgeranyl diphosphate biosynthesis II (via MEP) (BioCyc Id: PWY-5121), and (C) all-trans-farnesol biosynthesis (BioCyc Id: PWY-6859) are enriched in the COPD samples.

3.4 BAL samples show significant difference with throat samples

We also performed a comparison between the BAL samples and throat samples from 11 participants to evaluate whether pharyngeal samples can replace BAL samples for metagenomic and metabolomic analysis, since BAL samples depend on an invasive procedure of bronchoscopy. However, the throat samples contain more species (throat: 392; BAL: 82; shared: 62) and genera (throat: 166; BAL: 59; shared: 44) than the BAL samples. Differential abundance analysis also shows significantly higher alpha diversities (Shannon, Simpson, and richness) from the throat samples (Figure 7), which makes it difficult to replace one with the other.

Figure 7
Box plots labeled A, B, and C show diversity metrics for two groups: BAL and Throat. A displays Shannon index, B Simpson index, and C Richness. Throat group generally shows higher diversity. Data points, statistical values, and coefficients are annotated for each plot.

Figure 7. Comparison of microbiome diversity between BAL and throat samples. Box plots of alpha diversity in the (A) Shannon, (B) Simpson, and (C) richness metrics show statistically significant differences between the throat and BAL groups.

4 Discussion

In this study, we compared upper respiratory tract microbiomes of COPD patients and healthy individuals. We conclude, first, that the control group exhibits greater taxonomic and functional diversity compared to the COPD group; second, that in the COPD group, three pathways involved in isoprenoid production are enriched, which supports the notion of the inflammatory response in COPD; and third, that bronchoalveolar lavage (BAL) samples differ significantly from throat samples.

COPD is a complex disease whose mechanisms are not yet fully understood. It involves interactions among bacteria within the human lung microbiome environment. To understand the disease mechanisms, it is essential to understand the role of the microbiome and its functional capabilities. Thanks to recent developments in sequencing technology and metagenomic methods, we are now in a position to gain a better understanding of this role. In recent years, several studies have leveraged metagenomic approaches to explore the microbial and functional landscape of COPD. High-throughput sequencing has been used to analyze the lung microbiomes of COPD patients, identifying significant alterations in microbial diversity and functional genes related to inflammation and immune response (Cameron et al., 2016). Another study focused on the microbiome diversity in the bronchial tracts of COPD patients using high-throughput sequencing, revealing that COPD patients have a significantly different microbial composition compared to healthy individuals (Cabrera-Rubio et al., 2012). Furthermore, a comprehensive study analyzed sputum samples from COPD patients and controls and identified biomarkers that are significantly elevated in COPD patients. These biomarkers are associated with disease severity and can predict future exacerbations, implicating pathways such as mucus hydration, adenosine metabolism, and oxidative stress as potential therapeutic targets (Esther et al., 2022). All these findings agree well with the results of the study presented here. Despite these advancements, limitations persist. Although metagenomic studies comprehensively profile microbial organisms, genes, and pathways, they do not always clarify which microbial species are actively contributing to disease pathology. Functional metagenomics is still in its infancy, and interpreting the vast amount of data generated remains a significant challenge. Future research should focus on integrating multi-omics approaches and longitudinal studies to better understand the dynamic interactions between the lung microbiome and COPD pathogenesis.

This study contributes to progress in the field in several respects. First, the analysis of the taxonomic and functional profiles of throat samples from the COPD and control groups contributes to the understanding of species diversity and its change in the disease. Second, a systematic comparison of the COPD and control groups identifies the taxa and pathways that differ significantly between them. Third, the characterization of the pathways involved in the inflammation process and of other inflammation-related pathways demonstrates that they are enriched in the COPD samples. The taxa detected in COPD samples in our study align closely with those reported in previous research (Cameron et al., 2016; Cabrera-Rubio et al., 2012; Wang et al., 2021). Furthermore, our findings on higher alpha diversity in the control group than in the COPD group are consistent with a previous study (Diao et al., 2017). Additionally, we identified a higher beta diversity in the COPD samples, which, together with our observations on alpha diversity, indicates that the microbiome in COPD patients is narrower and destabilized. This finding aligns well with prior research and further strengthens our comprehension of the microbiome community within the intricate landscape of COPD (Sin, 2023). A key innovation of our study lies in its comprehensive functional profiling of samples, particularly the comparison of inflammation-related pathways between COPD and control groups. This contrasts with previous studies that predominantly focused on other pathways, such as bacterial growth, or on the mechanisms of the inflammation-related pathway itself (Cameron et al., 2016; Yamada and Ichinose, 2018). Earlier studies have shown that isoprenoids such as farnesyl pyrophosphate (FPP), geranylgeranyl diphosphate (GGPP), and farnesol play central roles in regulating inflammation and immune signaling (Marcuzzi et al., 2008; Santoro et al., 2018). The enrichment of three isoprenoid-related microbial pathways in COPD-associated microbiomes (isoprene biosynthesis I (via MEP) (BioCyc Id: PWY-6270), superpathway of geranylgeranyl diphosphate biosynthesis II (via MEP) (BioCyc Id: PWY-5121), and all-trans-farnesol biosynthesis (BioCyc Id: PWY-6859)) identified by this study suggests a clinically relevant metabolic link between microbial activity and chronic airway inflammation. Specifically, the microbial pathway of isoprene biosynthesis leads to the formation of precursors for nonsterol isoprenoids such as farnesyl and geranylgeranyl derivatives that play essential roles in immune regulation and inflammation control (Houten et al., 2003). Further, farnesol can downregulate the expression of inflammatory mediators and act as a virulence factor by inducing anti-inflammatory responses and suppressing pro-inflammatory cytokines, thereby increasing host susceptibility to infection (Jung et al., 2018). Together, these findings point to microbial isoprenoid metabolism as a clinically relevant contributor to airway inflammation in COPD and a potential target for therapeutic modulation.

We also compared BAL samples with pharyngeal swabs to evaluate whether the two sample types correlate, and found significantly lower microbiome diversity in the BAL samples than in the pharyngeal swabs. This result aligns with the well-established ecological split between the upper and lower airways. Oropharyngeal communities are consistently more diverse and cluster separately from lung communities, reflecting the upper airway’s higher biomass and frequent immigration from the oral cavity. In contrast, the lower airways are a low-biomass environment shaped by stronger niche filtering and host defenses. A previous study reported the same pattern, with greater diversity in oropharyngeal/throat swabs than in BAL and clear community separation (Kirst et al., 2019). Clinically, reduced α-diversity in the lower airways is often interpreted as a shift toward dysbiosis or domination by a few taxa, which may compromise ecological resilience and render the lung susceptible to pathogen overgrowth or inflammation (Dickson et al., 2020). In chronic airway diseases such as COPD, lower-airway microbiome alterations and loss of diversity have been associated with more frequent exacerbations and adverse clinical trajectories (Li et al., 2024). The diminished diversity in BAL relative to throat samples underscores the possibility that changes in the lower-airway microbiota may reflect disease processes or prognostic risk more closely than surrogate upper-airway samples do (Mikhail and O’Dwyer, 2025).
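To make the diversity terms used in this discussion concrete, the following is a minimal sketch of how alpha diversity (within one sample) and beta diversity (between samples) can be computed from a taxon count table. It is an illustration only and not the pipeline used in this study: the Shannon and Gini-Simpson indices follow their standard definitions, Bray-Curtis is assumed here as the dissimilarity measure, and the function names and read counts are hypothetical.

```python
import math
from itertools import combinations


def relative_abundance(counts):
    """Convert non-negative taxon counts to proportions that sum to 1."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("sample contains no reads")
    return [c / total for c in counts]


def shannon_index(counts):
    """Shannon index H = -sum(p * ln p); absent taxa (p = 0) are skipped."""
    return -sum(p * math.log(p) for p in relative_abundance(counts) if p > 0)


def simpson_index(counts):
    """Gini-Simpson index 1 - sum(p^2); higher values mean a more even sample."""
    return 1.0 - sum(p * p for p in relative_abundance(counts))


def bray_curtis(counts_a, counts_b):
    """Bray-Curtis dissimilarity on relative abundances (assumed metric).

    For two profiles that each sum to 1 this equals 1 - sum(min(p_a, p_b)).
    """
    if len(counts_a) != len(counts_b):
        raise ValueError("samples must be defined over the same taxa")
    pa = relative_abundance(counts_a)
    pb = relative_abundance(counts_b)
    return 1.0 - sum(min(x, y) for x, y in zip(pa, pb))


def mean_pairwise_dissimilarity(samples):
    """Average Bray-Curtis dissimilarity over all sample pairs in a group."""
    pairs = list(combinations(samples, 2))
    if not pairs:
        raise ValueError("need at least two samples")
    return sum(bray_curtis(a, b) for a, b in pairs) / len(pairs)


# Hypothetical read counts for four taxa (not data from this study).
even_sample = [40, 30, 20, 10]      # several taxa at comparable abundance
dominated_sample = [90, 5, 5, 0]    # one taxon dominates

print("Shannon:", shannon_index(even_sample), shannon_index(dominated_sample))
print("Simpson:", simpson_index(even_sample), simpson_index(dominated_sample))
print("Bray-Curtis:", bray_curtis(even_sample, dominated_sample))
```

In this toy example the sample dominated by a single taxon has lower Shannon and Gini-Simpson values than the even sample, which is the pattern described as reduced alpha diversity. A larger mean pairwise dissimilarity within a group corresponds to higher beta diversity, that is, samples that differ more from one another.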

In conclusion, this study provides important contributions to our understanding of the COPD-associated microbiome and its functional capabilities. The insights gained could trigger future efforts to identify microbiome-based biomarkers or therapeutic targets, ultimately aiding in the development of more personalized and effective treatment strategies for COPD.

Data availability statement

The metagenomic sequencing data for the throat samples analyzed in this study, after removal of ambient human DNA, were retrieved from the Sequence Read Archive under accession code PRJNA1057503 (https://www.ncbi.nlm.nih.gov/bioproject/PRJNA1057503). The metagenomic sequencing data for the BAL samples analyzed in this study, after removal of ambient human DNA, have been deposited in the Sequence Read Archive under accession code PRJNA1327646 (https://www.ncbi.nlm.nih.gov/bioproject/PRJNA1327646). Details of the sample sources are provided in Supplementary Table 5.

Ethics statement

The studies involving humans were approved by the Ethics Committee of the Landesärztekammer des Saarlandes. The studies were conducted in accordance with the local legislation and institutional requirements. The participants provided their written informed consent to participate in this study.

Author contributions

GC: Visualization, Writing – original draft, Investigation, Software, Data curation, Resources, Project administration, Conceptualization, Formal Analysis, Methodology, Writing – review & editing, Validation. CW: Writing – review & editing. AW: Writing – review & editing. CH: Writing – review & editing. RM: Writing – review & editing. RB: Resources, Funding acquisition, Investigation, Writing – review & editing, Project administration, Methodology, Supervision, Validation, Conceptualization. OK: Validation, Resources, Conceptualization, Funding acquisition, Project administration, Supervision, Methodology, Writing – review & editing, Investigation.

Funding

The author(s) declare that financial support was received for the research and/or publication of this article. GC was funded by the Leibniz Science Campus “Living Therapeutic Materials.” OK acknowledges funding from the Klaus Faber Foundation.

Acknowledgments

We would like to thank Georges P. Schmartz (Clinical Bioinformatics, Saarland University) for the management of IMAGINE data. We would also like to extend our gratitude to Andreas Keller (Helmholtz Institute for Pharmaceutical Research Saarland (HIPS) and Clinical Bioinformatics, Saarland University) and Aránzazu del Campo (INM-Leibniz Institute for New Materials and Chemistry Department, Saarland University) for their guidance.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The authors declare that no Gen AI was used in the creation of this manuscript.

Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fmicb.2025.1636322/full#supplementary-material

References

Aguiar-Pulido, V., Huang, W., Suarez-Ulloa, V., Cickovski, T., Mathee, K., and Narasimhan, G. (2016). Metagenomics, metatranscriptomics, and metabolomics approaches for microbiome analysis. Evol. Bioinformatics Online 12, 5–16. doi: 10.4137/EBO.S36436

Beghini, F., McIver, L. J., Blanco-Míguez, A., Dubois, L., Asnicar, F., Maharjan, S., et al. (2021). Integrating taxonomic, functional, and strain-level profiling of diverse microbial communities with bioBakery 3. eLife 10:e65088. doi: 10.7554/eLife.65088

Blanco-Míguez, A., Beghini, F., Cumbo, F., McIver, L. J., Thompson, K. N., Zolfo, M., et al. (2023). Extending and improving metagenomic taxonomic profiling with uncharacterized species using MetaPhlAn 4. Nat. Biotechnol. 41, 1633–1644. doi: 10.1038/s41587-023-01688-w

Bowerman, K. L., Rehman, S. F., Vaughan, A., Lachner, N., Budden, K. F., Kim, R. Y., et al. (2020). Disease-associated gut microbiome and metabolome changes in patients with chronic obstructive pulmonary disease. Nat. Commun. 11:5886. doi: 10.1038/s41467-020-19701-0

Cabrera-Rubio, R., Garcia-Núñez, M., Setó, L., Antó, J. M., Moya, A., Monsó, E., et al. (2012). Microbiome diversity in the bronchial tracts of patients with chronic obstructive pulmonary disease. J. Clin. Microbiol. 50, 3562–3568. doi: 10.1128/JCM.00767-12

Cameron, S. J. S., Lewis, K. E., Huws, S. A., Lin, W., Hegarty, M. J., Lewis, P. D., et al. (2016). Metagenomic sequencing of the chronic obstructive pulmonary disease upper bronchial tract microbiome reveals functional changes associated with disease severity. PLoS One 11:e0149095. doi: 10.1371/journal.pone.0149095

Caspi, R., Billington, R., Keseler, I. M., Kothari, A., Krummenacker, M., Midford, P. E., et al. (2020). The MetaCyc database of metabolic pathways and enzymes - a 2019 update. Nucleic Acids Res. 48, D445–D453. doi: 10.1093/nar/gkz862

Chen, S. (2023). Ultrafast one-pass FASTQ data preprocessing, quality control, and deduplication using fastp. iMeta 2:e107. doi: 10.1002/imt2.107

Cleveland Clinic. (2022) Available online at: https://my.clevelandclinic.org/health/diseases/8709-chronic-obstructive-pulmonary-disease-copd (Accessed June 1, 2024).

De Matteis, S., Jarvis, D., Darnton, A., Hutchings, S., Sadhra, S., Fishwick, D., et al. (2019). The occupations at increased risk of COPD: analysis of lifetime job-histories in the population-based UK biobank cohort. Eur. Respir. J. 54:1900186. doi: 10.1183/13993003.00186-2019

Diao, W., Shen, N., Du, Y., Qian, K., and He, B. (2017). Characterization of throat microbial flora in smokers with or without COPD. Int. J. Chron. Obstruct. Pulmon. Dis. 12, 1933–1946. doi: 10.2147/COPD.S140243

Dickson, R. P., Schultz, M. J., van der Poll, T., Schouten, L. R., Falkowski, N. R., Luth, J. E., et al. (2020). Lung microbiota predict clinical outcomes in critically ill patients. Am. J. Respir. Crit. Care Med. 201, 555–563. doi: 10.1164/rccm.201907-1487OC

Dora, D., Revisnyei, P., Mihucz, A., Kiraly, P., Szklenarik, G., Dulka, E., et al. (2024). Metabolic pathways from the gut metatranscriptome are associated with COPD and respiratory function in lung cancer patients. Front. Cell. Infect. Microbiol. 14:1381170. doi: 10.3389/fcimb.2024.1381170

Esther, C. R., O’Neal, W. K., Anderson, W. H., Kesimer, M., Ceppe, A., Doerschuk, C. M., et al. (2022). Identification of sputum biomarkers predictive of pulmonary exacerbations in COPD. Chest 161, 1239–1249. doi: 10.1016/j.chest.2021.10.049

Ewels, P., Magnusson, M., Lundin, S., and Käller, M. (2016). MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics 32, 3047–3048. doi: 10.1093/bioinformatics/btw354

Hou, K., Wu, Z.-X., Chen, X.-Y., Wang, J.-Q., Zhang, D., Xiao, C., et al. (2022). Microbiota in health and diseases. Signal Transduct. Target. Ther. 7:135. doi: 10.1038/s41392-022-00974-4

Houten, S. M., Frenkel, J., and Waterham, H. R. (2003). Isoprenoid biosynthesis in hereditary periodic fever syndromes and inflammation. Cell. Mol. Life Sci. 60, 1118–1134. doi: 10.1007/s00018-003-2296-4

Hulsen, T. (2022). DeepVenn: a web application for the creation of area-proportional Venn diagrams using the deep learning framework TensorFlow.js. arXiv.

Jung, Y. Y., Hwang, S. T., Sethi, G., Fan, L., Arfuso, F., and Ahn, K. S. (2018). Potential anti-inflammatory and anti-cancer properties of farnesol. Molecules 23:2827. doi: 10.3390/molecules23112827

Kayongo, A., Robertson, N. M., Siddharthan, T., Ntayi, M. L., Ndawula, J. C., Sande, O. J., et al. (2022). Airway microbiome-immune crosstalk in chronic obstructive pulmonary disease. Front. Immunol. 13:1085551. doi: 10.3389/fimmu.2022.1085551

Kirst, M. E., Baker, D., Li, E., Abu-Hasan, M., and Wang, G. P. (2019). Upper versus lower airway microbiome and metagenome in children with cystic fibrosis and their correlation with lung inflammation. PLoS One 14:e0222323. doi: 10.1371/journal.pone.0222323

Kleine Bardenhorst, S., Berger, T., Klawonn, F., Vital, M., Karch, A., and Rübsamen, N. (2021). Data analysis strategies for microbiome studies in human populations-a systematic review of current practice. mSystems 6:e01154-20. doi: 10.1128/mSystems.01154-20

Li, R., Li, J., and Zhou, X. (2024). Lung microbiome: new insights into the pathogenesis of respiratory diseases. Signal Transduct. Target. Ther. 9:19. doi: 10.1038/s41392-023-01722-y

Mallick, H., Rahnavard, A., McIver, L. J., Ma, S., Zhang, Y., Nguyen, L. H., et al. (2021). Multivariable association discovery in population-scale meta-omics studies. PLoS Comput. Biol. 17:e1009442. doi: 10.1371/journal.pcbi.1009442

Marcuzzi, A., Pontillo, A., De Leo, L., Tommasini, A., Decorti, G., Not, T., et al. (2008). Natural isoprenoids are able to reduce inflammation in a mouse model of mevalonate kinase deficiency. Pediatr. Res. 64, 177–182. doi: 10.1203/PDR.0b013e3181761870

Mayo Clinic. (2020) Available online at: https://www.mayoclinic.org/diseases-conditions/copd/symptoms-causes/syc-20353679 (Accessed June 1, 2024).

Mikhail, S. G., and O’Dwyer, D. N. (2025). The lung microbiome in interstitial lung disease. Breathe (Sheff.) 21:240167.

Millares, L., Ferrari, R., Gallego, M., Garcia-Nuñez, M., Pérez-Brocal, V., Espasa, M., et al. (2014). Bronchial microbiome of severe COPD patients colonised by Pseudomonas aeruginosa. Eur. J. Clin. Microbiol. Infect. Dis. 33, 1101–1111. doi: 10.1007/s10096-013-2044-0

Natalini, J. G., Singh, S., and Segal, L. N. (2023). The dynamic lung microbiome in health and disease. Nat. Rev. Microbiol. 21, 222–235. doi: 10.1038/s41579-022-00821-x

O’Donnell, R., Breen, D., Wilson, S., and Djukanovic, R. (2006). Inflammatory cells in the airways in COPD. Thorax 61, 448–454. doi: 10.1136/thx.2004.024463

Pathak, U., Gupta, N. C., and Suri, J. C. (2020). Risk of COPD due to indoor air pollution from biomass cooking fuel: a systematic review and meta-analysis. Int. J. Environ. Health Res. 30, 75–88. doi: 10.1080/09603123.2019.1575951

Pérez-Cobas, A. E., Gomez-Valero, L., and Buchrieser, C. (2020). Metagenomic approaches in microbial ecology: an update on whole-genome and marker gene sequencing analyses. Microb Genom. 6, 1–22. doi: 10.1099/mgen.0.000409

Po, J. Y. T., FitzGerald, J. M., and Carlsten, C. (2011). Respiratory disease associated with solid biomass fuel exposure in rural women and children: systematic review and meta-analysis. Thorax 66, 232–239. doi: 10.1136/thx.2010.147884

Ramirez, R., van Buuren, N., Gamelin, L., Soulette, C., May, L., Han, D., et al. (2021). Targeted long-read sequencing reveals comprehensive architecture, burden, and transcriptional signatures from hepatitis B virus-associated integrations and translocations in hepatocellular carcinoma cell lines. J. Virol. 95:e0029921. doi: 10.1128/JVI.00299-21

Ramsheh, M. Y., Haldar, K., Esteve-Codina, A., Purser, L. F., Richardson, M., Müller-Quernheim, J., et al. (2021). Lung microbiome composition and bronchial epithelial gene expression in patients with COPD versus healthy individuals: a bacterial 16S rRNA gene sequencing and host transcriptomic analysis. Lancet Microbe. 2, e300–e310. doi: 10.1016/S2666-5247(21)00035-5

Santoro, A., Ciaglia, E., Nicolin, V., Pescatore, A., Prota, L., Capunzo, M., et al. (2018). The isoprenoid end product N6-isopentenyladenosine reduces inflammatory response through the inhibition of the NFκB and STAT3 pathways in cystic fibrosis cells. Inflamm. Res. 67, 315–326. doi: 10.1007/s00011-017-1123-6

Schmartz, G. P., Rehner, J., Gund, M. P., Keller, V., Molano, L.-A. G., Rupf, S., et al. (2024). Decoding the diagnostic and therapeutic potential of microbiota using pan-body pan-disease microbiomics. Nat. Commun. 15:8261. doi: 10.1038/s41467-024-52598-7

SegataLab. (2020). hclust2: a handy tool for plotting heat-maps with several useful options to produce high quality figures that can be used in publication.

Shannon, C. E. (1948). A mathematical theory of communication. Bell Syst. Tech. J. 27, 379–423. doi: 10.1002/j.1538-7305.1948.tb01338.x

Simpson, E. H. (1949). Measurement of diversity. Nature 163, 688–688. doi: 10.1038/163688a0

Sin, D. D. (2023). Chronic obstructive pulmonary disease and the airway microbiome: what respirologists need to know. Tuberc Respir Dis (Seoul). 86, 166–175. doi: 10.4046/trd.2023.0015

Suzek, B. E., Wang, Y., Huang, H., McGarvey, P. B., Wu, C. H., and UniProt Consortium (2015). UniRef clusters: a comprehensive and scalable alternative for improving sequence similarity searches. Bioinformatics 31, 926–932. doi: 10.1093/bioinformatics/btu739

Tangedal, S., Nielsen, R., Aanerud, M., Drengenes, C., Husebø, G. R., Lehmann, S., et al. (2024). Lower airway microbiota in COPD and healthy controls. Thorax 79, 219–226. doi: 10.1136/thorax-2023-220455

Wang, Z., Zhang, R., Yang, Q., Zhang, J., Zhao, Y., Zheng, Y., et al. (2021). Recent advances in the biosynthesis of isoprenoids in engineered Saccharomyces cerevisiae. Adv. Appl. Microbiol. 114, 1–35. doi: 10.1016/bs.aambs.2020.11.001

World Health Organization. (2023). Available online at: https://www.who.int/news-room/fact-sheets/detail/chronic-obstructive-pulmonary-disease-(copd) (Accessed June 1, 2024).

Xia, Y., and Sun, J. (2017). Hypothesis testing and statistical analysis of microbiome. Genes Dis. 4, 138–148. doi: 10.1016/j.gendis.2017.06.001

Yamada, M., and Ichinose, M. (2018). The cholinergic pathways in inflammation: a potential pharmacotherapeutic target for COPD. Front. Pharmacol. 9:1426. doi: 10.3389/fphar.2018.01426

Yang, L., and Chen, J. (2022). A comprehensive evaluation of microbial differential abundance analysis methods: current status and potential solutions. Microbiome. 10:130. doi: 10.1186/s40168-022-01320-0

Zakharkina, T., Heinzel, E., Koczulla, R. A., Greulich, T., Rentz, K., Pauling, J. K., et al. (2013). Analysis of the airway microbiota of healthy individuals and patients with chronic obstructive pulmonary disease by T-RFLP and clone sequencing. PLoS One 8:e68302. doi: 10.1371/journal.pone.0068302

Keywords: metagenomics, chronic obstructive pulmonary disease, microbiome, Taxonomic profiling, microbial pathways

Citation: Chen G, Wiegand C, Willett A, Herr C, Müller R, Bals R and Kalinina OV (2025) Comparative metagenomic analysis on COPD and health control samples reveals taxonomic and functional motifs. Front. Microbiol. 16:1636322. doi: 10.3389/fmicb.2025.1636322

Received: 27 May 2025; Revised: 07 November 2025; Accepted: 12 November 2025;
Published: 26 November 2025.

Edited by:

Federica Pulvirenti, Accademic Hospital Policlinico Umberto, Italy

Reviewed by:

David Dora, Semmelweis University, Hungary
Kazuhiro Itoh, National Hospital Organization Awara Hospital, Japan
Leon Deutsch, The NU B.V., Netherlands

Copyright © 2025 Chen, Wiegand, Willett, Herr, Müller, Bals and Kalinina. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Robert Bals, Robert.Bals@uks.eu; Olga V. Kalinina, olga.kalinina@helmholtz-hips.de

These authors have contributed equally to this work

ORCID: Guangyi Chen, orcid.org/0000-0001-6614-5315
