
ORIGINAL RESEARCH article

Front. Microbiol., 24 November 2025

Sec. Microorganisms in Vertebrate Digestive Systems

Volume 16 - 2025 | https://doi.org/10.3389/fmicb.2025.1705965

This article is part of the Research Topic: New and advanced mechanistic insights into the influences of the infant gut microbiota on human health and disease, Volume II

Disruption of the gut bile acid-microbiota axis precedes severe bronchopulmonary dysplasia in preterm infants


Haiyue Yu1, Yongjing Guo2, Jialu Li1, Rong Fu1, Yunfeng Zhang3*, Wanxu Guo2*
  • 1Department of Pediatrics, The Second Hospital of Jilin University, Changchun, China
  • 2Department of Neonatology, The Second Hospital of Jilin University, Changchun, China
  • 3Children's Disease Diagnosis and Treatment Center, The Second Hospital of Jilin University, Changchun, China

Background: Bronchopulmonary dysplasia (BPD) remains a major cause of morbidity in preterm infants, yet current diagnostic criteria are delayed and underlying mechanisms are incompletely defined. Evidence suggests that intestinal dysbiosis may influence pulmonary outcomes via the gut–lung axis, but the metabolic mediators of this interaction remain unclear.

Methods: We conducted a prospective cohort study of 50 preterm infants (≤ 32 weeks gestation), stratified by BPD severity at 36 weeks. Stool samples collected on postnatal day 7 underwent 16S rRNA sequencing and targeted bile acid metabolomics. Differential features were identified via multivariate statistics and LEfSe. Spearman correlation analysis explored bile acid–microbiota interactions. An interpretable machine learning model (XGBoost) incorporating bile acid and microbial features was developed and validated using five-fold cross-validation and an independent test set.

Results: Infants with severe BPD showed significantly reduced levels of 16 bile acids—including primary, secondary, and sulfated species—compared to non-BPD controls. Gut microbiome β-diversity differed significantly among groups, with enrichment of opportunistic Proteobacteria (e.g., Brevundimonas) in severe BPD. Negative correlations were observed between depleted bile acids and enriched bacterial genera. The XGBoost model predicted BPD severity with 80% accuracy (AUC = 0.91), leveraging key features such as chenodeoxycholic acid (CDCA), hyocholic acid (HCA), and Brevundimonas.

Conclusions: Preterm infants who develop severe BPD exhibit early disruption of the bile acid–microbiota axis, characterized by reduced bile acid levels and enrichment of opportunistic taxa. Integrating these features within interpretable machine-learning models enables accurate early risk stratification and provides mechanistic insights beyond traditional inflammation-based frameworks. Validation in larger, multicenter cohorts is warranted to refine biomarker panels and explore targeted interventions that modulate bile acid signaling or microbial ecology to prevent or attenuate BPD.

1 Introduction

Bronchopulmonary dysplasia (BPD) is the most common chronic respiratory complication among extremely preterm infants, characterized by impaired alveolarization (alveolar simplification) and abnormal pulmonary microvascular development. Although advances in perinatal care have improved survival, the incidence of BPD has not declined and it remains a major contributor to long-term respiratory morbidity and reduced quality of life (Thekkeveedu et al., 2017; Zhu, 2024). Current diagnostic criteria rely on oxygen requirement at 28 days of life or at 36 weeks' postmenstrual age (4), an inherently delayed assessment. The pathogenesis of BPD also remains incompletely defined: classic models emphasize antenatal inflammation, ventilator-induced injury, and oxygen toxicity leading to dysregulated injury-repair cycles (Thekkeveedu et al., 2017; Dankhara et al., 2023), but these factors alone do not capture the full disease spectrum. This combination of delayed diagnosis and incomplete mechanistic insight hampers timely intervention.

The gut-lung axis has emerged as a systemic framework linking intestinal homeostasis to pulmonary immunity and disease. Bidirectional crosstalk through immune, metabolic, and neuroendocrine pathways suggests that gut microbial balance can shape distal pulmonary immune tone and inflammatory responses (Tirone et al., 2019). Preterm infants commonly exhibit intestinal dysbiosis (marked by delayed colonization, reduced microbial diversity, and overrepresentation of opportunistic taxa such as Proteobacteria), which has been associated with higher risks of necrotizing enterocolitis (NEC) (Pammi et al., 2017), late-onset sepsis (LOS) (Mai et al., 2013), and BPD (Chen, 2021). However, the molecular mediators and downstream signaling that connect intestinal dysbiosis to lung injury remain poorly defined.

Bile acids are increasingly recognized as signaling molecules at the host-microbiota interface, acting through receptors such as Farnesoid X Receptor (FXR) and Takeda G-protein-coupled Receptor 5 (TGR5) to influence systemic immunity and organ development (Fleishman and Kumar, 2024; Godlewska et al., 2022). Preclinical data indicate that bile acid dysregulation, whether through toxic pulmonary accumulation or through systemic deficiency that impairs anti-inflammatory signaling, may contribute to neonatal lung injury (Zecca et al., 2008; De Luca, 2022). Yet whether bile acid imbalance contributes to BPD in preterm infants, particularly through its interaction with intestinal dysbiosis, remains unclear.

Machine learning provides a powerful means to interrogate such complex host-microbiota-metabolite systems. By integrating multi-omics datasets, these methods can capture nonlinear relationships, identify key biomarkers, and enable accurate, interpretable prediction (Li et al., 2022; Licciardi et al., 2024). To date, most BPD prediction studies have focused on clinical variables such as gestational age and birth weight (Leigh et al., 2022; Yoneda et al., 2024; Hwang et al., 2023; Choi, 2025); integrative approaches combining bile acid metabolomics with gut microbiome profiling have not been reported.

Against this background, we conducted a prospective observational cohort study to examine the contribution of early-life (postnatal day 7) bile acid profiles and gut microbial composition to BPD and to explore their interaction. We hypothesized that infants who develop severe BPD exhibit early reductions in intestinal bile acids together with dysbiosis, and that these disturbances act in concert through bile acid-microbiota crosstalk to drive disease progression. To test this, we built interpretable machine-learning models using differential metabolite and microbial features to evaluate early prediction of BPD severity. This study is novel in two respects: (i) it integrates bile acid metabolomics and gut microbiome data in preterm infants to elucidate their cross-talk in BPD pathogenesis, and (ii) it develops a bile acid-microbiota-based, interpretable machine-learning framework for very early risk prediction. These advances offer a metabolic-microbial perspective on BPD mechanisms and lay the groundwork for precision risk stratification and targeted interventions.

2 Materials and methods

2.1 Study design and ethics

This single-center, prospective observational cohort study was conducted in the neonatal unit of the Second Hospital of Jilin University. The study protocol, including predefined inclusion criteria, sampling schedule, and analysis plan, was designed and approved by the institutional ethics committee in July 2024 (Approval No.: 2024-309) before patient recruitment began.

Enrollment of eligible preterm infants commenced in November 2024 and continued until February 2025. Clinical data and biological samples were collected in real time during hospitalization according to the study protocol, and all infants were followed until discharge. Written informed consent was obtained from parents or legal guardians prior to enrollment.

2.2 Study participants

2.2.1 Inclusion and exclusion criteria

Inclusion criteria: (i) preterm infants with gestational age (GA) ≤ 32 weeks, born at and admitted to the Second Hospital of Jilin University (GA determined by last menstrual period and confirmed by first-trimester ultrasound); (ii) no major congenital anomalies or genetic syndromes; and (iii) guardian consent. Exclusion criteria: (i) complex congenital heart disease; (ii) malformations of the central nervous, gastrointestinal, or respiratory systems; (iii) requirement for invasive respiratory support due to surgery or other non-BPD indications; (iv) other major anomalies (e.g., congenital diaphragmatic hernia, chromosomal abnormalities); (v) anticipated survival <28 days; or (vi) refusal to participate.

2.2.2 Group definitions

BPD status at 36 weeks' postmenstrual age was determined according to the 2018 NIH workshop consensus (Higgins et al., 2018). Infants were classified as Non-BPD (NonBPD7), non-severe BPD (BPD7m; including mild and moderate cases), or severe BPD (BPD7s) based on oxygen requirement and respiratory support.

2.3 Sample collection and processing

On postnatal day 7 (PND7), ~1 g of stool was collected with a sterile disposable spatula, transferred to sterile cryovials, and immediately stored at −80°C until 16S rRNA gene sequencing and targeted bile acid metabolomics.

2.4 16S rRNA gene sequencing and bioinformatics

2.4.1 DNA extraction

Total genomic DNA was extracted using the MagBeads FastDNA Kit for Soil (116564384, MP Biomedicals, CA, USA) according to the manufacturer's protocol and stored at −20°C until further analysis. DNA concentration and purity were assessed using a NanoDrop NC2000 spectrophotometer (Thermo Fisher Scientific, Waltham, MA, USA), and integrity was verified by agarose gel electrophoresis.

2.4.2 16S rRNA gene amplicon sequencing

The V3–V4 hypervariable region of the bacterial 16S rRNA gene was amplified using primers 338F (5′-ACTCCTACGGGAGGCAGCA-3′) and 806R (5′-GGACTACHVGGGTWTCTAAT-3′) with sample-specific 7-bp barcodes. Each 25 μl PCR reaction contained 5 μl of 5 × buffer, 0.25 μl FastPfu DNA polymerase (5 U/μl), 2 μl dNTPs (2.5 mM), 1 μl of each primer (10 μM), 1 μl DNA template, and 14.75 μl ddH2O. The PCR program consisted of an initial denaturation at 98°C for 5 min, followed by 26 cycles of 98°C for 30 s, 52°C for 30 s, and 72°C for 45 s, with a final extension at 72°C for 5 min. Amplicons were purified with VAHTS DNA Clean Beads (Vazyme, Nanjing, China), quantified with the Quant-iT PicoGreen dsDNA Assay Kit (Invitrogen, Carlsbad, CA, USA), pooled in equimolar amounts, and sequenced on the Illumina NovaSeq 6000 platform (2 × 250 bp paired-end, SP Reagent Kit, 500 cycles) at Shanghai Personal Biotechnology Co., Ltd (Shanghai, China).

2.5 Targeted bile acid metabolomics

2.5.1 Chemicals and reagents

HPLC-grade acetonitrile (ACN) and methanol (MeOH) were obtained from Merck (Darmstadt, Germany). Milli-Q water (Millipore, Bradford, USA) was used throughout. Bile acid standards were purchased from CNW (Shanghai, China) and IsoReag (Shanghai, China). Acetic acid and ammonium acetate were obtained from Sigma-Aldrich (St. Louis, MO, USA). Stock solutions of each standard were prepared at 1 mg/ml in MeOH and stored at −20°C, then diluted with MeOH to working solutions prior to analysis.

2.5.2 Sample preparation and extraction

Approximately 20 mg of sample was ground using a ball mill and extracted with 495 μl of MeOH containing 5 μl of internal standard mixture (10 μg/ml). Extracts were incubated at −20°C for 10 min to precipitate proteins, followed by centrifugation at 12,000 rpm for 10 min at 4°C. The supernatant was passed through a protein precipitation plate prior to LC–MS analysis.

2.5.3 HPLC conditions

Chromatographic separation was performed on an LC-ESI-MS/MS system (UHPLC, ExionLC™ AD; MS, SCIEX 6500 Triple Quadrupole). The column was a Waters ACQUITY UPLC HSS T3 C18 (100 mm × 2.1 mm, 1.8 μm). Mobile phases were (A) water with 0.01% acetic acid and 5 mmol/L ammonium acetate, and (B) acetonitrile with 0.01% acetic acid. A linear gradient was applied: 5–40% B (0–1 min), 40–50% B (1–6 min), 50–75% B (6–11 min), 75–95% B (11–13 min), held at 95% B (13–15 min), and re-equilibrated to 5% B (16–17.5 min). Flow rate: 0.35 ml/min; column temperature: 40°C; injection volume: 3 μl.
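
As an illustration only, the gradient program described above can be written as a piecewise-linear lookup of the mobile phase B fraction over time. This is a minimal sketch, not the instrument method file: the function name `percent_b` is hypothetical, and the 15–16 min segment, which the text does not describe, is assumed here to be a linear return from 95% to 5% B.

```python
from bisect import bisect_right

# Breakpoints (time in min, %B) transcribed from the gradient description.
# ASSUMPTION: the 15-16 min segment is not described in the text; a linear
# return from 95% to 5% B is used here purely for illustration.
GRADIENT = [
    (0.0, 5.0), (1.0, 40.0), (6.0, 50.0), (11.0, 75.0),
    (13.0, 95.0), (15.0, 95.0), (16.0, 5.0), (17.5, 5.0),
]

def percent_b(t_min: float) -> float:
    """Linearly interpolate %B at time t_min (minutes)."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time outside the 0-17.5 min program")
    times = [t for t, _ in GRADIENT]
    # Index of the breakpoint that closes the segment containing t_min.
    i = min(bisect_right(times, t_min), len(GRADIENT) - 1)
    (t0, b0), (t1, b1) = GRADIENT[i - 1], GRADIENT[i]
    return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
```

For example, `percent_b(3.5)` falls halfway through the 40–50% B segment; the corresponding fraction of mobile phase A is simply `100 - percent_b(t)`.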

2.5.4 ESI-MS/MS conditions

Mass spectrometry was performed on a QTRAP 6500+ LC–MS/MS system (SCIEX) equipped with an ESI Turbo Ion-Spray source operating in negative ion mode under Analyst 1.6.3 control. Source parameters: source temperature 550°C, ion spray voltage −4,500 V, curtain gas 35 psi. Bile acids were quantified using scheduled multiple reaction monitoring (MRM). MultiQuant 3.0.3 (SCIEX) was used for peak integration and quantification, with declustering potential (DP) and collision energy (CE) individually optimized for each bile acid. Specific MRM transitions were monitored according to expected retention time windows.

2.6 Statistical analysis

Comprehensive maternal and neonatal baseline data were collected, including sex, mode of delivery, maternal infection, obstetric medication (magnesium sulfate, antenatal corticosteroids, antibiotics), pregnancy complications, plurality, gestational age, birth weight, Apgar score, and early postnatal interventions (endotracheal intubation, surfactant administration, caffeine, vasoactive agents). Categorical variables were summarized as counts (percentages) and compared using the χ² test or Fisher's exact test when >20% of cells had an expected frequency <5. Continuous variables were tested for normality with the Shapiro–Wilk test. Normally distributed data were expressed as mean ± standard deviation (SD) and analyzed by one-way ANOVA, while skewed data were expressed as median (interquartile range, IQR) and compared by the Kruskal–Wallis H-test. All tests were two-sided with α = 0.05. Statistical analyses were performed in Python (v3.9) using the SciPy library.
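
The rule for choosing between the χ² test and Fisher's exact test can be made concrete with a short sketch. The code below only computes expected cell counts and applies the ">20% of cells with expected frequency <5" criterion; it does not run either test. The function names and the example counts are hypothetical and are not taken from Table 1.

```python
def expected_counts(table):
    """Expected cell counts under independence for an r x c table."""
    row_tot = [sum(r) for r in table]
    col_tot = [sum(c) for c in zip(*table)]
    n = sum(row_tot)
    return [[r * c / n for c in col_tot] for r in row_tot]

def choose_categorical_test(table):
    """Return which test the stated rule would select for this table."""
    cells = [e for row in expected_counts(table) for e in row]
    frac_small = sum(e < 5 for e in cells) / len(cells)
    return "fisher_exact" if frac_small > 0.20 else "chi_square"

# Hypothetical 3 x 2 tables (rows: three groups of 17, 11, and 22 infants).
balanced = [[9, 8], [6, 5], [12, 10]]   # no expected count below 5
sparse = [[16, 1], [11, 0], [20, 2]]    # half of the expected counts below 5
```

With these illustrative counts, `choose_categorical_test(balanced)` selects the χ² test and `choose_categorical_test(sparse)` selects Fisher's exact test.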

Microbiome bioinformatics were performed with QIIME 2 (2024.10) with slight modification according to the official tutorials (https://docs.qiime2.org/2024). Briefly, raw sequence data were demultiplexed using the demux plugin, followed by primer trimming with the cutadapt plugin (Martin, 2011). Sequences were quality-filtered, denoised, merged, and chimera-removed using the DADA2 plugin (Callahan et al., 2016). Non-singleton amplicon sequence variants (ASVs) were aligned with mafft (Katoh et al., 2002) and used to construct a phylogeny with fasttree2 (Price et al., 2009). For downstream analyses, α-diversity indices [Chao1 (Chao, 1984), Simpson (Simpson, 1949), Shannon (Shannon, 1948), Good's coverage (Good, 1953), Observed species, and Pielou's evenness (Pielou, 1966)] were calculated, and inter-group comparisons were performed using the Kruskal–Wallis test. β-diversity was assessed using Bray–Curtis (Bray and Curtis, 1957) and UniFrac distances (Lozupone and Knight, 2005), visualized by principal coordinate analysis (PCoA). Differences in microbial community structure between groups were tested using PERMANOVA (Adonis) with 999 permutations. Taxonomic differences at the genus level were identified by linear discriminant analysis effect size (LEfSe), with LDA score > 2.0 and P < 0.05 considered significant.
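
For readers less familiar with the diversity indices named above, the following sketch shows textbook definitions of several of them in plain Python. It is not the QIIME 2 implementation; in particular, the base-2 logarithm for Shannon and the singleton-based form of Good's coverage are assumptions, and the function names are illustrative.

```python
import math

def shannon(counts, base=2.0):
    """Shannon entropy of a count vector (log base is an assumption)."""
    n = sum(counts)
    return -sum((c / n) * math.log(c / n, base) for c in counts if c > 0)

def simpson(counts):
    """Simpson index expressed as 1 - sum(p_i^2)."""
    n = sum(counts)
    return 1.0 - sum((c / n) ** 2 for c in counts)

def pielou_evenness(counts):
    """Shannon entropy (natural log) divided by ln(observed features)."""
    s = sum(1 for c in counts if c > 0)
    return shannon(counts, math.e) / math.log(s) if s > 1 else 0.0

def goods_coverage(counts):
    """1 - (number of singleton features / total reads)."""
    singletons = sum(1 for c in counts if c == 1)
    return 1.0 - singletons / sum(counts)

def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two count vectors."""
    num = sum(abs(a - b) for a, b in zip(u, v))
    den = sum(a + b for a, b in zip(u, v))
    return num / den
```

A perfectly even community such as `[10, 10, 10, 10]` gives a Pielou's evenness of 1, and two samples with no shared features give a Bray–Curtis dissimilarity of 1.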

For bile acid profiling, univariate analysis used the Kruskal–Wallis H test to compare relative concentrations across groups, with P < 0.05 as the selection threshold. Multivariate analysis was performed using orthogonal partial least squares discriminant analysis (OPLS–DA) to visualize inter-group separation; model validity was evaluated by 999-time permutation testing, and models with Q² > 0.3 and permutation P < 0.05 were deemed reliable. Differential bile acids were defined through pairwise group comparisons using combined criteria: VIP > 1, P < 0.05, and absolute log2 fold change (|log2FC|) > 1. When different numbers of differential bile acids were obtained across group pairs, the comparison yielding the largest differential set was used as the reference for subsequent analyses.
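
The combined selection criteria can be expressed as a single filter. The sketch below is illustrative: the metabolite names, VIP scores, and P values are hypothetical, and how fold changes involving zero values were handled is not stated in the text, so the function rejects non-positive fold changes rather than guessing.

```python
import math

def is_differential(vip, p_value, fold_change):
    """Apply the combined criteria: VIP > 1, P < 0.05, |log2FC| > 1."""
    if fold_change <= 0:
        # log2 is undefined here; the handling of zeros is not described
        # in the text, so this sketch refuses to guess.
        raise ValueError("fold change must be positive")
    return vip > 1 and p_value < 0.05 and abs(math.log2(fold_change)) > 1

# Hypothetical candidates (names, VIP scores, and P values are invented).
candidates = [
    {"name": "example_A", "vip": 1.8, "p": 0.0005, "fc": 0.004},
    {"name": "example_B", "vip": 1.2, "p": 0.03, "fc": 0.472},
    {"name": "example_C", "vip": 1.4, "p": 0.01, "fc": 0.60},
]
selected = [c["name"] for c in candidates
            if is_differential(c["vip"], c["p"], c["fc"])]
```

Because |log2FC| > 1 is equivalent to a fold change below 0.5 or above 2, a fold change of 0.472 (log2 ≈ −1.08) passes the filter, whereas 0.60 does not.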

2.7 Integrated analysis and machine learning

2.7.1 Interaction network analysis

Spearman correlations (ρ) were computed between differential bile acids and bacterial genera; significant associations (|ρ|>0.5, P < 0.05) were visualized in Cytoscape v3.9.1.
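
As a reminder of what the coefficient measures, Spearman's ρ is the Pearson correlation of the rank-transformed values, with tied observations given their average rank. The sketch below implements only ρ and the |ρ| > 0.5 part of the threshold; the P value is not computed here (in practice a statistics library routine such as `scipy.stats.spearmanr` returns both). Function names are illustrative.

```python
import math

def average_ranks(values):
    """1-based ranks, with ties assigned the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average rank for the tie block i..j
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation; undefined (division by zero) for constant input."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of average ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def passes_rho_threshold(x, y, cutoff=0.5):
    """True when |rho| exceeds the cutoff (P value not assessed here)."""
    return abs(spearman_rho(x, y)) > cutoff
```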

2.7.2 Modeling and validation

All machine learning analyses were performed in Python using scikit-learn, imbalanced-learn, and XGBoost. The modeling pipeline was standardized across experiments to ensure comparability. First, clinical and omics variables were preprocessed by removing constant features and applying z-score normalization. To identify informative predictors, a Kruskal–Wallis test across BPD severity groups was performed for each variable, followed by Benjamini–Hochberg false discovery rate (FDR) correction; the top 20 features with the smallest adjusted p-values were retained as candidate inputs. To address class imbalance among BPD categories, Synthetic Minority Over-sampling Technique (SMOTE) was applied on the training data prior to model fitting. Stratified train–test partitioning was adopted (70:30 split), with the option of repeated stratified shuffling or five-fold cross-validation for robustness checks.
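
The feature-selection step, Benjamini–Hochberg adjustment followed by retention of the features with the smallest adjusted p-values, can be sketched as follows. This is a plain-Python illustration of the standard procedure, not the authors' code; the function names are hypothetical, and the raw p-values are assumed to come from the per-feature Kruskal–Wallis tests.

```python
def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def top_k_features(p_by_feature, k=20):
    """Rank features by adjusted p-value and keep the k smallest."""
    names = list(p_by_feature)
    q = benjamini_hochberg([p_by_feature[n] for n in names])
    return sorted(zip(names, q), key=lambda t: t[1])[:k]
```

For instance, raw p-values of 0.01, 0.04, 0.03, and 0.005 adjust to 0.02, 0.04, 0.04, and 0.02, respectively.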

Multiple classifiers were benchmarked, including tree-based models [Extreme Gradient Boosting (XGBoost), Random Forest (RF)], linear models [Logistic Regression (LR)], kernel methods [Support Vector Machine (SVM)], instance-based learning [k-nearest neighbors (KNN)], and Naïve Bayes (NB). For scale-sensitive algorithms, normalization was incorporated within imbalanced-learn pipelines to prevent information leakage. Hyperparameters of XGBoost and RF were tuned toward stability (e.g., depth restriction, subsampling, regularization), while ensemble modeling was further explored by implementing a soft-voting strategy that linearly combined the class-probability outputs of XGBoost and RF, with the optimal fusion weight (α) selected on a held-out validation subset by maximizing macro-averaged F1-score.
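
The soft-voting step can be illustrated with a small grid search over the fusion weight. In this sketch, α is assumed to weight the first model's probabilities and (1 − α) the second's; the grid of 11 values, the function names, and the tie-breaking rule (the first α reaching the best score is kept) are illustrative choices that are not specified in the text.

```python
def macro_f1(y_true, y_pred, n_classes):
    """Unweighted mean of per-class F1 scores."""
    scores = []
    for c in range(n_classes):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n_classes

def fuse(p_a, p_b, alpha):
    """Linear blend of two class-probability matrices."""
    return [[alpha * a + (1 - alpha) * b for a, b in zip(ra, rb)]
            for ra, rb in zip(p_a, p_b)]

def argmax(row):
    """Index of the largest entry (first one on ties)."""
    return max(range(len(row)), key=row.__getitem__)

def best_alpha(p_a, p_b, y_val, n_classes=3, steps=11):
    """Grid-search alpha on validation data by maximizing macro-F1."""
    best_score, best_a = -1.0, 0.0
    for s in range(steps):
        alpha = s / (steps - 1)
        pred = [argmax(r) for r in fuse(p_a, p_b, alpha)]
        score = macro_f1(y_val, pred, n_classes)
        if score > best_score:
            best_score, best_a = score, alpha
    return best_a, best_score
```

The weight should be chosen on a validation subset that is separate from the final test set, as described above, so that the reported test performance is not optimistically biased.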

Model evaluation was conducted on the independent test set. Primary metrics included overall accuracy, macro-averaged F1 (F1-macro), and macro-averaged one-vs-rest AUROC. For cross-validation settings, mean performance and standard deviation were reported. Robustness was further examined through repeated random splits (100 iterations) and permutation tests (1,000 label shuffles) to estimate empirical p-values. Model interpretability was assessed using SHapley Additive exPlanations (SHAP), where both global summary plots and local attribution waterfall plots were generated. Additionally, SHAP stability was quantified by repeating feature attribution across multiple resampled datasets and calculating the frequency of top-k features and their pairwise Jaccard similarity.
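
Two of the robustness quantities mentioned above have simple definitions that a short sketch can make explicit: the stability of top-k feature sets across resampled runs (selection frequency and mean pairwise Jaccard similarity) and an empirical p-value from label permutations. The function names are hypothetical, and the "+1" correction in the empirical p-value is a common convention assumed here, not one stated in the text.

```python
from collections import Counter
from itertools import combinations

def jaccard(a, b):
    """Jaccard similarity of two feature collections."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 1.0

def topk_stability(topk_lists):
    """Selection frequency per feature and mean pairwise Jaccard."""
    n = len(topk_lists)
    counts = Counter(f for feats in topk_lists for f in set(feats))
    frequency = {f: c / n for f, c in counts.items()}
    pairs = list(combinations(topk_lists, 2))
    mean_jaccard = sum(jaccard(a, b) for a, b in pairs) / len(pairs)
    return frequency, mean_jaccard

def empirical_p(observed, permuted_scores):
    """Share of permuted scores >= observed, with a +1 correction (assumed)."""
    ge = sum(s >= observed for s in permuted_scores)
    return (ge + 1) / (len(permuted_scores) + 1)

# Hypothetical top-3 lists from three resampled runs (illustrative only).
runs = [
    ["CDCA", "HCA", "Brevundimonas"],
    ["CDCA", "HCA", "Delftia"],
    ["CDCA", "DLCA", "Brevundimonas"],
]
frequency, mean_jaccard = topk_stability(runs)
```

In this toy example CDCA appears in every run (frequency 1.0) and the mean pairwise Jaccard similarity is 0.4.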

3 Results

3.1 Clinical characteristics

A total of 50 preterm infants were enrolled: Non-BPD (n = 17), non-severe BPD (n = 11; including one mild and 10 moderate cases), and severe BPD (n = 22). Baseline characteristics, including sex, mode of delivery, maternal infection, obstetric medications (magnesium sulfate, steroids, and antibiotics), pregnancy complications, early antibiotic exposure, and both the timing and type of feeding, did not differ significantly among groups (all P > 0.05). Gestational age and birth weight decreased with increasing BPD severity (Non-BPD: 32.0 weeks, 1,760 g; non-severe BPD: 29.9 weeks, 1,510 g; severe BPD: 29.2 weeks, 1,075 g; P < 0.05). The proportion of singletons was highest in the non-severe BPD group (100%) and lowest in the Non-BPD group (58.8%) (Table 1).


Table 1. Baseline characteristics of the study population.

3.2 Alterations in bile acid profiles

To investigate early metabolic differences between preterm infants with and without BPD of varying severity, we performed multigroup multivariate statistical analyses. Orthogonal partial least squares discriminant analysis (OPLS–DA) revealed the clearest separation between the NonBPD7 and BPD7s groups (see Supplementary Figures S1–S3 for the three-group model and additional pairwise results). Component 1 and Component 2 explained 20 and 30% of the total variance, respectively (cumulative 50%). In the score plot, NonBPD7 samples clustered mainly within the negative range of Component 1, whereas BPD7s samples were distributed in the positive range, with additional separation along Component 2, reflecting distinct metabolic profiles. The model showed a satisfactory fit and predictive capacity (R²X = 0.50, R²Y = 0.585, Q² = 0.442). Permutation tests (200 iterations) confirmed model reliability, as both Q² and R²Y values of the original model were significantly higher than those from permuted models (P < 0.005; Figures 1A, B).


Figure 1. Targeted bile acid metabolomics in preterm infants with different BPD severities. (A) OPLS-DA score plot showing clear separation between NonBPD7 and Severe BPD (BPD7s) groups (Component 1: 20%; Component 2: 30%). (B) Permutation test validating model reliability (y-axis: frequency of permuted RY2 and Q2 values). (C) Volcano plot identifying 16 significantly downregulated bile acids in BPD7s compared with NonBPD7 (VIP > 1, P < 0.05, |log2FC| > 1). (D) Violin plots with overlaid boxplots showing representative metabolites. BPD, bronchopulmonary dysplasia; NonBPD7, non-BPD group at postnatal day 7; BPD7m, non-severe BPD group (mild + moderate) at postnatal day 7; BPD7s, severe BPD group at postnatal day 7; 3-oxo-DCA, 3-oxodeoxycholic acid; CA-3S, cholic acid-3-sulfate; CDCA-3S, chenodeoxycholic acid-3-sulfate; CA, cholic acid; CDCA, chenodeoxycholic acid; HCA, hyocholic acid; 3-oxo-CA, 3-oxocholic acid; coproCA, coprocholic acid; 7-KLCA, 7-ketolithocholic acid; 7-KDCA, 7-ketodeoxycholic acid; NCA, norcholic acid; HDCA, hyodeoxycholic acid; 3β-CA, 3β-cholic acid; DLCA, deoxylithocholic acid; ACA, allocholic acid; DCA, deoxycholic acid; VIP, variable importance in projection; log2FC, log2 fold change. Primary bile acids: CA, CDCA, CA-3S, and CDCA-3S; Secondary bile acids: 3-oxo-DCA, HCA, 3-oxo-CA, coproCA, 7-KLCA, 7-KDCA, NCA, HDCA, 3β-CA, DLCA, ACA, and DCA.

Differential metabolite screening identified 16 bile acids significantly downregulated in BPD7s compared with NonBPD7, based on VIP > 1, P < 0.05, and |log2FC| > 1. These comprised four primary bile acids, namely cholic acid (CA), chenodeoxycholic acid (CDCA), and their sulfated conjugates CA-3S and CDCA-3S, and 12 secondary bile acids, including deoxycholic acid (DCA), deoxylithocholic acid (DLCA), hyocholic acid (HCA), 3-oxocholic acid (3-oxo-CA), 3β-cholic acid (3β-CA), 7-ketolithocholic acid (7-KLCA), norcholic acid (NCA), hyodeoxycholic acid (HDCA), allocholic acid (ACA), and coprocholic acid (coproCA), among others. Fold changes for these metabolites ranged from 0.004 to 0.472 (Figure 1C).

To visualize the distribution patterns of key differential metabolites among the groups, violin plots with overlaid boxplots were generated (Figure 1D). The results revealed significant inter-group differences in multiple bile acids, including primary bile acids (CA and CDCA) and secondary bile acids (7-KDCA, 7-KLCA, HCA, HDCA, ACA, and coproCA). Notably, 7-ketodeoxycholic acid (7-KDCA), 7-ketolithocholic acid (7-KLCA), and HCA were undetectable in most BPD7s samples, with median values of zero, which was consistent with the quantitative results in Supplementary Table S1. 7-KDCA showed the largest reduction (fold change = 0.004), accompanied by a high VIP score and a highly significant P value (P < 0.001). Overall, these differential metabolites demonstrated strong discriminatory potential across BPD severity levels.

Other pairwise comparisons, such as between NonBPD7 and BPD7m, also revealed metabolic differences, but both the number of altered metabolites and the magnitude of change were smaller than in the NonBPD7 vs. BPD7s comparison.

3.3 Gut microbiota diversity and differential taxa

To characterize the gut microbiota on postnatal day 7 across BPD severities, we first assessed sequencing depth. Rarefaction curves plateaued in all groups and Good's coverage exceeded 0.99, indicating sufficient sampling (Supplementary Figure S4). Six α-diversity metrics—Chao1 (P = 0.17), Observed species (P = 0.12), Shannon (P = 0.15), Simpson (P = 0.13), Pielou's evenness (P = 0.13), and Good's coverage (P = 0.18)—did not differ among NonBPD7, BPD7m, and BPD7s (Figure 2A).


Figure 2. Gut microbiota composition on postnatal day 7 across BPD severities. (A) α-diversity indices (Chao1, Observed species, Shannon, Simpson, Pielou's evenness, Good's coverage) showed no significant differences among groups. (B) Principal coordinate analysis (PCoA) based on Bray–Curtis distances revealed clearer separation, while (C) unweighted UniFrac PCoA showed modest group differences. (D) Boxplots of log-transformed abundances of representative genera enriched in BPD7s. (E) LEfSe analysis identified ten genera with significant enrichment in BPD7s (LDA >2.0, P < 0.05), including Brevundimonas, Burkholderia, Delftia, Achromobacter, Ochrobactrum, Mycobacterium, Bradyrhizobium, Sphingomonas, Methylobacterium, and Rhodococcus. BPD, bronchopulmonary dysplasia; NonBPD7, non-BPD group at postnatal day 7; BPD7m, non-severe BPD group (mild + moderate) at postnatal day 7; BPD7s, severe BPD group at postnatal day 7; PCoA, principal coordinate analysis; LDA, linear discriminant analysis; LEfSe, linear discriminant analysis effect size. The symbols indicate statistical significance levels for the Dunn test as follows: *P < 0.05, **P < 0.01, and ***P < 0.001.

β-diversity (PERMANOVA) based on Bray–Curtis and unweighted UniFrac distances revealed overall between-group differences (Bray–Curtis: R2 = 0.1137, P = 0.001; unweighted UniFrac: R2 = 0.0551, P = 0.041; Supplementary Table S2; Supplementary Figures S5, S6), with clearer separation under Bray–Curtis. In the Bray–Curtis PCoA (Figure 2B), PC1 and PC2 explained 35.1 and 11.8% of the variance, respectively; NonBPD7 and BPD7m samples clustered mainly at positive PC1 values, whereas BPD7s samples were more dispersed. Ninety-five percent confidence ellipses supported the separation. For unweighted UniFrac (Figure 2C), PC1 and PC2 explained 15.0 and 10.8% of the variance.

LEfSe identified ten genera enriched in BPD7s (LDA>2.0, P < 0.05; Figure 2E): Brevundimonas, Burkholderia, Delftia, Achromobacter, Ochrobactrum, Mycobacterium, Bradyrhizobium, Sphingomonas, Methylobacterium, and Rhodococcus. Log-transformed abundances of these genera are shown as boxplots (Figure 2D). To confirm the robustness of these findings, Kruskal–Wallis tests were performed across the same taxa, indicating significant differences among groups (P < 0.05). Dunn's post-hoc tests showed the largest contrasts between NonBPD7 and BPD7s (all P < 0.05), followed by BPD7m vs. BPD7s (7/10 genera, P < 0.05), with no differences between NonBPD7 and BPD7m. Abundances generally followed BPD7s > BPD7m > NonBPD7.

Functional prediction analysis using PICRUSt2 was further performed to infer microbial metabolic potential. A total of 170 KEGG pathways were identified across all samples (Supplementary Table S3). While several metabolic pathways showed groupwise differences, pathways associated with bile acid metabolism, including primary and secondary bile acid biosynthesis (ko00120 and ko00121), did not differ significantly among groups.

3.4 Interactions between bile acids and gut microbiota

Spearman correlation analysis between the differentially abundant bile acids and microbial genera identified previously revealed a general pattern of negative associations, as shown in the heatmap (Figure 3A). Notably, the genus Brevundimonas was negatively correlated with 15 out of the 16 bile acids examined, while several bile acids—including CDCA-3S, HCA, CA, and CDCA—also exhibited consistent negative correlations with all the differential bacterial genera. Applying correlation thresholds of |ρ|>0.5 and P < 0.05, we found that multiple primary and secondary bile acids—namely CA, CDCA, CDCA-3S, coproCA, 7-KLCA, 7-KDCA, HDCA, and HCA—were significantly negatively correlated with proteobacterial genera such as Brevundimonas and Delftia (Supplementary Table S4). Among these, the strongest negative correlations were observed between CA and Brevundimonas (ρ = −0.64), HCA and Delftia (ρ = −0.64), and CDCA and Brevundimonas (ρ = −0.62). Correlation network analysis (Figure 3B) further indicated that Brevundimonas occupies a central position within the network, exhibiting direct interactions with all eight bile acids included in the analysis.


Figure 3. Correlation analysis of bile acids and microbial genera. (A) Spearman correlation heatmap showing predominantly negative associations between differential bile acids and bacterial genera. (B) Correlation network highlighting Brevundimonas as a central node linked to multiple bile acids. Significant correlations were defined as |ρ|>0.5 and P < 0.05. Cells marked with “ × ” indicate non-significant correlations (P≥0.05). The color scale represents the Spearman correlation coefficient (ρ), ranging from −1 (blue, strong negative) to +1 (red, strong positive). BPD, bronchopulmonary dysplasia; NonBPD7, non-BPD group at postnatal day 7; BPD7m, non-severe BPD group (mild + moderate) at postnatal day 7; BPD7s, severe BPD group at postnatal day 7; 3-oxo-DCA, 3-oxodeoxycholic acid; CA-3S, cholic acid-3-sulfate; CDCA-3S, chenodeoxycholic acid-3-sulfate; CA, cholic acid; CDCA, chenodeoxycholic acid; HCA, hyocholic acid; 3-oxo-CA, 3-oxocholic acid; coproCA, coprocholic acid; 7-KLCA, 7-ketolithocholic acid; 7-KDCA, 7-ketodeoxycholic acid; NCA, norcholic acid; HDCA, hyodeoxycholic acid; 3β-CA, 3β-cholic acid; DLCA, deoxylithocholic acid; ACA, allocholic acid; DCA, deoxycholic acid; ρ, Spearman correlation coefficient.

3.5 Machine learning-based prediction modeling

Using Kruskal–Wallis testing with FDR adjustment, 20 differential features (bile acids and Proteobacteria-related genera) were selected. Class imbalance in the training set was addressed with SMOTE. Global feature importance and SHAP analysis (Figure 4A) highlighted bile acids such as CDCA, HCA, and DLCA, together with bacterial taxa including unclassified Enterobacteriaceae, Brevundimonas, and Delftia, as the major contributors to the three-class discrimination.

Figure 4. Machine learning models for BPD severity classification. (A) Global feature importance and SHAP summary plot for the selected feature set. (B) Confusion matrix of the XGBoost classifier on the independent test set (rows: true labels; columns: predicted labels). (C) Confusion matrix of the Random Forest classifier on the independent test set (same label order as in panel B). (D–F) Individual SHAP waterfall plots illustrating feature contributions to representative cases from each class (D = NonBPD7, E = BPD7m, F = BPD7s). Models were trained with SMOTE; evaluation used five-fold stratified cross-validation and an independent test set. Class labels were encoded as 0 = NonBPD7, 1 = BPD7m, and 2 = BPD7s. BPD, bronchopulmonary dysplasia; NonBPD7, non-BPD group at postnatal day 7; BPD7m, non-severe BPD group (mild + moderate) at postnatal day 7; BPD7s, severe BPD group at postnatal day 7; SHAP, SHapley Additive exPlanations; CDCA, chenodeoxycholic acid; HCA, hyocholic acid; DLCA, deoxylithocholic acid; 7-KDCA, 7-ketodeoxycholic acid; 3β-CA, 3β-cholic acid; coproCA, coprocholic acid; 3-oxo-CA, 3-oxocholic acid; CA, cholic acid; CDCA-3S, chenodeoxycholic acid-3-sulfate; 7-KLCA, 7-ketolithocholic acid; ACA, allocholic acid.

Tree-based models performed best in five-fold cross-validation: RF achieved a macro-F1 of 0.78 and XGBoost 0.76, with corresponding AUCs of 0.93 and 0.91, outperforming LR, NB, KNN, and SVM (AUC < 0.90) (Supplementary Table S5). On the independent test set (70/30 split), both models reached 80% accuracy. Recall for the non-BPD class approached 100%, whereas misclassification occurred primarily between the non-severe (mild + moderate) and severe BPD classes (Figures 4B, C). Model fusion did not yield further gains (Supplementary Figures S7, S8). Of the two tree-based classifiers, XGBoost was chosen for subsequent SHAP analysis owing to its consistently robust cross-validation performance, reliable probability calibration, and well-established interpretability in biomedical applications.
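To make the reported metrics concrete, the following sketch derives accuracy, per-class recall, and macro-F1 from a three-class confusion matrix laid out as in Figure 4 (rows: true labels; columns: predicted labels; 0 = NonBPD7, 1 = BPD7m, 2 = BPD7s). The counts are invented for illustration and are not read from Figures 4B or 4C; the function name is hypothetical.

```python
def classification_summary(confusion):
    """Return (accuracy, per-class recall, macro-F1) for a square matrix
    whose rows are true labels and whose columns are predicted labels."""
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n_classes))
    recalls, f1_scores = [], []
    for i in range(n_classes):
        tp = confusion[i][i]
        actual = sum(confusion[i])                    # true members of class i
        predicted = sum(row[i] for row in confusion)  # samples predicted as i
        recall = tp / actual if actual else 0.0
        precision = tp / predicted if predicted else 0.0
        denom = precision + recall
        f1_scores.append(2 * precision * recall / denom if denom else 0.0)
        recalls.append(recall)
    # Macro-F1 is the unweighted mean of the per-class F1 scores.
    return correct / total, recalls, sum(f1_scores) / n_classes

# Hypothetical test-set counts (NOT the study's results).
example = [
    [6, 0, 0],  # true NonBPD7
    [0, 3, 2],  # true BPD7m
    [0, 1, 3],  # true BPD7s
]

accuracy, recalls, macro_f1 = classification_summary(example)
print(f"accuracy={accuracy:.2f}, recalls={recalls}, macro-F1={macro_f1:.3f}")
```

Macro-F1 weights the three classes equally, so errors between the two BPD classes lower it even when the non-BPD class is classified perfectly.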

Robustness checks, including repeated stratified splits and permutation testing, supported performance exceeding chance (p ≈ 0.001). Stability analysis further confirmed that the set of top-20 SHAP-ranked features was highly reproducible (mean Jaccard ≈ 1.00) (Supplementary Figures S9–S11). Individual-level SHAP waterfall plots (Figures 4D–F) revealed that alterations in bile acids and specific bacterial taxa were critical determinants for class prediction, consistent with the global SHAP ranking. Notably, CDCA, HCA, and Brevundimonas consistently emerged as the most stable contributors across all resampling runs (Supplementary Figure S8).
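The two robustness quantities mentioned above can be sketched in a few lines. The example assumes that stability is summarized as the mean pairwise Jaccard index between top-ranked feature sets from different resampling runs, and that the permutation p-value uses the common (k + 1)/(n + 1) estimator; neither detail is stated in the text, and all inputs below are made up.

```python
from itertools import combinations

def jaccard(a, b):
    """Jaccard index of two non-empty collections: |A & B| / |A | B|."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)

def mean_pairwise_jaccard(feature_sets):
    """Average Jaccard index over every pair of feature sets."""
    pairs = list(combinations(feature_sets, 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)

def permutation_p_value(observed, permuted_scores):
    """One-sided p-value: share of label-permuted scores >= observed,
    with +1 in numerator and denominator so p is never exactly zero."""
    exceed = sum(1 for score in permuted_scores if score >= observed)
    return (exceed + 1) / (len(permuted_scores) + 1)

# Hypothetical top-3 feature sets from three resampling runs.
runs = [
    ["CDCA", "HCA", "Brevundimonas"],
    ["HCA", "CDCA", "Brevundimonas"],
    ["CDCA", "HCA", "Delftia"],
]
stability = mean_pairwise_jaccard(runs)

# Hypothetical scores from 999 label permutations, all below the observed one.
permuted = [0.30 + 0.0004 * i for i in range(999)]
p_value = permutation_p_value(0.80, permuted)
print(f"mean Jaccard={stability:.3f}, permutation p={p_value:.3f}")
```

Under this estimator, the smallest attainable p-value with 999 permutations is 0.001, which is one way a value of p ≈ 0.001 can arise when no permuted score reaches the observed one.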

4 Discussion

This study integrated analyses of clinical features, fecal bile acid profiles, and gut microbiome structure in 50 preterm infants. We confirmed that the severity of bronchopulmonary dysplasia (BPD) was inversely correlated with gestational age and birth weight, consistent with previous reports (Klinger et al., 2013; Geetha et al., 2021). These findings not only reaffirm that developmental immaturity is a prerequisite for BPD, but also suggest that such immaturity may increase susceptibility to microbiome–metabolite dysregulation by compromising the host's ability to maintain microenvironmental homeostasis. We focused on the 7th day after birth as a critical time point—a period within the initial colonization window (typically the first 1–2 weeks of life), during which the gut microbiome exhibits high volatility (Ning et al., 2025; Zhang, 2022). At this early stage, infants who later developed severe BPD already showed significant differences in metabolic and microbial profiles compared to non-BPD infants. This temporal alignment suggests that early microenvironmental disruption is not merely a consequence of disease progression, but may actively contribute to pathological processes during a critical period of lung development. This relationship is particularly salient in preterm infants, in whom immature gut barrier function—marked by reduced expression of tight junction proteins such as occludin (Golubkova and Hunter, 2023)—coincides with delayed hepatic maturation of UDP-glucuronosyltransferase activity (Kawade and Onishi, 1981), impairing metabolic clearance. Against a backdrop of immature immune function (Collado et al., 2015), these factors collectively diminish the host's capacity to maintain homeostasis in response to microbial and metabolic challenges.

The intestinal microbiota plays a central role in maintaining bile-acid compositional homeostasis by reshaping the host bile-acid pool through deconjugation, dehydroxylation, and dehydrogenation reactions (Ikegami and Honda, 2018). Bile salt hydrolases are widely distributed across bacterial phyla, while C-7 hydroxysteroid dehydrogenases (7α/7β-HSDH) and the downstream 7α-dehydroxylation pathway are predominantly encoded by obligate anaerobes within the class Clostridia (e.g., Clostridium, Eubacterium); members of Bacteroidetes (e.g., Bacteroides spp.) have also been reported to harbor BSH and certain HSDH activities (Doden and Ridlon, 2021). These enzymes oxidize primary bile acids to 7-keto intermediates (e.g., 7-oxo-DCA, 7-oxo-LCA), which are subsequently converted by bai-operon–positive strains via 7α-dehydroxylation into secondary bile acids such as DCA and LCA (Wise and Cummings, 2022). In the metabolic profiles of infants with severe BPD, we observed an overall reduction in bile acid levels. Notably, 7-keto bile acids such as 7-KDCA and 7-KLCA were largely undetectable or present at low abundance within the current detection limits. This pattern is consistent with a potential deficiency in specific microbial metabolic functions. However, it may also reflect a general reduction in the bile acid pool size and/or the immaturity of relevant microbial niches in early life. Therefore, these observations cannot be solely attributed to overall changes in microbial abundance (Doden and Ridlon, 2021; Guzior and Quinn, 2021; Bennett et al., 2003).

Concurrently, infants with severe BPD exhibited significantly reduced levels of sulfated bile acids. Sulfation enhances bile acid solubility, facilitating excretion via urine and feces, reducing enterohepatic recirculation, and thereby helping to regulate the total bile acid pool and limit the accumulation of toxic hydrophobic bile acids (Alnouti, 2009). This reduction may reflect impaired host sulfotransferase activity or enhanced microbial desulfation (Assem et al., 2004; Ridlon et al., 2016), but it could equally reflect broader influences such as a reduced overall bile acid pool or altered excretion kinetics.

Our findings reveal a broad reduction in fecal bile acid levels in infants who later developed severe BPD. This observation aligns with the established role of bile acids as signaling molecules that may influence immune and inflammatory pathways via receptors such as FXR and TGR5 (De Luca et al., 2022; Li et al., 2015). While preclinical models suggest that physiological bile acid signaling helps regulate pulmonary inflammation and vascular integrity (De Luca et al., 2022; Cao et al., 2024; Wang et al., 2011), our study cannot establish a direct mechanistic link in humans. Instead, our data provide novel clinical evidence of an association between early-life bile acid deficiency and subsequent BPD development. This deficiency could hypothetically impair FXR- and TGR5-mediated anti-inflammatory and pro-developmental signals, thereby increasing susceptibility to lung injury, but this requires functional validation. It is also important to note that while abnormal pulmonary accumulation of bile acids has been shown to exert direct cytotoxic effects (Zecca et al., 2008; De Luca et al., 2022; Chen et al., 2016), our metabolomic analysis of fecal samples suggests that systemic deficiency, rather than excess, is the predominant early-life alteration associated with BPD risk. We therefore propose that an early reduction in bile acids may represent a novel, clinically relevant risk marker for BPD, and that the potential disruption of bile acid signaling pathways represents a plausible, though not yet proven, contributing mechanism to its pathogenesis.

Consistent with the metabolic findings, infants who later developed severe BPD already exhibited significant alterations in gut microbiome structure (beta diversity) by day 7 of life, although alpha diversity remained comparable to that of the non-BPD group. Notably, we observed an increased abundance of several opportunistic genera (Brevundimonas, Delftia, and Achromobacter) previously associated with neonatal sepsis and other serious infections (Viswanathan et al., 2010; Scaglione et al., 2025; Ozden Turel, 2013). Emerging evidence suggests that colonization by these taxa may impair intestinal barrier and immune function: aberrant microbial colonization in preterm infants has been linked to disrupted immune maturation (Yao et al., 2023); experimental work indicates that intestinal colonization with Brevundimonas vesicularis can aggravate inflammatory responses and reduce the efficacy of immunomodulatory treatments (Liu et al., 2023); and studies of milk microbiota have reported correlations between overgrowth of Brevundimonas and metabolic or inflammatory dysregulation (Liu et al., 2024). These findings align with our results and suggest that such organisms may contribute to early disruptions in intestinal homeostasis, potentially influencing BPD pathogenesis.

Integrated analysis revealed broad negative correlations between several downregulated bile acids and the enriched bacterial taxa. This aligns with the established view that reduced bile acid levels diminish antimicrobial pressure, thereby providing a competitive advantage to opportunistic pathogens (Larabi et al., 2023; Lynch et al., 2023). Notably, Brevundimonas occupied a central role in the bile acid–microbiota interaction network, showing negative correlations with all differential bile acids. This suggests that it may not merely be a passive consequence of bile acid deficiency, but may also contribute to driving dysbiosis and amplifying inflammatory responses.

Based on these findings, we developed and validated an early prediction model integrating bile acid and microbial features to stratify risk and identify severe BPD as early as day 7 after birth. Using five-fold cross-validation and an independent test set (70/30 split), both XGBoost and RF models demonstrated consistent performance, with overall accuracy around 80% and a maximum AUC of 0.91 for severe BPD. Recall for the non-BPD class approached 100%, indicating that non-BPD infants were rarely misclassified as BPD. These results are consistent with recent similar studies (Choi et al., 2025).

Compared to previous models primarily reliant on conventional clinical indicators such as gestational age and birth weight (Choi et al., 2025; Tao et al., 2024), we developed a multi-omics-based interpretable model that not only demonstrates superior predictive performance but also provides mechanistic insights into disease phenotypes by integrating bile acid and microbial features. SHAP-based interpretability analysis revealed that the top-ranking features contributing to model predictions overlapped with previously identified differential metabolites and microbial genera. In particular, CDCA, HCA, and Brevundimonas emerged as the most stable contributors across repeated resampling, while additional factors such as DLCA and Delftia also showed relevance. Although the sets of features highlighted by SHAP (which reflects each feature's contribution to model output) and by Spearman correlation analysis (which captures monotonic, rank-based associations between variables) are not identical, both methods underscore the importance of several key biomarkers. This finding aligns with the common use of SHAP for identifying critical biomarkers in multi-omics studies (Liu et al., 2025), and further supports the potential role of these factors in BPD pathogenesis.

Furthermore, robustness evaluations, including permutation tests and feature stability analysis, demonstrated that the model significantly outperformed random chance and that the most influential features were reproducible across subsampling. This study shows that integrating fecal bile acid metabolome and gut microbiome data within an interpretable tree-based modeling framework (XGBoost–SHAP) yields an early discrimination strategy with both high predictive performance and biological interpretability. This approach aligns with advanced multi-omics strategies for studying neonatal diseases (Chen et al., 2024; Xu, 2022; Ding et al., 2025) and highlights the potential of this methodology for very early risk prediction and mechanistic investigation in BPD.

In summary, this study revealed an early co-occurrence of bile acid homeostasis disruption and microbial dysbiosis within the 7-day postnatal window, and demonstrated the feasibility of an interpretable machine-learning model for early BPD risk prediction.

This study has several limitations. First, the modest sample size (n = 50) and single-center design constrain statistical power and may introduce selection bias, limiting external validity and generalizability. Second, the observational nature of the study precludes causal inference between bile acid–microbiota interactions and BPD severity, and residual confounding cannot be fully excluded. Third, although the machine-learning models performed well under internal validation, their transportability requires confirmation in larger, multicenter prospective cohorts, including rigorous assessments of calibration and overfitting. Finally, 16S rRNA gene sequencing offers limited species-level resolution and provides only indirect functional inference, which may restrict interpretation of microbial functional potential relative to shotgun metagenomics or metatranscriptomics.

5 Conclusion

In a clinical cohort of preterm infants, we show that severe BPD is preceded by early disruption of the gut bile acid–microbiota axis, characterized by a marked reduction in bile acid levels and enrichment of opportunistic proteobacterial taxa. These findings provide clinical support for the gut–lung axis framework, implicating bile acid–microbiota crosstalk in the early pathogenesis of BPD. As illustrated in Figure 5, this conceptual model summarizes the proposed gut bile acid-microbiota-lung axis and its potential role in BPD pathogenesis. Building on this biology, an interpretable multi-omics model (XGBoost with SHAP) accurately predicted BPD severity and prioritized bile acids (e.g., CDCA, HCA) and the genus Brevundimonas as the most stable contributors. The work identifies candidate biomarker panels for early risk stratification and offers mechanistic insight beyond traditional inflammation-centric views. Future studies should validate these results in large, prospective, multicenter cohorts and evaluate targeted interventions that modulate the bile acid–microbiota axis for prevention or attenuation of BPD.

Figure 5. Schematic of the gut bile acid-microbiota axis and BPD risk in preterm infants. At day 7 of life, all infants born ≤ 32 weeks share prematurity-related immaturity (reduced intestinal barrier [Occludin↓], delayed hepatic UGT activity, and an immature immune system), which constitutes a common background vulnerability. (Left) Despite this baseline, some infants remain relatively balanced; microbial BSH, 7α/7β-HSDH, and the bai operon sustain conversion of CA/CDCA into 7-keto intermediates and secondary bile acids, preserving a balanced bile acid pool and FXR/TGR5-mediated anti-inflammatory signaling. (Right) Under the same baseline, a subset shows early dysregulation with overgrowth of opportunistic taxa (e.g., Brevundimonas, Delftia, Achromobacter), loss of key anaerobic functions, and marked reductions across the bile-acid spectrum (including primary, secondary, sulfated, and 7-oxo/7-keto species), resulting in a depleted bile-acid pool and weakened FXR/TGR5 signaling, thereby increasing inflammatory susceptibility and BPD risk during lung development. This is a conceptual model and does not imply proven causality. Arrow thickness represents the relative size of the bile-acid pool (thinner = reduced pool). BA, bile acid; SBA, secondary bile acid; CA, cholic acid; CDCA, chenodeoxycholic acid; CA-3S, cholic acid-3-sulfate; CDCA-3S, chenodeoxycholic acid-3-sulfate; UGT, UDP-glucuronosyltransferase; BSH, bile salt hydrolases; HSDH, hydroxysteroid dehydrogenases; FXR, Farnesoid X Receptor; TGR5, Takeda G-protein-coupled Receptor 5; BPD, bronchopulmonary dysplasia.

Data availability statement

The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found at: https://www.ncbi.nlm.nih.gov/, PRJNA1328652.

Ethics statement

The studies involving humans were approved by Ethics Committee of the Second Hospital of Jilin University. The studies were conducted in accordance with the local legislation and institutional requirements. Written informed consent for participation in this study was provided by the participants' legal guardians/next of kin.

Author contributions

HY: Conceptualization, Formal analysis, Writing – original draft, Writing – review & editing. YG: Data curation, Investigation, Writing – review & editing. JL: Data curation, Visualization, Writing – review & editing. RF: Investigation, Validation, Writing – review & editing. YZ: Funding acquisition, Methodology, Supervision, Validation, Writing – review & editing. WG: Conceptualization, Project administration, Writing – review & editing.

Funding

The author(s) declare that financial support was received for the research and/or publication of this article. This work was supported by the Jilin Provincial Department of Science and Technology (Grant No. YDZJ202202CXJD045).

Acknowledgments

We thank all participants and their families for their invaluable contributions to this study. Figure 5 was created using Figdraw (https://www.figdraw.com).

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The author(s) declare that no Gen AI was used in the creation of this manuscript.

Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.

Publisher's note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fmicb.2025.1705965/full#supplementary-material

References

Alnouti, Y. (2009). Bile acid sulfation: a pathway of bile acid elimination and detoxification. Toxicol. Sci. 108, 225–246. doi: 10.1093/toxsci/kfn268

Assem, M., Schuetz, E. G., Leggas, M., Sun, D., Yasuda, K., Reid, G., et al. (2004). Interactions between hepatic Mrp4 and Sult2a as revealed by the constitutive androstane receptor and Mrp4 knockout mice. J. Biol. Chem. 279, 22250–22257. doi: 10.1074/jbc.M314111200

Bennett, M. J., McKnight, S. L., and Coleman, J. P. (2003). Cloning and characterization of the NAD-dependent 7alpha-hydroxysteroid dehydrogenase from Bacteroides fragilis. Curr. Microbiol. 47, 475–484. doi: 10.1007/s00284-003-4079-4

Bray, J. R., and Curtis, J. T. (1957). An ordination of the upland forest communities of southern Wisconsin. Ecol. Monogr. 27, 325–349. doi: 10.2307/1942268

Callahan, B. J., McMurdie, P. J., Rosen, M. J., Han, A. W., Johnson, A. J. A., Holmes, S. P., et al. (2016). DADA2: high-resolution sample inference from Illumina amplicon data. Nat. Methods 13, 581–583. doi: 10.1038/nmeth.3869

Cao, Y., Xu, Y., Zhou, J., Fu, X., Zhang, H., Du, X., et al. (2024). Farnesoid X receptor (FXR) as a potential therapeutic target for lung diseases: a narrative review. J. Thorac. Dis. 16, 8026–8038. doi: 10.21037/jtd-24-734

Chao, A. (1984). Nonparametric estimation of the number of classes in a population. Scand. J. Stat. 11, 265–270.

Chen, B., Cai, H. R., Xue, S., You, W. J., Liu, B., Jiang, H. D., et al. (2016). Bile acids induce activation of alveolar epithelial cells and lung fibroblasts through farnesoid X receptor-dependent and independent pathways. Respirology 21, 1075–1080. doi: 10.1111/resp.12815

Chen, S. M., Lin, C. P., and Jan, M. S. (2021). Early gut microbiota changes in preterm infants with bronchopulmonary dysplasia: a pilot case-control study. Am. J. Perinatol. 38, 1142–1149. doi: 10.1055/s-0040-1710554

Chen, W., Zhang, P., Zhang, X., Xiao, T., Zeng, J., Guo, K., et al. (2024). Machine learning-causal inference based on multi-omics data reveals the association of altered gut bacteria and bile acid metabolism with neonatal jaundice. Gut Microbes 16:2388805. doi: 10.1080/19490976.2024.2388805

Choi, H. J., Lee, G., Shin, S. H., Lee, S. M., Lee, H. C., Sohn, J. A., et al. (2025). Development and external validation of a machine learning model to predict bronchopulmonary dysplasia using dynamic factors. Sci. Rep. 15:13620. doi: 10.1038/s41598-025-98087-9

Collado, M. C., Cernada, M., Neu, J., Perez-Martinez, G., Gormaz, M., Vento, M., et al. (2015). Factors influencing gastrointestinal tract and microbiota immune interaction in preterm infants. Pediatr. Res. 77, 726–731. doi: 10.1038/pr.2015.54

Dankhara, N., Holla, I., Ramarao, S., and Kalikkot Thekkeveedu, R. (2023). Bronchopulmonary dysplasia: Pathogenesis and pathophysiology. J. Clin. Med. 12:4207. doi: 10.3390/jcm12134207

De Luca, D., Alonso, A., and Autilio, C. (2022). Bile acid-induced lung injury: update of reverse translational biology. Am. J. Physiol. Lung Cell Mol. Physiol. 323, L93–L106. doi: 10.1152/ajplung.00523.2021

Ding, J., Xu, J., Wu, H., Li, M., Xiao, Y., Fu, J., et al. (2025). The cross-talk between the metabolome and microbiome in a double-hit neonatal rat model of bronchopulmonary dysplasia. Genomics 117:110969. doi: 10.1016/j.ygeno.2024.110969

Doden, H. L., and Ridlon, J. M. (2021). Microbial hydroxysteroid dehydrogenases: From alpha to omega. Microorganisms 9:469. doi: 10.3390/microorganisms9030469

Fleishman, J. S., and Kumar, S. (2024). Bile acid metabolism and signaling in health and disease: molecular mechanisms and therapeutic targets. Signal Transduct. Target Ther. 9:97. doi: 10.1038/s41392-024-01811-6

Geetha, O., Rajadurai, V. S., Anand, A. J., Dela Puerta, R., Huey Quek, B., Khoo, P. C., et al. (2021). New BPD-prevalence and risk factors for bronchopulmonary dysplasia/mortality in extremely low gestational age infants ≤ 28 weeks. J. Perinatol. 41, 1943–1950. doi: 10.1038/s41372-021-01095-6

Godlewska, U., Bulanda, E., and Wypych, T. P. (2022). Bile acids in immunity: bidirectional mediators between the host and the microbiota. Front. Immunol. 13:949033. doi: 10.3389/fimmu.2022.949033

Golubkova, A., and Hunter, C. J. (2023). Development of the neonatal intestinal barrier, microbiome, and susceptibility to NEC. Microorganisms 11:1247. doi: 10.3390/microorganisms11051247

Good, I. J. (1953). The population frequencies of species and the estimation of population parameters. Biometrika 40, 237–264. doi: 10.1093/biomet/40.3-4.237

Guzior, D. V., and Quinn, R. A. (2021). Review: microbial transformations of human bile acids. Microbiome 9:140. doi: 10.1186/s40168-021-01101-1

Higgins, R. D., Jobe, A. H., Koso-Thomas, M., Bancalari, E., Viscardi, R. M., Hartert, T. V., et al. (2018). Bronchopulmonary dysplasia: executive summary of a workshop. J. Pediatr. 197, 300–308. doi: 10.1016/j.jpeds.2018.01.043

Hwang, J. K., Kim, D. H., Na, J. Y., Son, J., Oh, Y. J., Jung, D., et al. (2023). Two-stage learning-based prediction of bronchopulmonary dysplasia in very low birth weight infants: a nationwide cohort study. Front. Pediatr. 11:1155921. doi: 10.3389/fped.2023.1155921

Ikegami, T., and Honda, A. (2018). Reciprocal interactions between bile acids and gut microbiota in human liver diseases. Hepatol. Res. 48, 15–27. doi: 10.1111/hepr.13001

Katoh, K., Misawa, K., Kuma, K., and Miyata, T. (2002). MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res. 30, 3059–3066. doi: 10.1093/nar/gkf436

Kawade, N., and Onishi, S. (1981). The prenatal and postnatal development of UDP-glucuronyltransferase activity towards bilirubin and the effect of premature birth on this activity in the human liver. Biochem. J. 196, 257–260. doi: 10.1042/bj1960257

Klinger, G., Sokolover, N., Boyko, V., Sirota, L., Lerner-Geva, L., Reichman, B., et al. (2013). Perinatal risk factors for bronchopulmonary dysplasia in a national cohort of very-low-birthweight infants. Am. J. Obstet. Gynecol. 208:115.e1–9. doi: 10.1016/j.ajog.2012.11.026

Larabi, A. B., Masson, H. L. P., and Baumler, A. J. (2023). Bile acids as modulators of gut microbiota composition and function. Gut Microbes 15:2172671. doi: 10.1080/19490976.2023.2172671

Leigh, R. M., Pham, A., Rao, S. S., Vora, F. M., Hou, G., Kent, C., et al. (2022). Machine learning for prediction of bronchopulmonary dysplasia-free survival among very preterm infants. BMC Pediatr. 22:542. doi: 10.1186/s12887-022-03602-w

Li, P., Luo, H., Ji, B., and Nielsen, J. (2022). Machine learning for data integration in human gut microbiome. Microb. Cell Fact. 21:241. doi: 10.1186/s12934-022-01973-4

Li, Y., Cui, Y., Wang, C., Liu, X., and Han, J. (2015). A risk factor analysis on disease severity in 47 premature infants with bronchopulmonary dysplasia. Intractable Rare Dis. Res. 4, 82–86. doi: 10.5582/irdr.2015.01000

Licciardi, A., Fiannaca, A., Rosa, M. L., Urso, M. A., and Paglia, L. L. (2024). “A deep learning multi-omics framework to combine microbiome and metabolome profiles for disease classification,” in Artificial Neural Networks and Machine Learning-ICANN 2024, volume 15023 of Lecture Notes in Computer Science, eds. M. Wand, K. Malinovská, J. Schmidhuber, and I. V. Tetko (Cham: Springer), 3–14. doi: 10.1007/978-3-031-72353-7_1

Liu, L., Liang, L., Liang, H., Wang, M., Zhou, W., Mai, G., et al. (2025). Microbiome-metabolome generated bile acids gatekeep infliximab efficacy in Crohn's disease by licensing M1 suppression and Treg dominance. J. Adv. Res. doi: 10.1016/j.jare.2025.08.017. [Epub ahead of print].

Liu, T., Yuan, Y., Wei, J., Chen, J., Zhang, F., Chen, J., et al. (2024). Association of breast milk microbiota and metabolites with neonatal jaundice. Front. Pediatr. 12:1500069. doi: 10.3389/fped.2024.1500069

Liu, X., Xu, B., Xu, X., Wang, Z., Luo, Y., Gao, Y., et al. (2023). Attenuation of allergen-specific immunotherapy for atopic dermatitis by ectopic colonization of Brevundimonas vesicularis in the intestine. Cell Rep. Med. 4:101340. doi: 10.1016/j.xcrm.2023.101340

Lozupone, C., and Knight, R. (2005). UniFrac: a new phylogenetic method for comparing microbial communities. Appl. Environ. Microbiol. 71, 8228–8235. doi: 10.1128/AEM.71.12.8228-8235.2005

Lynch, L. E., Hair, A. B., Soni, K. G., Yang, H., Gollins, L. A., Narvaez-Rivas, M., et al. (2023). Cholestasis impairs gut microbiota development and bile salt hydrolase activity in preterm neonates. Gut Microbes 15:2183690. doi: 10.1080/19490976.2023.2183690

Mai, V., Torrazza, R. M., Ukhanova, M., Wang, X., Sun, Y., Li, N., et al. (2013). Distortions in development of intestinal microbiota associated with late onset sepsis in preterm infants. PLoS ONE 8:e52876. doi: 10.1371/journal.pone.0052876

Martin, M. (2011). Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet J. 17, 10–12. doi: 10.14806/ej.17.1.200

Ning, T., Shan, X., Zhuang, X., Li, B., Zhang, Y., Chen, T., et al. (2025). Intestinal microbiota changes in early life of very preterm infants with bronchopulmonary dysplasia: a nested case-control study. Front. Microbiol. 16:1632412. doi: 10.3389/fmicb.2025.1632412

Pammi, M., Cope, J., Tarr, P. I., Warner, B. B., Morrow, A. L., Mai, V., et al. (2017). Intestinal dysbiosis in preterm infants preceding necrotizing enterocolitis: a systematic review and meta-analysis. Microbiome 5:31. doi: 10.1186/s40168-017-0248-8

Pielou, E. C. (1966). The measurement of diversity in different types of biological collections. J. Theor. Biol. 13, 131–144. doi: 10.1016/0022-5193(66)90013-0

Price, M. N., Dehal, P. S., and Arkin, A. P. (2009). FastTree: computing large minimum evolution trees with profiles instead of a distance matrix. Mol. Biol. Evol. 26, 1641–1650. doi: 10.1093/molbev/msp077

Ridlon, J. M., Harris, S. C., Bhowmik, S., Kang, D. J., and Hylemon, P. B. (2016). Consequences of bile salt biotransformations by intestinal bacteria. Gut Microbes 7, 22–39. doi: 10.1080/19490976.2015.1127483

Scaglione, V., Stefanelli, L. F., Mazzitelli, M., Cattarin, L., De Giorgi, L., Naso, E., et al. (2025). Delftia acidovorans infections in immunocompetent and immunocompromised hosts: a case report and systematic literature review. Antibiotics 14:365. doi: 10.3390/antibiotics14040365

Shannon, C. E. (1948). A mathematical theory of communication. Bell Syst. Tech. J. 27, 379–423. doi: 10.1002/j.1538-7305.1948.tb01338.x

Simpson, E. H. (1949). Measurement of diversity. Nature 163:688. doi: 10.1038/163688a0

Tao, Y., Ding, X., and Guo, W. L. (2024). Using machine-learning models to predict extubation failure in neonates with bronchopulmonary dysplasia. BMC Pulm. Med. 24:308. doi: 10.1186/s12890-024-03133-3

Thekkeveedu, R. K., Guaman, M. C., and Shivanna, B. (2017). Bronchopulmonary dysplasia: a review of pathogenesis and pathophysiology. Respir. Med. 132, 170–177. doi: 10.1016/j.rmed.2017.10.014

Tirone, C., Pezza, L., Paladini, A., Tana, M., Aurilia, C., Lio, A., et al. (2019). Gut and lung microbiota in preterm infants: immunological modulation and implication in neonatal outcomes. Front. Immunol. 10:2910. doi: 10.3389/fimmu.2019.02910

Turel, O., Kavuncuoglu, S., Hosaf, E., Ozbek, S., Aldemir, E., Uygur, T., et al. (2013). Bacteremia due to Achromobacter xylosoxidans in neonates: clinical features and outcome. Braz. J. Infect. Dis. 17, 450–454. doi: 10.1016/j.bjid.2013.01.008

Viswanathan, R., Singh, A., Mukherjee, R., Sardar, S., Dasgupta, S., Mukherjee, S., et al. (2010). Brevundimonas vesicularis: a new pathogen in newborn. J. Pediatr. Infect. Dis. 5, 189–191.

Wang, Y., Chen, W., Yu, D., Forman, B., and Huang, W. (2011). The G-protein-coupled bile acid receptor, Gpbar1 (TGR5), negatively regulates hepatic inflammatory response through antagonizing nuclear factor kappa light-chain enhancer of activated B cells (NF-κB) in mice. Hepatology 54, 1421–1432. doi: 10.1002/hep.24525

Wise, J. L., and Cummings, B. P. (2022). The 7-alpha-dehydroxylation pathway: an integral component of gut bacterial bile acid metabolism and potential therapeutic target. Front. Microbiol. 13:1093420. doi: 10.3389/fmicb.2022.1093420

Xu, Q., Yu, J., Liu, D., Tan, Q., and He, Y. (2022). The airway microbiome and metabolome in preterm infants: potential biomarkers of bronchopulmonary dysplasia. Front. Pediatr. 10:862157. doi: 10.3389/fped.2022.862157

Yao, J., Ai, T., Zhang, L., Tang, W., Chen, Z., Huang, Y., et al. (2023). Bacterial colonization in the airways and intestines of twin and singleton preterm neonates: a single-center study. Can. J. Infect. Dis. Med. Microbiol. 2023:2973605. doi: 10.1155/2023/2973605

Yoneda, K., Seki, T., Kawazoe, Y., Ohe, K., Takahashi, N., and Neonatal Research Network of Japan (2024). Immediate postnatal prediction of death or bronchopulmonary dysplasia among very preterm and very low birth weight infants based on gradient boosting decision trees algorithm: a nationwide database study in Japan. PLoS ONE 19:e0300817. doi: 10.1371/journal.pone.0300817

Zecca, E., De Luca, D., Baroni, S., Vento, G., Tiberi, E., and Romagnoli, C. (2008). Bile acid-induced lung injury in newborn infants: a bronchoalveolar lavage fluid study. Pediatrics 121, e146–e149. doi: 10.1542/peds.2007-1220

Zhang, Z., Jiang, J., Li, Z., and Wan, W. (2022). The change of cytokines and gut microbiome in preterm infants for bronchopulmonary dysplasia. Front. Microbiol. 13:804887. doi: 10.3389/fmicb.2022.804887

Zhu, Z., He, Y., Yuan, L., Chen, L., Yu, Y., Liu, L., et al. (2024). Trends in bronchopulmonary dysplasia and respiratory support among extremely preterm infants in China over a decade. Pediatr. Pulmonol. 59, 399–407. doi: 10.1002/ppul.26761

Keywords: bronchopulmonary dysplasia (BPD), preterm infants, bile acids, gut microbiota, gut-lung axis, multi-omics, machine learning, biomarkers

Citation: Yu H, Guo Y, Li J, Fu R, Zhang Y and Guo W (2025) Disruption of the gut bile acid-microbiota axis precedes severe bronchopulmonary dysplasia in preterm infants. Front. Microbiol. 16:1705965. doi: 10.3389/fmicb.2025.1705965

Received: 15 September 2025; Accepted: 28 October 2025; Published: 24 November 2025.

Edited by:

Renqiang Yu, Affiliated Women's Hospital of Jiangnan University, China

Reviewed by:

Yu Pi, Chinese Academy of Agricultural Sciences (CAAS), China
Chie Matsuguma, Yamaguchi University, Japan
Rochelle Sequeira Gomes, Nemours Children's Hospital, United States

Copyright © 2025 Yu, Guo, Li, Fu, Zhang and Guo. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Yunfeng Zhang, zhangyunf@jlu.edu.cn; Wanxu Guo, ikaiku@jlu.edu.cn

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.