- 1 The Second School of Clinical Medicine, Southern Medical University, Guangzhou, China
- 2 Institute of Pediatrics, Faculty of Pediatrics, The Seventh Medical Center of PLA General Hospital, Beijing, China
- 3 National Engineering Laboratory for Birth Defect Prevention and Control of Key Technology, Beijing, China
- 4 Beijing Key Laboratory of Pediatric Organ Failure, Beijing, China
- 5 Department of Pediatric Cardiology, Faculty of Pediatrics, The Seventh Medical Center of PLA General Hospital, Beijing, China
Bronchopulmonary dysplasia (BPD), also known as chronic lung disease, is the most common cause of respiratory morbidity in preterm infants. Sepsis plays a significant role in the pathogenesis of BPD, and the systemic inflammatory response caused by sepsis is associated with impaired lung development, leading to simplified alveoli and abnormal vascular development, which are the histological hallmarks of BPD. In this study, we conducted a retrospective analysis of the clinical characteristics of 306 preterm infants with BPD treated at our hospital from December 2019 to December 2022. We then applied ten machine learning (ML) algorithms to the clinical features to build models that predict sepsis in infants with BPD. The performance of each model was evaluated according to the mean area under the receiver operating characteristic curve (AUC), sensitivity, specificity, and accuracy. The mean AUC of the best predictive model was 0.93. A nomogram for sepsis onset was developed in the primary cohort with four factors: invasive respiratory support, CRIB II score, NEC, and chorioamnionitis. Using clinical features, ML algorithms can predict sepsis in infants with BPD, and the random forest (RF) model performed best when the models were ranked by mean AUC. Our prediction model and nomogram can help clinicians make early diagnoses and formulate better treatment plans for preterm infants with BPD.
Introduction
Bronchopulmonary dysplasia (BPD) is usually caused by mechanical ventilation and long-term use of oxygen (1). BPD, resulting from lung injury that disrupts alveolar and pulmonary vascular development, is one of the most common causes of morbidity and mortality in preterm infants. Owing to the increased survival of extremely low-gestational-age newborns, BPD remains the most common complication associated with prematurity, and its prevalence is increasing (1–3).
BPD is a multifactorial pathology influenced by a variety of prenatal and postnatal factors that affect mothers and infants (4). BPD pathology worsens and further promotes reactive oxygen species (ROS) production, and subsequent inflammation leads to sepsis (5). If children with BPD develop an infection, this may further exacerbate the inflammatory response, thereby increasing the risk of sepsis (6).
Sepsis is a clinical syndrome involving organ dysfunction caused by a disordered host response to infection (7). Preterm infant sepsis refers to an infection involving the bloodstream in infants <28 days of age (8). Sepsis is divided into early-onset sepsis (EOS) and late-onset sepsis (LOS) on the basis of age at presentation after birth. EOS refers to sepsis occurring within 72 hours (h) of birth (some experts use 7 days), and LOS is defined as sepsis occurring at or after 72 h of life (8–10). EOS is generally caused by an in utero infection or by vertical bacterial transmission from the mother during vaginal delivery, whereas LOS can arise not only from vertical bacterial transmission but also from horizontal bacterial transmission from healthcare providers and the environment (11). The clinical manifestations of neonatal sepsis range from nonspecific or vague symptoms to hemodynamic collapse. Early symptoms may include irritability, lethargy, or poor feeding. Some infants progress quickly to respiratory distress, fever, hypothermia, or hypotension, accompanied by poor perfusion and shock (12).
Preterm infants with BPD are more susceptible to developing sepsis because of chronic lung and airway damage. Predicting the risk of sepsis in infants with BPD is crucial for early intervention and improving patient prognosis. A few studies have reported the establishment of predictive models in adults (13–15). However, in neonatal medicine, statistical or machine learning (ML) models for predicting which patients may develop BPD with sepsis are relatively rare. Therefore, such models need to be established to help doctors identify risks early, thereby allowing them to take preventive measures or start early treatment.
In this study, we conducted a retrospective analysis of the clinical characteristics of preterm infants with BPD at our hospital from December 2019 to December 2022. We subsequently used ML algorithms to identify high-risk factors for the co-occurrence of BPD and sepsis.
Methods
Study design and study participants
This study was approved by the Ethics Committee of the Seventh Medical Center of the PLA General Hospital (S2024-046-01) and was conducted in accordance with the Declaration of Helsinki. The studies were conducted in accordance with local legislation and institutional requirements. Written informed consent for participation was not required from the participants or their legal guardians/next of kin because of the retrospective design of the study.
The study population included very and extremely preterm infants with BPD who were born at or admitted to the Seventh Medical Center of PLA General Hospital (Beijing, China). Trained neonatologists at our center identified the patients. The following criteria were applied to construct the initial dataset:
The inclusion criteria were as follows: very preterm infants (gestational age ≤32 weeks, >28 weeks) and extremely preterm infants (gestational age ≤28 weeks) who met the diagnostic criteria for BPD.
The exclusion criteria were as follows: very preterm and extremely preterm infants who died, were withdrawn from treatment, or were discharged within 28 days after birth; patients with genetic metabolic diseases; and infants without BPD.
The patients were divided into BPD with sepsis and BPD without sepsis groups on the basis of Doppler echocardiography results after at least 36 weeks of corrected gestation.
Clinical feature collection
Clinical data were collected from the electronic medical records of the Seventh Medical Center of the PLA General Hospital, including maternal pregnancy factors [amniotic fluid disorders, hypertension during pregnancy, gestational diabetes mellitus (GDM), preeclampsia, placental abnormality, delivery method, placental pathology], newborn clinical data [gestational age (GA), birth weight (BWt), sex, multifetal gestation, 1-min Apgar score (1APGAR), 5-min Apgar score (5APGAR), 10-min Apgar score (10APGAR), severity of BPD, pulmonary hypertension (PH) (early-PH and BPD-related PH), patent ductus arteriosus (PDA), severe PDA (defined as requiring surgical ligation), intraventricular hemorrhage (IVH), necrotizing enterocolitis (NEC), retinopathy of prematurity (ROP), clinical risk index for babies II (CRIB II) score, length of hospital stay (days), invasive respiratory support (days), etc.], related laboratory tests [platelet count (PLT), C-reactive protein (CRP), white blood cell count (WBC), neutrophil count (N), hemoglobin (HGB)], signs and symptoms of infection episodes such as difficulty breathing, fever (>37.5°C), hypothermia (<36.5°C), abdominal distension, and feeding intolerance, and microbiological characteristics. The diagnosis of chorioamnionitis was confirmed via placental histopathology. Echocardiograms at 4–7 days of age were used to assess “early-PH” (16, 17). Other diagnoses were based on the relevant published criteria (18, 19).
Statistical methods
To filter for missing data, the missing data module in Python 3.9.12 was used. In Supplementary Figure S1, each column represents a clinical variable, and white lines represent missing data. The denser the lines in a column are, the greater the number of missing values for that variable. Detailed information regarding missing values is provided in Supplementary Table S1. We removed the antenatal corticosteroid (ANC) variable, which was missing in >25% of the observations, to preserve the accuracy of the analysis.
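The variable-level filter described above can be sketched in a few lines. This is a minimal, dependency-free illustration and not the module the study used: the function names, the toy table, and the use of `None` to mark a missing value are all assumptions made for the example; only the 25% cutoff comes from the text.

```python
from typing import Dict, List, Optional

Column = List[Optional[float]]


def missing_fraction(column: Column) -> float:
    """Return the fraction of entries in a column that are missing (None)."""
    if not column:
        return 0.0
    return sum(value is None for value in column) / len(column)


def drop_sparse_variables(table: Dict[str, Column],
                          max_missing: float = 0.25) -> Dict[str, Column]:
    """Keep variables whose missing fraction does not exceed max_missing.

    Variables missing in more than 25% of observations are removed,
    mirroring the cutoff stated in the text.
    """
    return {name: col for name, col in table.items()
            if missing_fraction(col) <= max_missing}


# Hypothetical toy table (values are invented for illustration only).
toy = {
    "GA": [27.0, 29.5, 31.0, 26.0],        # no missing values: kept
    "BWt": [900.0, None, 1500.0, 800.0],   # 25% missing: kept
    "ANC": [1.0, None, None, 0.0],         # 50% missing: dropped
}
kept = drop_sparse_variables(toy)
print(sorted(kept))  # ['BWt', 'GA']
```

A threshold on the per-variable missing fraction removes a sparse predictor entirely; rows with remaining gaps still have to be handled separately, for example by complete-case analysis.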
Continuous data are presented as the mean ± standard deviation (SD) or median (interquartile range, Q1, Q3). Intergroup comparisons of normally distributed continuous data were made via two-sample t tests, with the test value being the t value. Intergroup comparisons of nonnormally distributed continuous data were made via the Mann‒Whitney U test, with the test value being the Z value. Categorical data are presented as the number of cases and percentage (%), and intergroup comparisons were made via the chi-square test, with the test value being χ2. To identify the predictors of sepsis, the variance inflation factor (VIF) was first used to test all predictors for multicollinearity, after which the predictors were entered into a multivariate logistic regression analysis. The odds ratios (ORs) and 95% confidence intervals (CIs) for independent risk factors for sepsis were estimated via a stepwise selection method. A two-tailed P value less than 0.05 indicated a statistically significant difference. Statistical analyses were performed via SPSS version 27.0.
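For readers who want to see how a χ2 test value for a categorical comparison is obtained, the sketch below computes the Pearson statistic for a 2×2 table. It is an illustration under stated assumptions: the study ran its tests in SPSS, the function name and the counts are hypothetical, and no continuity correction is applied.

```python
import math
from typing import Tuple


def chi_square_2x2(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """Pearson chi-square test for the 2x2 table [[a, b], [c, d]].

    Returns (chi2, p) with 1 degree of freedom and no continuity
    correction. For 1 degree of freedom the upper-tail probability of the
    chi-square distribution equals erfc(sqrt(chi2 / 2)).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column must contain at least one case")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: rows are sepsis / no sepsis, columns are
# complication present / absent. These are not data from the study.
chi2, p_value = chi_square_2x2(30, 70, 10, 90)
print(chi2)            # 12.5
print(p_value < 0.05)  # True
```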
Binary classification was performed using scikit-learn (version 0.24.1) in Python (version 3.9.12). Ten ML algorithms were used to differentiate between septic and nonseptic BPD patients: logistic regression (LR), random forest (RF), support vector machine (SVM), decision tree (DTREE), AdaBoost (ADB), Gaussian naive Bayes (NB), linear discriminant analysis (LDA), k-nearest neighbors (KNN), gradient boosting classifier (GB), and multilayer perceptron (MLP). Considering the limited number of data samples, each classification model was built on all the data with the default parameters of the scikit-learn library. The synthetic minority oversampling technique (SMOTE) was used to expand the number of samples in the minority class and ensure equal representation of the two groups. Internal validation was performed with 1,000 bootstrap resamples. The performance of each algorithm in distinguishing septic from nonseptic BPD patients was assessed on the basis of the mean sensitivity, specificity, area under the receiver operating characteristic (ROC) curve, and F1 score across the 1,000 resamples. ROC curves were plotted via the matplotlib library (version 3.3.4) in Python as part of the internal validation process. The contribution (magnitude and direction) of each feature to the classifier output was determined via SHapley Additive exPlanations (SHAP). The SHAP values were calculated via the RF model. SHAP summary plots were visualized in Python via the shap library (version 0.39.0).
Nomograms (rms package) were drawn using the selected risk factors. The concordance statistic (C statistic) and calibration curve (rms package) were used to assess the discrimination and calibration of the nomogram. Decision curve analysis (DCA) and a clinical impact curve (CIC) (ggDCA package) were used to evaluate the clinical utility and applicability of the risk prediction nomogram (20–22). Statistical analyses were performed via version 4.5.0 of the R statistical software.
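The idea behind SMOTE can be illustrated with a short, dependency-free sketch: each synthetic minority sample is placed at a random point on the segment between a minority sample and one of its nearest minority neighbors. The text does not state which SMOTE implementation was used, so the function names, the parameter defaults, and the toy feature vectors below are assumptions for illustration only.

```python
import math
import random
from typing import List, Sequence

Vector = Sequence[float]


def k_nearest(index: int, points: List[Vector], k: int) -> List[int]:
    """Indices of the k points closest (Euclidean) to points[index], excluding itself."""
    others = [j for j in range(len(points)) if j != index]
    others.sort(key=lambda j: math.dist(points[index], points[j]))
    return others[:k]


def smote(minority: List[Vector], n_new: int, k: int = 5,
          seed: int = 0) -> List[List[float]]:
    """Generate n_new synthetic minority samples by neighbor interpolation."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbors than other samples
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))           # pick a minority sample
        j = rng.choice(k_nearest(i, minority, k))  # pick one of its neighbors
        gap = rng.random()                         # position along the segment
        synthetic.append([a + gap * (b - a)
                          for a, b in zip(minority[i], minority[j])])
    return synthetic


# Hypothetical two-feature minority class and a majority class of 6 samples.
minority = [[1.0, 2.0], [2.0, 3.0], [3.0, 1.0]]
n_majority = 6
new_points = smote(minority, n_new=n_majority - len(minority), k=2, seed=42)
print(len(minority) + len(new_points))  # 6
```

Because synthetic points are interpolations of real minority samples, oversampling should be applied only to the data used for fitting; applying it before resampling-based validation lets near-copies of the same infant appear on both sides of the split.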
Results
Demographic and clinical features of sepsis BPD patients and nonsepsis BPD patients
A total of 306 patients with BPD were enrolled in this study, including 177 males (57.8%) and 129 females (42.2%). The demographic and clinical features of the patients are summarized in Table 1.

Table 1. Clinical characteristics of infants with BPD with sepsis vs. those without sepsis and EOS vs. LOS in infants with BPD with sepsis.
Our analysis revealed that the lower the birth weight was, the greater the chance of developing sepsis (Figure 1A). The lower the gestational age was, the greater the incidence of sepsis, especially in preterm infants born at ≤27 weeks (Figure 1B).

Figure 1. Distribution of birth weight (A) and gestational age (B) in sepsis and nonsepsis BPD patients. BPD, bronchopulmonary dysplasia.
The features of the patients with BPD are described in Table 1. A comparison of the two groups revealed that several characteristics, such as GA, BWt, delivery mode, length of hospital stay, duration of invasive mechanical ventilation, duration of oxygen therapy, CRIB II score, and 5APGAR, were significantly different (p < 0.05). These results suggest that the lower the gestational age and birth weight, the greater the proportion of infants with sepsis, the longer the hospital stay, and the longer the duration of invasive mechanical ventilation. Sex, singleton pregnancy, duration of noninvasive ventilation, and 1APGAR and 10APGAR scores were not significantly different between the two groups.
Maternal and postnatal complications, including chorioamnionitis, varying degrees of ROP, NEC, early postnatal pulmonary hypertension (early PH), severity grading of BPD, and BPD-PH, differed significantly between sepsis and nonsepsis BPD patients (p < 0.05; Table 1).
Machine learning algorithms and the development and evaluation of a nomogram for BPD patients with sepsis
In previous clinical practice, BWt, GA, invasive respiratory support, CRIB II, 5APGAR, NEC, early PH, and chorioamnionitis were found to be independent risk factors for sepsis in preterm infants with BPD. Furthermore, collinearity diagnostic analysis demonstrated that the VIFs of these risk factors, except BWt and GA, were less than 4, indicating that there was no strong multicollinearity among the remaining variables. After cases with missing data were removed, 284 cases were included in model construction. Four variables (invasive respiratory support, CRIB II, NEC, and chorioamnionitis) were incorporated into the final predictive model on the basis of the stepwise regression results. Multivariate analysis revealed that invasive respiratory support (OR, 1.02; 95% CI, 1.00–1.04; p < 0.05), CRIB II (OR, 1.28; 95% CI, 1.12–1.46; p < 0.05), NEC (OR, 3.10; 95% CI, 1.23–7.82; p < 0.05), and chorioamnionitis (OR, 10.40; 95% CI, 2.85–38.02; p < 0.05) independently increased the risk for the development of sepsis in BPD infants (Table 2).
We constructed 10 ML models and compared their performance. In the predictive models built with the above four factors, the mean area under the ROC curve of each model reached or approached 0.8. Among them, the RF model emerged as the most effective predictor, with a mean area under the ROC curve of 0.93 (Figure 2A). The mean F1 of 4 of the 10 models reached 0.8, with the RF model having the best predictive performance, with a mean F1 of 0.87 (Table 3).
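The reported ORs and confidence intervals can be related back to the underlying logistic regression coefficients. The sketch below assumes Wald-type intervals, that is, CI = exp(β ± 1.96·SE), which is an assumption about how the intervals were computed and is not stated in the text; the function name is hypothetical. Under that assumption the geometric mean of the CI bounds should reproduce the OR, which holds for the three factors shown. Invasive respiratory support is omitted because its interval (1.00–1.04) is rounded too coarsely for a stable back-calculation.

```python
import math
from typing import Tuple

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal distribution


def wald_from_or(or_point: float, ci_low: float,
                 ci_high: float) -> Tuple[float, float, float, float]:
    """Recover (beta, SE, z, two-sided p) from an OR and its 95% Wald CI."""
    beta = math.log(or_point)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * Z_95)
    z = beta / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, z, p_value


# ORs and 95% CIs as reported in the text above.
reported = {
    "CRIB II": (1.28, 1.12, 1.46),
    "NEC": (3.10, 1.23, 7.82),
    "chorioamnionitis": (10.40, 2.85, 38.02),
}

results = {}
for name, (or_point, low, high) in reported.items():
    results[name] = wald_from_or(or_point, low, high)
    beta, se, z, p_value = results[name]
    print(f"{name}: beta={beta:.3f}, SE={se:.3f}, z={z:.2f}, p={p_value:.4f}")
```

Because the inputs are rounded to two decimals, the recovered values are approximate and serve only as a consistency check on the reported table.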
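The mean AUC over bootstrap resamples can be illustrated as follows. This is a simplified sketch, not the study's pipeline: it resamples a fixed set of predicted scores rather than refitting each model on every resample, and the labels, scores, and function names are hypothetical.

```python
import random
from typing import List, Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a positive case outranks a negative one.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p_score > n_score) + 0.5 * (p_score == n_score)
               for p_score in pos for n_score in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_mean_auc(labels: Sequence[int], scores: Sequence[float],
                       n_boot: int = 1000, seed: int = 0) -> float:
    """Mean AUC over n_boot resamples drawn with replacement."""
    rng = random.Random(seed)
    n = len(labels)
    values: List[float] = []
    while len(values) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        y = [labels[i] for i in idx]
        if 0 < sum(y) < n:  # skip resamples that contain a single class
            values.append(auc(y, [scores[i] for i in idx]))
    return sum(values) / len(values)


# Hypothetical labels (1 = sepsis) and predicted probabilities.
labels = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2, 0.7, 0.6]
apparent_auc = auc(labels, scores)
mean_auc = bootstrap_mean_auc(labels, scores, n_boot=1000, seed=1)
print(apparent_auc)  # 0.875
```

The apparent AUC is computed once on the original sample, while the bootstrap mean summarizes how the estimate varies when the sample is perturbed.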

Figure 2. ML and prediction models for BPD patients with sepsis. (A) ROC curves after internal validation via bootstrap resampling (1,000 times) of 10 machine learning models. The shading represents the mean AUC of the bootstrap samples, and the line represents the apparent AUC. (B) SHAP heatmap generated via the random forest model. (C) A nomogram was used to predict sepsis in infants with BPD. A binary logistic regression algorithm was used to establish the nomogram. The final score was calculated as the sum of the individual scores for each of the four variables included in the nomogram. (D) Calibration curve of the regression model. The X-axis represents the overall predicted probability of sepsis in infants with BPD, and the Y-axis represents the actual probability. Model calibration is indicated by the degree of fitting of the curve and the diagonal. (E) DCA curve of the logistic regression model. The horizontal axis represents the threshold probability, and the vertical axis represents the net benefit. The lines “None” and “All” represent two extreme situations. “None” indicates that all patients have a negative outcome and no intervention has been performed, so the net benefit is 0. “All” indicates that all patients have a positive outcome and all have received intervention, so the net benefit is a diagonal line with a negative slope. In this analysis, the decision curve provided a larger net benefit across the threshold range of 0.2–0.8. (F) Clinical impact curve of the logistic regression model. LR, logistic regression; RF, random forest; SVM, support vector machine; DTREE, decision tree; ADB, AdaBoost; NB, Gaussian naive Bayes; LDA, linear discriminant analysis; KNN, k-nearest neighbors; GB, gradient boosting classifier; MLP, multilayer perceptron.
The RF model, which had the best predictive performance, was used to show the importance of the individual features. The CRIB II score ranked first, followed by the duration of invasive mechanical ventilation and NEC. These findings indicated that a higher CRIB II score and a longer duration of invasive mechanical ventilation were associated with greater risk (Figure 2B). In addition, the coexistence of chorioamnionitis and NEC is a major risk factor for sepsis. These findings suggest that actively preventing prenatal infection such as chorioamnionitis before birth and NEC after birth may help prevent the occurrence of sepsis.
We constructed a nomogram based on the logistic regression model, which allowed for a more intuitive prediction of the risk of sepsis (Figure 2C). The nomogram for sepsis onset was developed in the primary cohort with four factors: invasive respiratory support, CRIB II score, NEC, and chorioamnionitis. These factors were screened via logistic regression analysis. The C-statistic of the nomogram was 0.79. The Brier score was 0.164, which is smaller than 0.25, the value obtained by an uninformative model that predicts a probability of 0.5 for every infant.
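The Brier score is the mean squared difference between the predicted probability and the observed outcome, so 0 is a perfect score and 0.25 is what a constant prediction of 0.5 yields. A minimal sketch with hypothetical labels and probabilities (not study data) makes the benchmark concrete.

```python
from typing import Sequence


def brier_score(labels: Sequence[int], probs: Sequence[float]) -> float:
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    if len(labels) != len(probs) or not labels:
        raise ValueError("labels and probs must be non-empty and equally long")
    return sum((p - y) ** 2 for y, p in zip(labels, probs)) / len(labels)


# Hypothetical outcomes (1 = sepsis) and predicted risks.
labels = [0, 1, 0, 1]
uninformative = brier_score(labels, [0.5, 0.5, 0.5, 0.5])
perfect = brier_score(labels, [0.0, 1.0, 0.0, 1.0])
print(uninformative)  # 0.25
print(perfect)        # 0.0
```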
The calibration curve of the constructed regression model generated via the bootstrap method (with 1,000 repeated samples) indicated an average absolute error of 0.03, suggesting that the predicted risk of sepsis was quite accurate after calibration and that the model did not overfit (Figure 2D).
According to the DCA curve of the prediction model and the verified DCA curve (Figure 2E), the net benefit was above 0 over a wide range of decision thresholds (0.2–0.8), and the model curve lay well away from the two extreme curves of “None” and “All”. The CIC of the prediction model (Figure 2F) shows that the results predicted by the model are close to the actual results over a wide range of risk thresholds (0.5–1.0), which indicates practical application value in a wide range of clinical situations.
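The net benefit plotted in a decision curve follows the standard definition NB = TP/n − (FP/n) × pt/(1 − pt), where pt is the threshold probability. The sketch below implements that formula together with the “All” reference strategy; the study used the ggDCA package in R, so the function names and the toy data here are hypothetical.

```python
from typing import Sequence


def net_benefit(labels: Sequence[int], probs: Sequence[float],
                threshold: float) -> float:
    """Net benefit of intervening on every patient with risk >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    tp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1.0 - threshold)


def net_benefit_treat_all(labels: Sequence[int], threshold: float) -> float:
    """Net benefit of the 'All' strategy: intervene on every patient."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical outcomes (1 = sepsis) and predicted risks.
labels = [1, 0, 1, 0]
probs = [0.9, 0.1, 0.7, 0.4]
nb_model = net_benefit(labels, probs, threshold=0.5)
nb_all = net_benefit_treat_all(labels, threshold=0.5)
print(nb_model)  # 0.5
print(nb_all)    # 0.0
```

The “None” strategy always has a net benefit of 0, so a model is useful at a given threshold only when its net benefit exceeds both 0 and the “All” value.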
Clinical characteristics of BPD patients with EOS and LOS
Next, we analyzed the clinical information regarding birth details, postnatal complications, and symptoms from the BPD complicated with EOS group and BPD complicated with LOS group and found significant differences in the LOS group in terms of biochemical indicators (platelet count, peak values of CRP), symptoms during infection (fever, feeding intolerance, and abdominal distension), and postnatal complications (PDA) (p < 0.05). These findings suggest that infants with BPD and LOS are more likely to have abnormalities in the above indicators and clinical symptoms. Other factors, such as GA, BWt, the Apgar score, sex, the CRIB II score, the total SOFA score, the WBC count, HGB, postnatal complications such as NEC and early postnatal PH, and symptoms during infection, such as oxygen desaturation and bradycardia, were not significantly different between the two groups. However, in both groups of patients with sepsis, the percentage of patients with oxygen desaturation reached over 95%, whereas respiratory arrest and bradycardia were more common in patients with LOS, indicating that these patients require close attention from physicians (Table 1).
The analysis of biochemical indicators from the groups with BPD complicated by EOS and LOS revealed that the peak values of CRP were relatively high in the group with BPD complicated by LOS. In contrast, in the group with BPD complicated by EOS, the proportion of infants with abnormal white blood cell counts was relatively high (Table 4).

Table 4. Laboratory characteristics and clinical signs of EOS vs. LOS in infants with BPD with sepsis.
The etiological characteristic analysis of blood cultures from the groups with BPD complicated by EOS and BPD complicated by LOS revealed that in the group with BPD complicated by LOS, gram-positive bacteria accounted for 37.8%, with Staphylococcus epidermidis and Enterococcus faecalis being the main bacteria, each accounting for 16.2%, followed by Streptococcus hemolyticus and Staphylococcus aureus. Clinically suspected fungal infections accounted for 23.4%, and Candida spp. accounted for 3.6%. A greater proportion of infants with LOS had coinfections with more than one pathogen. In contrast, the clinically suspected rate was higher in patients with EOS (64.4%), indicating that the positive rate of pathogen detection is lower in patients with EOS and that fungal infections are rare (Table 5).

Table 5. Microbiological characteristics of infants and manifestations of EOS and LOS episodes [results are reported as N(%) unless otherwise specified].
Discussion
BPD is a common complication in premature infants and a chronic lung disease in neonates. A recent study revealed that in the United States, the incidence of chronic lung disease in extremely preterm infants born between 24 and 28 weeks of gestation has been increasing since 2012 (23). In the past decade, the survival rate of extremely preterm infants in China has improved significantly. However, the incidence of BPD has not decreased significantly, reaching 40.7% (24). In addition to chronic lung disease, the incidence of long-term complications in BPD patients, such as those involving the cardiovascular, nervous, digestive, and endocrine systems, has also increased (24).
Neonatal sepsis, especially early-onset sepsis (EOS), often presents with subtle and nonspecific symptoms, making prompt diagnosis challenging. ML models can identify high-risk neonates before symptoms become apparent. This early detection enables timely intervention, which is crucial for reducing morbidity and mortality associated with sepsis (25). ML models can integrate various clinical data points, such as vital signs and laboratory indicators, to identify risk factors more accurately. This approach significantly improves diagnostic sensitivity and overall accuracy. In addition, ML models can help allocate medical resources more efficiently by rapidly screening high-risk neonates, ensuring that critical interventions are directed toward those who need them most, enhancing the overall efficiency of healthcare delivery (26).
Inflammation/infection, including chorioamnionitis in utero and postnatal systemic infectious inflammation, has been shown to be a risk factor for neonatal sepsis (27, 28). Sepsis-induced systemic inflammation is a cause of neonatal mortality (29, 30).
In this study, we conducted a retrospective analysis of the clinical characteristics of preterm infants with BPD from December 2019 to December 2022 and used ML methods to rank the importance of various features and fit a predictive model. All risk factors were ranked in terms of their importance in deciding whether to include or reject them. The most important factor was the CRIB II score, followed by invasive mechanical ventilation, intrauterine infection (chorioamnionitis), and the subsequent development of acute NEC. This finding is consistent with the results of a previous questionnaire survey study (31). Early chorioamnionitis was suspected to be the main cause of EOS triggered by intrauterine infection, whereas NEC arising in the later stages of enteral feeding was the main cause of LOS. In terms of the predictive performance of the ten models adopted, the mean area under the ROC curve of all the models approached or reached 0.8. The RF model showed the best predictive performance, with a mean AUC of 0.93. The clinical and practical value of the logistic regression model was supported by the DCA and CIC curves. Moreover, the nomogram constructed from these findings can intuitively predict the risk of sepsis, and when combined with the results of etiological analysis, it can further guide clinicians in the diagnosis, treatment, and rational use of antimicrobial drugs. In the etiological analysis of blood cultures for EOS and LOS in preterm infants with BPD in our hospital, gram-positive bacteria were commonly observed in LOS, and clinically suspected fungal infections, such as Candida albicans and Candida parapsilosis, were frequently observed. In this study, the changes in the platelet count and CRP level were the most significant for LOS, whereas the change in the WBC count was more pronounced in EOS.
The platelet count has good sensitivity and specificity for diagnosing neonatal sepsis and can be used as a diagnostic tool for neonatal sepsis (32, 33). In addition, the PLT can serve as a diagnostic indicator of late-onset neonatal pneumonia (34). However, because CRP is nonspecific for sepsis and has poor sensitivity, no single indicator is accurate enough to diagnose sepsis on its own. In clinical practice, a comprehensive assessment is needed that combines clinical symptoms, blood cell counts, and other indicators.
The advent of big data and the artificial intelligence era has also driven medical progress. In recent years, the combination of ML with medical data has not only aided scientific research output but has also been continuously applied in clinical work. Statistical models are a powerful way to analyze and predict outcomes (35). A recent study revealed that ML models constructed from the vital signs of newborns within 24 h before infection can also predict sepsis quite well, with the area under the ROC curve reaching 0.82 (36).
In summary, our study indicated that sepsis is a risk factor for BPD. The ML algorithm suggests that the CRIB II score, duration of invasive mechanical ventilation, chorioamnionitis, and neonatal necrotizing enterocolitis are high-risk factors for the co-occurrence of BPD and sepsis. By combining the nomogram with the etiological characteristics, the risk of sepsis can be calculated, which may further reduce the exposure to and duration of antibiotic use in preterm infants and can help guide clinical diagnosis and treatment.
Limitations
Our study was a retrospective case analysis study from a single center with a relatively small number of case samples. The etiological characteristics of the samples included only peripheral blood and did not include samples from other sources, such as oropharyngeal or tracheobronchial secretions or sputum. Further studies are needed to expand the sample size and to conduct prospective multicenter cohort studies. Additionally, in this study, the subjects were preterm infants, making it difficult to collect sufficient blood samples for the detection of inflammation-related indicators. In the future, given the availability of technologies that can measure these indicators, we will also incorporate these indicators into our analysis. Although ML models have shown promise in predicting sepsis, their generalizability and clinical interpretability still require further research and validation. Moreover, to increase predictive accuracy, future studies may need to incorporate more patient data and vital sign indicators as well as further refine the algorithms and structure of the models.
Data availability statement
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.
Ethics statement
The studies involving humans were approved by the Ethics Committee of the Seventh Medical Center of the PLA General Hospital (S2024-046-01) and were conducted in accordance with the Declaration of Helsinki, local legislation, and institutional requirements. The ethics committee/institutional review board waived the requirement of written informed consent for participation from the participants or the participants’ legal guardians/next of kin because of the retrospective design of the study.
Author contributions
YaW: Data curation, Formal analysis, Investigation, Writing – original draft. YiW: Investigation, Project administration, Software, Validation, Writing – original draft. LS: Conceptualization, Formal analysis, Methodology, Resources, Writing – original draft. JL: Conceptualization, Data curation, Formal analysis, Methodology, Writing – original draft. YX: Formal analysis, Methodology, Project administration, Software, Writing – original draft. LY: Formal analysis, Investigation, Methodology, Project administration, Writing – original draft. SH: Conceptualization, Investigation, Methodology, Project administration, Validation, Writing – original draft, Writing – review & editing. FZ: Conceptualization, Formal analysis, Methodology, Resources, Supervision, Validation, Writing – original draft.
Funding
The author(s) declare that financial support was received for the research and/or publication of this article. This study was supported by the National Natural Science Foundation of China (grant number: 82341095) and the Beijing E-Town Cooperation and Development Foundation (grant number: YCXJ-JZ-2023-017).
Acknowledgments
We wish to thank all the parents and infants who supported our research and pushed us to do our best.
Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Generative AI statement
The author(s) declare that no Generative AI was used in the creation of this manuscript.
Publisher's note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Supplementary material
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fped.2025.1566747/full#supplementary-material
References
1. Thébaud B, Goss KN, Laughon M, Whitsett JA, Abman SH, Steinhorn RH, et al. Bronchopulmonary dysplasia. Nat Rev Dis Primers. (2019) 5:78. doi: 10.1038/s41572-019-0127-7
2. Stoll BJ, Hansen NI, Bell EF, Shankaran S, Laptook AR, Walsh MC, et al. Neonatal outcomes of extremely preterm infants from the NICHD Neonatal Research Network. Pediatrics. (2010) 126:443–56. doi: 10.1542/peds.2009-2959
3. Wadhawan R, Vohr BR, Fanaroff AA, Perritt RL, Duara S, Stoll BJ, et al. Does labor influence neonatal and neurodevelopmental outcomes of extremely-low-birth-weight infants who are born by cesarean delivery? Am J Obstet Gynecol. (2003) 189:501–6. doi: 10.1067/S0002-9378(03)00360-0
4. Balany J, Bhandari V. Understanding the impact of infection, inflammation, and their persistence in the pathogenesis of bronchopulmonary dysplasia. Front Med (Lausanne). (2015) 2:90. doi: 10.3389/fmed.2015.00090
5. Choi Y, Rekers L, Dong Y, Holzfurtner L, Goetz MJ, Shahzad T, et al. Oxygen toxicity to the immature lung-part I: pathomechanistic understanding and preclinical perspectives. Int J Mol Sci. (2021) 22:11006. doi: 10.3390/ijms222011006
6. Baud O, Maury L, Lebail F, Ramful D, Moussawi FE, Nicaise C, et al. Effect of early low-dose hydrocortisone on survival without bronchopulmonary dysplasia in extremely preterm infants (PREMILOC): a double-blind, placebo-controlled, multicentre, randomised trial. Lancet. (2016) 387:1827–36. doi: 10.1016/S0140-6736(16)00202-6
7. van der Poll T, Shankar-Hari M, Wiersinga WJ. The immunology of sepsis. Immunity. (2021) 54:2450–64. doi: 10.1016/j.immuni.2021.10.012
9. Hornik CP, Fort P, Clark RH, Watt K, Benjamin DK Jr., Smith PB, et al. Early and late onset sepsis in very-low-birth-weight infants from a large group of neonatal intensive care units. Early Hum Dev. (2012) 88(Suppl 2):S69–74. doi: 10.1016/S0378-3782(12)70019-1
10. Stoll BJ, Gordon T, Korones SB, Shankaran S, Tyson JE, Bauer CR, et al. Late-onset sepsis in very low birth weight neonates: a report from the National Institute of Child Health and Human Development Neonatal Research Network. J Pediatr. (1996) 129:63–71. doi: 10.1016/S0022-3476(96)70191-9
11. Song W, Jung SY, Baek H, Choi CW, Jung YH, Yoo S. A predictive model based on machine learning for the early detection of late-onset neonatal sepsis: development and observational study. JMIR Med Inform. (2020) 8:e15965. doi: 10.2196/15965
12. Celik IH, Hanna M, Canpolat FE, Mohan P. Diagnosis of neonatal sepsis: the past, present and future. Pediatr Res. (2022) 91:337–50. doi: 10.1038/s41390-021-01696-z
13. Yang Z, Cui X, Song Z. Predicting sepsis onset in ICU using machine learning models: a systematic review and meta-analysis. BMC Infect Dis. (2023) 23:635. doi: 10.1186/s12879-023-08614-0
14. Fleuren LM, Klausch TLT, Zwager CL, Schoonmade LJ, Guo T, Roggeveen LF, et al. Machine learning for the prediction of sepsis: a systematic review and meta-analysis of diagnostic test accuracy. Intensive Care Med. (2020) 46:383–400. doi: 10.1007/s00134-019-05872-y
15. Wang D, Li J, Sun Y, Ding X, Zhang X, Liu S, et al. A machine learning model for accurate prediction of sepsis in ICU patients. Front Public Health. (2021) 9:754348. doi: 10.3389/fpubh.2021.754348
16. Mourani PM, Sontag MK, Younoszai A, Miller JI, Kinsella JP, Baker CD, et al. Early pulmonary vascular disease in preterm infants at risk for bronchopulmonary dysplasia. Am J Respir Crit Care Med. (2015) 191:87–95. doi: 10.1164/rccm.201409-1594OC
17. Kim HH, Sung SI, Yang MS, Han YS, Kim HS, Ahn SY, et al. Early pulmonary hypertension is a risk factor for bronchopulmonary dysplasia-associated late pulmonary hypertension in extremely preterm infants. Sci Rep. (2021) 11:11206. doi: 10.1038/s41598-021-90769-4
18. Xie X, Kong B, Duan T. Obstetrics and Gynecology. 9th Edition. Beijing: People’s Medical Publishing House (2018).
20. Vickers AJ, Elkin EB. Decision curve analysis: a novel method for evaluating prediction models. Med Decis Making. (2006) 26:565–74. doi: 10.1177/0272989X06295361
21. Rousson V, Zumbrunn T. Decision curve analysis revisited: overall net benefit, relationships to ROC curve analysis, and application to case-control studies. BMC Med Inform Decis Mak. (2011) 11:45. doi: 10.1186/1472-6947-11-45
22. Kerr KF, Brown MD, Zhu K, Janes H. Assessing the clinical impact of risk prediction models with decision curves: guidance for correct interpretation and appropriate use. J Clin Oncol. (2016) 34:2534–40. doi: 10.1200/JCO.2015.65.5654
23. Horbar JD, Greenberg LT, Buzas JS, Ehret DEY, Soll RF, Edwards EM. Trends in mortality and morbidities for infants born 24 to 28 weeks in the US: 1997–2021. Pediatrics. (2024) 153:e2023064153. doi: 10.1542/peds.2023-064153
24. Zhu Z, He Y, Yuan L, Chen L, Yu Y, Liu L, et al. Trends in bronchopulmonary dysplasia and respiratory support among extremely preterm infants in China over a decade. Pediatr Pulmonol. (2024) 59:399–407. doi: 10.1002/ppul.26761
25. Le S, Hoffman J, Barton C, Fitzgerald JC, Allen A, Pellegrini E, et al. Pediatric severe sepsis prediction using machine learning. Front Pediatr. (2019) 7:413. doi: 10.3389/fped.2019.00413
26. Sahu P, Raj Stanly EA, Simon Lewis LE, Prabhu K, Rao M, Kunhikatta V. Prediction modelling in the early detection of neonatal sepsis. World J Pediatr. (2022) 18:160–75. doi: 10.1007/s12519-021-00505-1
27. Thomas W, Speer CP. Chorioamnionitis: important risk factor or innocent bystander for neonatal outcome? Neonatology. (2011) 99:177–87. doi: 10.1159/000320170
28. Afonso E, Smets K, Deschepper M, Verstraete E, Blot S. The effect of late-onset sepsis on mortality across different gestational ages in a neonatal intensive care unit: a historical study. Intensive Crit Care Nurs. (2023) 77:103421. doi: 10.1016/j.iccn.2023.103421
29. Salimi U, Dummula K, Tucker MH, Dela Cruz CS, Sampath V. Postnatal sepsis and bronchopulmonary dysplasia in premature infants: mechanistic insights into “New BPD”. Am J Respir Cell Mol Biol. (2022) 66:137–45. doi: 10.1165/rcmb.2021-0353PS
30. Spasojević I, Obradović B, Spasić S. Bench-to-bedside review: neonatal sepsis-redox processes in pathogenesis. Crit Care. (2012) 16:221. doi: 10.1186/cc11183
31. Moftian N, Samad Soltani T, Mirnia K, Esfandiari A, Tabib MS, Rezaei Hachesu P. Clinical risk factors for early-onset sepsis in neonates: an international Delphi study. Iran J Med Sci. (2023) 48:57–69. doi: 10.30476/IJMS.2022.92284.2352
32. Worku M, Aynalem M, Biset S, Woldu B, Adane T, Tigabu A. Role of complete blood cell count parameters in the diagnosis of neonatal sepsis. BMC Pediatr. (2022) 22:411. doi: 10.1186/s12887-022-03471-3
33. Yochpaz S, Friedman N, Zirkin S, Blumovich A, Mandel D, Marom R. C-reactive protein in early-onset neonatal sepsis—a cutoff point for CRP value as a predictor of early-onset neonatal sepsis in term and late preterm infants early after birth? J Matern Fetal Neonatal Med. (2022) 35:4552–7. doi: 10.1080/14767058.2020.1856068
34. Metwali WA, Elmashad AM, Hazzaa SME, Al-Beltagi M, Hamza MB. Salivary C-reactive protein and mean platelet volume as possible diagnostic markers for late-onset neonatal pneumonia. World J Clin Pediatr. (2024) 13:88645. doi: 10.5409/wjcp.v13.i1.88645
35. May M. Eight ways machine learning is assisting medicine. Nat Med. (2021) 27:2–3. doi: 10.1038/s41591-020-01197-2
Keywords: bronchopulmonary dysplasia, sepsis, machine learning algorithms, nomogram, prediction model
Citation: Wang Y, Wang Y, Song L, Li J, Xie Y, Yan L, Hu S and Feng Z (2025) Clinical characteristics of bronchopulmonary dysplasia and the risk of sepsis onset prediction via machine learning models. Front. Pediatr. 13:1566747. doi: 10.3389/fped.2025.1566747
Received: 25 January 2025; Accepted: 10 June 2025;
Published: 27 June 2025.
Edited by:
Christopher G. Wilson, Loma Linda University, United States
Reviewed by:
Samuel Franklin Feng, Paris-Sorbonne University Abu Dhabi, United Arab Emirates
Vladimir Pohanka, Retired, Liptovská Teolička, Slovakia
Copyright: © 2025 Wang, Wang, Song, Li, Xie, Yan, Hu and Feng. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Siqi Hu, husiqi_2000@163.com; Zhichun Feng, zhjfengzc@126.com
†These authors have contributed equally to this work and share first authorship