AUTHOR=Hwang Jae Kyoon , Kim Dae Hyun , Na Jae Yoon , Son Joonhyuk , Oh Yoon Ju , Jung Donggoo , Kim Chang-Ryul , Kim Tae Hyun , Park Hyun-Kyung TITLE=Two-stage learning-based prediction of bronchopulmonary dysplasia in very low birth weight infants: a nationwide cohort study JOURNAL=Frontiers in Pediatrics VOLUME=Volume 11 - 2023 YEAR=2023 URL=https://www.frontiersin.org/journals/pediatrics/articles/10.3389/fped.2023.1155921 DOI=10.3389/fped.2023.1155921 ISSN=2296-2360 ABSTRACT=Introduction: To develop enhanced machine learning-based prediction models for bronchopulmonary dysplasia (BPD) and its severity through a two-stage approach that integrates the duration of respiratory support (RSd), using prenatal and early postnatal variables from a nationwide very low birth weight (VLBW) infant cohort. Methods: We included 16,384 VLBW infants admitted to the neonatal intensive care units (NICUs) of the Korean Neonatal Network (KNN), a nationwide VLBW infant registry (2013–2020). Overall, 45 prenatal and early perinatal clinical variables were selected. A multilayer perceptron (MLP)-based network analysis, recently introduced to predict diseases in preterm infants, was used for modeling with a stepwise approach. Additionally, we applied a complementary MLP network and established new BPD prediction models (PMbpd). Model performance was compared using the area under the receiver operating characteristic curve (AUROC). The Shapley method was used to determine the contribution of each variable. Results: The analysis included 11,177 VLBW infants (3,724 with no BPD (BPD 0), 3,383 mild (BPD 1), 1,375 moderate (BPD 2), and 2,695 severe (BPD 3) cases). Compared to conventional machine learning (ML) models, our PMbpd and two-stage PMbpd with RSd (TS-PMbpd) models performed better in both binary (0 vs. 1,2,3; 0,1 vs. 2,3; 0,1,2 vs. 3) and severity-level (0 vs. 1 vs. 2 vs. 3) prediction (AUROC = 0.895 and 0.897; 0.824 and 0.825; 0.828 and 0.823; and 0.783 and 0.786, respectively).
Gestational age (GA), birth weight, and patent ductus arteriosus (PDA) treatment were significant variables for the occurrence of BPD. Birth weight, low blood pressure, and intraventricular hemorrhage were significant for BPD ≥2, and birth weight, low blood pressure, and PDA ligation for BPD ≥3. GA, birth weight, and pulmonary hypertension were the principal variables that predicted BPD severity in VLBW infants. Conclusions: We developed a new two-stage ML model that incorporates a crucial BPD indicator (RSd) and identified clinical variables that are significant for the early prediction of BPD and its severity, with high predictive accuracy. Our model can serve as an optimal predictive model in NICU practice.
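
The binary comparisons reported in the abstract (0 vs. 1,2,3; 0,1 vs. 2,3; 0,1,2 vs. 3) amount to thresholding the four-level BPD grade at three cutoffs and computing an AUROC for each. The following is a minimal sketch of that evaluation step only, not the authors' pipeline. The function names `auroc` and `binarize`, the single shared `risk_score`, and the toy data are all assumptions made for illustration; the study may well have used separate model outputs for each task.

```python
def auroc(labels, scores):
    """AUROC as the probability that a random positive case outranks a
    random negative case (Mann-Whitney formulation); ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def binarize(severity, cutoff):
    """Map ordinal BPD grades (0-3) to 1 if grade >= cutoff, else 0."""
    return [1 if grade >= cutoff else 0 for grade in severity]


# Toy data invented for illustration only; NOT taken from the KNN registry.
severity = [0, 0, 1, 1, 2, 2, 3, 3]
risk_score = [0.10, 0.30, 0.20, 0.50, 0.40, 0.70, 0.60, 0.90]

# One AUROC per binary task: cutoff 1 is 0 vs. 1,2,3; cutoff 2 is
# 0,1 vs. 2,3; cutoff 3 is 0,1,2 vs. 3.
aurocs = {cutoff: auroc(binarize(severity, cutoff), risk_score)
          for cutoff in (1, 2, 3)}

for cutoff, value in aurocs.items():
    print(f"BPD >= {cutoff}: AUROC = {value:.3f}")
```

The pairwise loop is quadratic in the number of cases, which is fine for a toy example; for a cohort of the size described here, a rank-based implementation such as the one in a standard ML library would be the more practical choice.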