AUTHOR=Park Kyu Hee, Kim Eun Yeob, Cho Hye Won, Jung Jong Ki, Kim Yu Seon, Choi Byung Min TITLE=A decision tree analysis to predict massive pulmonary hemorrhage in extremely low birth weight infants: a nationwide large cohort database JOURNAL=Frontiers in Pediatrics VOLUME=Volume 13 - 2025 YEAR=2025 URL=https://www.frontiersin.org/journals/pediatrics/articles/10.3389/fped.2025.1529712 DOI=10.3389/fped.2025.1529712 ISSN=2296-2360 ABSTRACT=Objective: To develop a decision tree model using clinical risk factors to predict massive pulmonary hemorrhage (MPH) and MPH-related mortality in extremely low birth weight infants (ELBWIs). Method: We retrospectively analyzed data from a national multicenter prospective web-based registry using machine learning algorithms with the C5.0 decision tree model to develop a clinical prediction rule for MPH and MPH-related mortality in ELBWIs admitted to participating neonatal intensive care units (NICUs) from January 2013 to December 2020. This C5.0 model was developed through data preprocessing, attribute selection based on splitting criteria, and pruning techniques to minimize overfitting. Results: A total of 5,752 infants were included. Of them, MPH occurred in 664 (11.5%) infants. Among infants with MPH, 136 (20.5%) infants died due to MPH. The decision tree model for MPH identified “gestational age (GA) ≤ 25+2” as the first discriminator, followed by “APGAR score at 5 min ≤ 7” and “multiple gestation”. The decision tree model for MPH-related mortality identified “GA ≤ 25+2” as the first discriminator, followed by “APGAR score at 5 min ≤ 2”. 
The C5.0 MPH model achieved an area under the ROC curve (AUC) of 88.2% on the training set and 89.0% on the test set, while the MPH-related mortality model attained an AUC of 97.7% on the training set and 97.4% on the test set. Conclusions: We developed a C5.0 decision tree model using clinical risk factors to predict MPH and MPH-related mortality in ELBWIs, enabling early identification of high-risk infants and facilitating timely interventions to improve neonatal outcomes. This decision-based risk stratification tool requires additional verification using larger multicenter cohorts to evaluate its practical applicability and clinical effectiveness before routine clinical implementation in NICUs.
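
The sketch below illustrates how the "GA ≤ 25+2" threshold and the reported proportions can be read. It assumes that "25+2" uses the conventional weeks+days notation (25 weeks and 2 days). The helper names `ga_to_days` and `at_or_below_first_split` are hypothetical and are not taken from the study. The sketch only checks a value against the first split. It does not reproduce the fitted C5.0 tree, whose branch directions and leaf outcomes are not given in the abstract.

```python
def ga_to_days(ga: str) -> int:
    """Convert a 'weeks+days' gestational age string to total days.

    Assumes the conventional notation, e.g. '25+2' = 25 weeks and 2 days.
    A bare week count such as '24' is treated as 24 weeks and 0 days.
    """
    weeks, _, days = ga.partition("+")
    return int(weeks) * 7 + int(days or 0)


# First discriminator reported for both models: GA <= 25+2 (177 days).
GA_THRESHOLD_DAYS = ga_to_days("25+2")


def at_or_below_first_split(ga: str) -> bool:
    """Return True if the gestational age falls at or below the first split."""
    return ga_to_days(ga) <= GA_THRESHOLD_DAYS


# Counts reported in the abstract.
total_infants = 5752
mph_cases = 664
mph_deaths = 136

# Proportions as percentages, rounded to one decimal place.
mph_rate = round(100 * mph_cases / total_infants, 1)
mph_mortality = round(100 * mph_deaths / mph_cases, 1)
```

Under these assumptions, an infant born at 25+2 weeks falls at or below the first split and one born at 25+3 weeks does not. The computed proportions match the 11.5% and 20.5% reported above.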