AUTHOR=Shen Li, Wu Jiaqiang, Lu Min, Jiang Yiguo, Zhang Xiaolan, Xu Qiuyan, Ran Shuangqin TITLE=Advancing risk factor identification for pediatric lobar pneumonia: the promise of machine learning technologies JOURNAL=Frontiers in Pediatrics VOLUME=Volume 13 - 2025 YEAR=2025 URL=https://www.frontiersin.org/journals/pediatrics/articles/10.3389/fped.2025.1490500 DOI=10.3389/fped.2025.1490500 ISSN=2296-2360 ABSTRACT=Background: Community-acquired pneumonia (CAP) is a prevalent pediatric condition, and lobar pneumonia (LP) is considered a severe subtype. Early identification of LP is crucial for appropriate management. This study aimed to develop and compare machine learning models to predict LP in children with CAP. Methods: A total of 25 clinical and laboratory variables were collected. Missing data (<2%) were imputed, and the dataset was split into training (60%) and validation (40%) sets. Univariable logistic regression and Boruta feature selection were used to identify significant predictors. Four machine learning algorithms, namely Logistic Regression (LR), Support Vector Machine (SVM), Extreme Gradient Boosting (XGBoost), and Decision Tree (DT), were compared using area under the curve (AUC), balanced accuracy, sensitivity, specificity, and F1 score. SHAP analysis was performed to interpret the best-performing model. Results: A total of 278 patients with CAP were included in this study, of whom 65 were diagnosed with LP. The XGBoost model demonstrated the best performance, with an AUC of 0.880 (95% CI: 0.807–0.934) in the training set and 0.746 (95% CI: 0.664–0.843) in the validation set. SHAP analysis identified age, CRP, CD64 index, lymphocyte percentage, and ALB as the top five predictive factors. Conclusion: The XGBoost model showed superior performance in predicting LP in children with CAP. The model enabled early diagnosis and risk assessment of LP, thereby facilitating appropriate clinical decision-making.
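
The comparison metrics named in the Methods (AUC, balanced accuracy, sensitivity, specificity, F1 score) can be illustrated with a minimal sketch. This is not the authors' code: the function names `classification_metrics` and `roc_auc`, the 0.5 decision threshold, and the eight toy labels and scores are all assumptions made up for illustration and are unrelated to the study data.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Sensitivity, specificity, balanced accuracy and F1 for binary labels (1 = LP)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        # Balanced accuracy is the mean of sensitivity and specificity,
        # which matters when classes are imbalanced.
        "balanced_accuracy": (sensitivity + specificity) / 2,
        "f1": f1,
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win (the Mann-Whitney U formulation).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 3 positive and 5 negative cases with model scores.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

AUC is threshold-free, whereas the other four metrics depend on the chosen cutoff, which is one reason a study would report them together.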