ORIGINAL RESEARCH article
Front. Pediatr.
Sec. Pediatric Infectious Diseases
Stratify severe risk in children with respiratory syncytial virus pneumonia: A retrospective study based on machine learning and SHAP interpretation
Provisionally accepted
Sichuan University, Chengdu, China
Background: Respiratory syncytial virus (RSV) is the primary pathogen causing severe lower respiratory tract infections in children and imposes a significant disease burden worldwide. The clinical manifestations of RSV infection are not highly specific; in severe cases, the infection may trigger a severe inflammatory response that can be fatal. Early identification and risk stratification tools for severe RSV-related pneumonia remain inadequate. This study screened variables and built machine learning models to identify potential high-risk factors for severe disease in children with RSV pneumonia, with the aim of supporting individualized prevention, diagnosis, and treatment.
Methods: Variables were screened by univariate analysis followed by multivariate logistic regression analysis. Five machine learning models were compared on the training and test sets using receiver operating characteristic (ROC) curves, and XGBoost, which showed the best overall performance, was selected as the final model. SHapley Additive exPlanations (SHAP) was then used to quantify feature contributions and make this black-box model clinically interpretable.
Results: Twelve key variables were identified in patients with severe RSV pneumonia. XGBoost demonstrated the best overall performance and was selected as the final model, achieving area under the curve (AUC) values of 0.949 and 0.818 in the training and test sets, respectively. SHAP analysis indicated that fever duration, diarrhea, hemoglobin concentration, rhinorrhea, age, neutrophil-to-lymphocyte ratio, gestational age, neutrophil count, mode of delivery, and lymphocyte count may be the most important predictive variables for severe RSV pneumonia in children.
Conclusion: Our findings demonstrated that prolonged fever duration, presence of diarrhea, decreased hemoglobin concentration (HGB), absence of rhinorrhea, age under 3 months (Age<3m), and elevated neutrophil-to-lymphocyte ratio (NLR) were predictors of severe cases among children with RSV pneumonia.
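To make the two quantitative ideas named in the Methods more concrete, the following Python sketch shows how an AUC can be computed from predicted scores and how Shapley values split one prediction among its features. It is an illustration under stated assumptions, not the authors' pipeline: it uses neither XGBoost nor the SHAP library, the function `toy_risk`, its coefficients, and the patient and reference values are all hypothetical, and features are "removed" by substituting a fixed reference value, which is only one of several possible conventions.

```python
from itertools import combinations
from math import factorial


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample by enumerating all feature subsets.

    Features outside a subset are set to their baseline value (an assumption
    of this sketch). Cost grows as 2**n, so this only suits a few features.
    """
    n = len(x)

    def value(subset):
        mixed = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):
            # Shapley weight for a coalition of size k that excludes feature i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


# Hypothetical toy risk score, NOT the study's model.
# Inputs: fever duration (days), diarrhea (0/1), neutrophil-to-lymphocyte ratio.
def toy_risk(features):
    fever_days, diarrhea, nlr = features
    return 0.05 * fever_days + 0.2 * diarrhea + 0.1 * nlr * diarrhea


patient = [6, 1, 2.0]    # hypothetical patient
reference = [2, 0, 1.0]  # hypothetical reference values
phi = shapley_values(toy_risk, patient, reference)

# Hypothetical labels (1 = severe) and predicted scores for four patients.
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

In this sketch the attributions in `phi` add up to the difference between the score for `patient` and the score for `reference`, which is the additivity property that lets a SHAP-style analysis rank variables by their contribution to an individual prediction.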
Keywords: children, machine learning, pneumonia, respiratory syncytial virus (RSV), SHapley Additive exPlanations (SHAP)
Received: 05 Jan 2026; Accepted: 10 Feb 2026.
Copyright: © 2026 Pan, Yang, Wu, Zhou, Liu and Chen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence: Li-Na Chen
Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.
