ORIGINAL RESEARCH article

Front. Neurol.

Sec. Stroke

Volume 16 - 2025 | doi: 10.3389/fneur.2025.1591570

This article is part of the Research Topic: From bench to bedside: Inflammation in Neurovascular Disorders and Stroke

Development and Validation of Machine Learning-Based Risk Prediction Model for Stroke-Associated Pneumonia in Elderly Hemorrhagic Stroke

Provisionally accepted
Yi Cao1, Haipeng Deng1, Shaoyun Liu1, Xi Zeng1, Yangyang Gou2, Weiting Zhang2, Yixinyuan Li2, Hua Yang1*, Min Peng3*
  • 1 Department of Neurosurgery, Affiliated Hospital of Guizhou Medical University, Guiyang, China
  • 2 School of Nursing, Affiliated Hospital of Guizhou Medical University, Guiyang, China
  • 3 Department of Nursing Quality Management, Affiliated Hospital of Guizhou Medical University, Guiyang, China

The final, formatted version of the article will be published soon.

Objective: To develop and validate a machine learning (ML)-based model for predicting the risk of stroke-associated pneumonia (SAP) in elderly patients with hemorrhagic stroke.

Methods: We retrospectively collected data on elderly patients with hemorrhagic stroke from three tertiary hospitals in Guiyang (January 2019 to December 2022) as the modeling cohort, which was split into training and internal validation sets at a 7:3 ratio. External validation used data from January to December 2023. After regression analyses, four ML models were built: Logistic Regression (LR), XGBoost, Naive Bayes, and support vector machine (SVM). The models were evaluated using ROC curves, AUC, DeLong's or bootstrap tests, and various performance metrics. Calibration curves were used to assess model calibration.

Results: A total of 788 patients were enrolled (training: 462; internal validation: 196; external validation: 130). The incidence of SAP was 46.7% (368/788). Risk factors included advanced age, smoking, low GCS and Braden scores, and the presence of a nasogastric tube. The LR model showed the best and most stable performance, achieving AUCs of 0.883 (training), 0.855 (internal validation), and 0.882 (external validation). Hosmer-Lemeshow (H-L) test P-values were 0.381, 0.142, and 0.066 (all > 0.05), indicating satisfactory calibration.

Conclusions: A multicenter SAP risk prediction model for elderly patients with hemorrhagic stroke was constructed and validated. The LR model performed best. Easily obtainable predictors such as age and smoking were identified. The model showed good generalization ability, and a nomogram was drawn for clinical use with the aim of reducing SAP incidence and improving prognosis.
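For readers less familiar with the evaluation terms used in the abstract, the following is a minimal, illustrative Python sketch of a 7:3 split, the AUC, and the Hosmer-Lemeshow statistic. It is not the authors' code: the function names (`split_7_3`, `auc`, `hosmer_lemeshow_statistic`), the random seed, and the choice of ten risk groups are assumptions made for illustration, and the sketch uses only the standard library rather than whatever software the study used.

```python
import random


def split_7_3(records, seed=0):
    """Shuffle a copy of the records and split it 70% / 30%.

    Hypothetical helper: the abstract states a 7:3 split but does not
    say how it was drawn (for example, random or stratified).
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * 0.7)
    return shuffled[:cut], shuffled[cut:]


def auc(labels, scores):
    """AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def hosmer_lemeshow_statistic(labels, probs, groups=10):
    """Hosmer-Lemeshow chi-square statistic.

    Cases are sorted by predicted risk and cut into `groups` bins of
    near-equal size; each bin contributes
    (observed - expected)^2 / (n * mean_p * (1 - mean_p)).
    """
    pairs = sorted(zip(probs, labels))
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        expected = sum(p for p, _ in chunk)
        observed = sum(y for _, y in chunk)
        mean_p = expected / size
        denom = size * mean_p * (1.0 - mean_p)
        if denom > 0:
            stat += (observed - expected) ** 2 / denom
    return stat


if __name__ == "__main__":
    train, valid = split_7_3(range(658))
    print(len(train), len(valid))
    print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

The statistic is usually compared against a chi-square distribution (commonly with `groups - 2` degrees of freedom) to obtain a P-value; that step is omitted here because it needs a chi-square distribution function that the standard library does not provide. A large P-value, as reported in the abstract, means no significant gap between predicted and observed event rates was detected.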

Keywords: machine learning, aged, hemorrhagic stroke, stroke-associated pneumonia, prediction model, validation

Received: 11 Mar 2025; Accepted: 28 May 2025.

Copyright: © 2025 Cao, Deng, Liu, Zeng, Gou, Zhang, Li, Yang and Peng. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence:
Hua Yang, Department of Neurosurgery, Affiliated Hospital of Guizhou Medical University, Guiyang, China
Min Peng, Department of Nursing Quality Management, Affiliated Hospital of Guizhou Medical University, Guiyang, China

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.