AUTHOR=Cao Heshan, Wei Junying, Hua Ping, Yang Songran TITLE=Interpretable machine learning for early predicting the risk of ventilator-associated pneumonia in ischemic stroke patients in the intensive care unit JOURNAL=Frontiers in Neurology VOLUME=Volume 16 - 2025 YEAR=2025 URL=https://www.frontiersin.org/journals/neurology/articles/10.3389/fneur.2025.1513732 DOI=10.3389/fneur.2025.1513732 ISSN=1664-2295 ABSTRACT=Background: The incidence of ventilator-associated pneumonia (VAP) in ischemic stroke (IS) patients is linked to a variety of detrimental outcomes. Current approaches for the early identification of individuals at high risk of developing VAP are limited and often lack clinical interpretability. The goal of this study is to develop and validate an interpretable machine learning (ML) model for the early prediction of VAP risk in IS patients in the intensive care unit (ICU). Methods: Data on IS patients were extracted from versions 2.2 and 3.0 of the Medical Information Mart for Intensive Care-IV database, with version 2.2 used for model training and internal validation and version 3.0 for external testing. The primary outcome was the incidence of VAP after ICU admission. The Boruta algorithm was used to select features before 10 ML models were developed. The Shapley Additive Explanation (SHAP) method was employed to assess the global and local interpretability of the model’s decision-making process. The final model was deployed as an online web application built with Streamlit. Results: A total of 419 IS patients were included, with 401 in the derivation group and 118 in the test group. Following feature selection, seven clinical characteristics were incorporated in the ML model: systolic and diastolic blood pressure, international normalized ratio, length of stay before mechanical ventilation, dysphagia, antibiotic counts, and suctioning counts. 
Among the 10 evaluated ML models, the Random Forest (RF) model outperformed the others, achieving an internal validation AUC of 0.776, accuracy of 0.704, sensitivity of 0.900, and specificity of 0.588. In external testing, performance dropped to an AUC of 0.644, accuracy of 0.610, sensitivity of 0.688, and specificity of 0.519, raising concerns about the model’s generalizability. Conclusion: The RF model is reliable for the early identification of IS patients at high risk of VAP. The SHAP method offers clear and intuitive explanations for individual risk assessment. The web-based tool has the potential to improve clinical outcomes by promptly recognizing patients at increased VAP risk and facilitating early intervention. Further multicenter prospective studies are required to validate its generalizability and practical utility.
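
The accuracy, sensitivity, and specificity reported above are linked by the identity accuracy = sensitivity × p + specificity × (1 − p), where p is the fraction of VAP-positive patients in the evaluated set. The following is a minimal Python sketch, not part of the original study, that uses this identity to back out the VAP fraction implied by each set of reported metrics. The function name `implied_prevalence` and the dictionary layout are illustrative assumptions, and because the published metrics are rounded to three decimals the results are approximate.

```python
def implied_prevalence(accuracy, sensitivity, specificity):
    """Solve accuracy = sens * p + spec * (1 - p) for the positive fraction p.

    Hypothetical helper for illustration; it is not taken from the study.
    """
    if sensitivity == specificity:
        raise ValueError("p is not identifiable when sensitivity equals specificity")
    return (accuracy - specificity) / (sensitivity - specificity)


# Metrics exactly as reported in the abstract (rounded to three decimals).
reported = {
    "internal validation": {"accuracy": 0.704, "sensitivity": 0.900, "specificity": 0.588},
    "external testing": {"accuracy": 0.610, "sensitivity": 0.688, "specificity": 0.519},
}

# Implied fraction of VAP-positive patients in each evaluation set.
implied = {name: implied_prevalence(**metrics) for name, metrics in reported.items()}

for name, p in implied.items():
    print(f"{name}: implied VAP fraction = {p:.3f}")
```

Under these assumptions the implied VAP fraction is roughly 0.37 for internal validation and roughly 0.54 for external testing. A difference of this size in outcome prevalence between the two sets is one factor worth considering when interpreting the drop in external performance, although the abstract itself does not report these fractions.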