AUTHOR=Su Tai, Zhang Peng, Zhang Bingyin, Liu Zihao, Xie Zexing, Li Xiaomei, Ma Jixiang, Xin Tao TITLE=Risk prediction of stroke-associated pneumonia in acute ischemic stroke with atrial fibrillation using machine learning models JOURNAL=Frontiers in Artificial Intelligence VOLUME=Volume 8 - 2025 YEAR=2025 URL=https://www.frontiersin.org/journals/artificial-intelligence/articles/10.3389/frai.2025.1595101 DOI=10.3389/frai.2025.1595101 ISSN=2624-8212 ABSTRACT=Stroke-associated pneumonia (SAP) is a serious complication of acute ischemic stroke (AIS), significantly affecting patient prognosis and increasing healthcare burden. Patients with AIS often have underlying diseases, and atrial fibrillation (AF) is a common one. Despite the high prevalence of AF in AIS patients, few studies have specifically addressed SAP prediction in this comorbid population. We aimed to analyze the factors influencing the occurrence of SAP in patients with AIS and AF and to assess the risk of SAP development through an optimal predictive model. We performed a case-control study that included 4,496 patients hospitalized with AIS and AF in China between January 2020 and September 2023. The primary outcome was SAP during hospitalization. Univariate analysis and LASSO regression were used to screen predictors. The patients were randomly divided into a training set, a validation set, and a test set. We then established logistic regression (LR), random forest (RF), support vector machine (SVM), and extreme gradient boosting (XGBoost) models. Accuracy, sensitivity, specificity, area under the curve (AUC), Youden index, and F1 score were used to evaluate the predictive value of each model. The optimal prediction model was visualized using a nomogram. In this study, SAP was identified in 10.16% of cases.
Variables identified by univariate analysis and LASSO regression (p < 0.05), such as coronary artery disease, hypertension, and dysphagia, were included in the LR, RF, and SVM models. The LR model outperformed the other models, achieving an AUC of 0.866, accuracy of 90.13%, sensitivity of 79.49%, specificity of 86.11%, and F1 score of 0.80. A nomogram based on the LR model was developed to predict SAP risk, providing a practical tool for early identification of high-risk patients and enabling targeted interventions to reduce SAP incidence and improve outcomes.
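
The evaluation metrics named in the abstract can be illustrated with a minimal sketch using their standard definitions. This is not the authors' code: the function names `binary_metrics` and `youden_index` are hypothetical, and the confusion-matrix counts are made-up illustrative values, not study data. The only figures taken from the abstract are the reported LR sensitivity (79.49%) and specificity (86.11%).

```python
def youden_index(sensitivity, specificity):
    """Youden index J = sensitivity + specificity - 1 (standard definition)."""
    return sensitivity + specificity - 1


def binary_metrics(tp, fp, tn, fn):
    """Compute common binary-classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # true positive rate (recall)
    specificity = tn / (tn + fp)   # true negative rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)     # positive predictive value
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "youden": youden_index(sensitivity, specificity),
        "f1": f1,
    }


# Hypothetical counts for illustration only (NOT taken from the study).
example = binary_metrics(tp=40, fp=20, tn=130, fn=10)

# Youden index implied by the sensitivity and specificity reported for the LR model.
j_lr = youden_index(0.7949, 0.8611)
print(round(j_lr, 4))  # 0.656
```

The abstract lists the Youden index among the evaluation metrics but does not report its value; the figure of about 0.656 above is simply the standard formula applied to the reported sensitivity and specificity.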