AUTHOR=Han Yupeng, Xie Xiyuan, Qiu Jiapeng, Tang Yijie, Song Zhiwei, Li Wangyu, Wu Xiaodan TITLE=Early prediction of sepsis associated encephalopathy in elderly ICU patients using machine learning models: a retrospective study based on the MIMIC-IV database JOURNAL=Frontiers in Cellular and Infection Microbiology VOLUME=Volume 15 - 2025 YEAR=2025 URL=https://www.frontiersin.org/journals/cellular-and-infection-microbiology/articles/10.3389/fcimb.2025.1545979 DOI=10.3389/fcimb.2025.1545979 ISSN=2235-2988 ABSTRACT=Background: Sepsis-associated encephalopathy (SAE) is prevalent among elderly patients in the ICU and significantly affects patient prognosis. Because its symptoms resemble those of other neurological disorders and no specific biomarkers exist, early clinical diagnosis remains challenging. This study aimed to develop a predictive model for SAE in elderly ICU patients. Methods: Data on elderly sepsis patients were extracted from the MIMIC-IV database (version 3.1) and divided into training and test sets in a 7:3 ratio. Feature variables were selected using the LASSO-Boruta combined algorithm, and five machine learning (ML) models, namely Extreme Gradient Boosting (XGBoost), Categorical Boosting (CatBoost), Light Gradient Boosting Machine (LGBM), Multilayer Perceptron (MLP), and Support Vector Machines (SVM), were subsequently developed using these variables. A comprehensive set of performance metrics was used to assess the predictive accuracy, calibration, and clinical applicability of these models. For the best-performing machine learning model, we employed the SHapley Additive exPlanations (SHAP) method to visualize the model. Results: Based on strict inclusion and exclusion criteria, a total of 3,156 elderly sepsis patients were enrolled in the study, with an SAE incidence rate of 48.7%. The mortality rate of elderly sepsis patients who developed SAE was significantly higher than that of patients in the non-SAE group (28.78% vs.
12.59%, P < 0.001). A total of 18 feature variables were selected for the construction of the ML models using the LASSO-Boruta combined algorithm. Compared with the other four models and traditional scoring systems, the XGBoost model demonstrated the best overall predictive performance, with Area Under the Curve (AUC)=0.898, accuracy=0.830, recall=0.819, F1-score=0.820, specificity=0.840, and precision=0.821. Furthermore, the Decision Curve Analysis (DCA) and calibration curves demonstrated that the XGBoost model has significant clinical value and stable predictive performance. Ten-fold cross-validation further confirmed the robustness and generalizability of the model. In addition, we simplified the model based on the SHAP feature importance ranking, and the results indicated that the simplified XGBoost model retains excellent predictive ability (AUC=0.858). Conclusions: The XGBoost model effectively predicts SAE in elderly ICU patients and may serve as a reliable tool for clinicians to identify high-risk patients.
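
The threshold-based metrics reported in the Results (accuracy, recall, specificity, precision, and F1-score) can all be derived from a binary confusion matrix. The following is a minimal sketch of those definitions, not the authors' code: the function name `binary_metrics` and the counts passed to it are hypothetical and chosen only for illustration, and they are not data from the study. AUC is omitted because it depends on the ranking of predicted probabilities rather than on a single confusion matrix.

```python
def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict[str, float]:
    """Compute threshold-based metrics from confusion-matrix counts.

    tp/fn: SAE patients predicted positive/negative.
    tn/fp: non-SAE patients predicted negative/positive.
    """
    recall = tp / (tp + fn)            # sensitivity: share of true cases detected
    specificity = tn / (tn + fp)       # share of non-cases correctly ruled out
    precision = tp / (tp + fp)         # share of positive predictions that are correct
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return {
        "accuracy": accuracy,
        "recall": recall,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


# Hypothetical counts for illustration only; NOT taken from the study.
metrics = binary_metrics(tp=80, fp=15, tn=85, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Reporting recall and specificity side by side, as the abstract does, matters here because the two classes are nearly balanced (SAE incidence of 48.7%), so accuracy alone would not show whether errors fall mainly on missed SAE cases or on false alarms.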