ORIGINAL RESEARCH article
Front. Neurol.
Sec. Artificial Intelligence in Neurology
Volume 16 - 2025 | doi: 10.3389/fneur.2025.1608341
This article is part of the Research Topic: Artificial Intelligence in Neurosurgical Practices: Current Trends and Future Opportunities
Machine Learning-Based Prediction of 6-Month Functional Recovery in Hypertensive Cerebral Hemorrhage: Insights from XGBoost and SHAP Analysis
Provisionally accepted
- 1Qinghai University, Xining, Qinghai Province, China
- 2Department of Neurosurgery, Qinghai Provincial People's Hospital, Xining, Qinghai Province, China
- 3Qinghai Provincial People's Hospital, Xining, Qinghai Province, China
Background: The rate of poor prognosis after hypertensive cerebral hemorrhage (HICH) remains high. The period of 3-6 months after onset is the most rapid phase of neurological recovery in hemorrhagic stroke patients. Accurate early prediction of 6-month functional outcomes is critical for optimizing therapeutic strategies. This study compared the predictive performance of multiple machine learning models to identify the optimal model for forecasting long-term prognosis in HICH patients.
Methods: We conducted a retrospective analysis of clinical data from 807 HICH patients admitted to the Department of Neurosurgery, Qinghai Provincial People's Hospital, between June 2020 and June 2024. After data preprocessing, data from June 2020 to December 2023 (n=716) were randomly split into a training set (n=497) and a test set (n=219) at a 7:3 ratio. Data from January to June 2024 (n=91) served as an external validation set. Recursive Feature Elimination (RFE) was performed to identify optimal features, and repeated five-fold cross-validation was used to reduce the risk of overfitting. Five models were compared: XGBoost, Random Forest (RF), Logistic Regression (LR), Support Vector Machine (SVM), and K-Nearest Neighbors (KNN). Model performance was evaluated using the Area Under the Curve (AUC) and Decision Curve Analysis (DCA). The optimal model was interpreted via SHapley Additive exPlanations (SHAP).
Results: The 6-month poor prognosis rate among the 807 HICH patients was 27.51%. The XGBoost model exhibited the best performance in the training set (AUC=0.921, 95% CI 0.896–0.944) and remained stable in the external validation set (AUC=0.813, 95% CI 0.728–0.899). DCA showed that the XGBoost model provided higher net benefit than the other models across threshold probabilities of 0%–20% and 56%–100%. SHAP analysis identified hematoma volume as the most critical predictor, with secondary contributions from Glasgow Coma Scale score, white blood cell count, age, serum albumin, and systolic blood pressure, among others.
Conclusion: The XGBoost model demonstrated high accuracy in predicting the long-term prognosis of HICH patients. The SHAP framework quantifies the contribution of each key pathophysiological indicator to the model's prediction for an individual patient, enabling individualized risk stratification and strategic allocation of medical resources.
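The abstract reports AUC and DCA net benefit but does not show how either is computed. The following is a minimal sketch of both metrics under their standard definitions, not the authors' code: AUC is taken as the probability that a randomly chosen patient with a poor outcome receives a higher predicted risk than a randomly chosen patient with a good outcome, and net benefit at threshold probability pt is taken as TP/n - (FP/n) * pt/(1 - pt). The function names and the ten-patient toy data are hypothetical and chosen only for illustration.

```python
def auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both outcome classes are required")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def net_benefit(y_true, scores, threshold):
    """Net benefit of flagging patients whose predicted risk >= threshold."""
    if not 0.0 <= threshold < 1.0:
        raise ValueError("threshold must be in [0, 1)")
    n = len(y_true)
    tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy in a decision curve: flag every patient."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical toy data: 1 = poor 6-month outcome, 0 = good outcome.
toy_outcomes = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
toy_risks = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1, 0.1, 0.05]

print("AUC:", round(auc(toy_outcomes, toy_risks), 3))
for pt in (0.1, 0.2, 0.5):
    model_nb = net_benefit(toy_outcomes, toy_risks, pt)
    all_nb = net_benefit_treat_all(toy_outcomes, pt)
    print(f"pt={pt:.2f} model={model_nb:.3f} treat-all={all_nb:.3f}")
```

A decision curve plots these net-benefit values over a range of threshold probabilities; one model is preferred over another at a given threshold when its net benefit is higher there, which is how the reported ranges of 0%–20% and 56%–100% can be read.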
Keywords: Hypertensive cerebral hemorrhage, predictive model, XGBoost, SHAP, machine learning
Received: 08 Apr 2025; Accepted: 15 May 2025.
Copyright: © 2025 He, Zhongsheng, Lv, Cheng, Zhang, Jin and Han. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence: Lu Zhongsheng, Department of Neurosurgery, Qinghai Provincial People's Hospital, Xining, Qinghai Province, China
Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.