Development of an interpretable machine learning model for predicting sarcopenia in patients undergoing maintenance hemodialysis

LIU, SHUQIN; Zhu, Xingyu; Wang, Zhixin; Tang, Wenwu; Zhang, Ying; Xian, Huaming; Li, Mi; Xie, Xisheng

doi:10.3389/fmed.2025.1576081

ORIGINAL RESEARCH article

Front. Med.

Sec. Nephrology

This article is part of the Research Topic: Medical Knowledge-Assisted Machine Learning Technologies in Individualized Medicine Volume II

Provisionally accepted

Zhixin Wang¹

Huaming Xian¹

Mi Li¹

Xisheng Xie¹*

¹Department of Nephrology, Nanchong Central Hospital Affiliated to North Sichuan Medical College, Nanchong, China
²Department of Nephrology, Guangyuan Central Hospital, Guangyuan, China

The final, formatted version of the article will be published soon.

Abstract

Background: Sarcopenia has a high incidence among patients undergoing maintenance hemodialysis (MHD) and significantly increases the risk of falls, fractures, and mortality. Traditional diagnostic methods are costly and complex, which limits their widespread clinical application. An efficient, interpretable sarcopenia prediction model built from routine clinical and laboratory data, with explainability techniques applied to enhance model transparency, is therefore needed.

Methods: This study included 256 MHD patients and developed five machine learning models based on clinical and laboratory data: Logistic Regression, Extreme Gradient Boosting, Random Forest, Support Vector Machine, and Gaussian Naive Bayes. Model performance was assessed using the area under the receiver operating characteristic curve (AUC), calibration curves, and decision curve analysis. SHapley Additive exPlanations (SHAP) were used to explain and visualize the predictions of the optimal model.

Results: The Logistic Regression model performed best on the validation set (AUC = 0.828, 95% CI: 0.626–0.989). Key predictive factors included body mass index (BMI), age, gender, creatinine (Cr), 25-hydroxyvitamin D3, left ventricular ejection fraction (LVEF), and estimated glomerular filtration rate (eGFR). SHAP analysis revealed that high BMI and high 25-hydroxyvitamin D3 levels were protective factors, whereas low Cr, LVEF, and eGFR levels, as well as female gender, significantly increased the risk of sarcopenia.

Conclusion: This study developed a Logistic Regression model using an interpretable machine learning approach, offering an efficient tool for early screening of sarcopenia risk in MHD patients and facilitating personalized intervention strategies. However, the single-center design limits the model's external applicability, and multi-center studies are needed to validate its generalizability.
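
The two quantities reported in the abstract, an AUC with a 95% confidence interval and per-feature SHAP attributions for a logistic regression model, can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The abstract does not say how the confidence interval was obtained, so the percentile bootstrap used here is an assumption, as is the closed form for SHAP values, which holds for a linear model only when features are treated as independent. The function names (`auc`, `bootstrap_auc_ci`, `linear_shap`), the weights, and all data values are hypothetical and chosen purely for illustration.

```python
import random


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    Labels are assumed to be 0 (no sarcopenia) or 1 (sarcopenia).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUC (assumed method).

    Resamples that contain only one class are redrawn, because the AUC
    is undefined for them.
    """
    auc(labels, scores)  # validates that both classes are present
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


def linear_shap(weights, background, x):
    """SHAP values in log-odds units for a linear (logistic) model.

    Assuming independent features, phi_j = w_j * (x_j - mean_j), where
    mean_j is the feature mean over the background rows. The values sum
    to the model's log-odds for x minus its average log-odds.
    """
    n = len(background)
    means = [sum(row[j] for row in background) / n for j in range(len(weights))]
    return [w * (xj - m) for w, xj, m in zip(weights, x, means)]


# Hypothetical validation labels and predicted probabilities.
y_true = [0, 0, 1, 1, 0, 1, 0, 1, 0, 0]
y_prob = [0.10, 0.40, 0.35, 0.80, 0.20, 0.70, 0.55, 0.60, 0.15, 0.30]
print("AUC:", auc(y_true, y_prob))
print("95% CI:", bootstrap_auc_ci(y_true, y_prob))

# Hypothetical weights for two features (BMI, 25-hydroxyvitamin D3).
# Negative weights mean higher values lower the predicted log-odds.
print("SHAP:", linear_shap([-0.2, -0.05], [[20.0, 10.0], [24.0, 30.0]], [26.0, 40.0]))
```

With a small validation set, each bootstrap resample contains few positive–negative pairs, so the resampled AUC varies widely and the interval is correspondingly wide. In the attribution step, a negative SHAP value for a feature means that the patient's value of that feature lowers the predicted log-odds of sarcopenia relative to the background average.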

Keywords: Interpretable machine learning, logistic regression model, Maintenance hemodialysis, Sarcopenia, SHAP analysis

Received: 16 Feb 2025; Accepted: 24 Oct 2025.

Copyright: © 2025 LIU, Zhu, Wang, Tang, Zhang, Xian, Li and Xie. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Xisheng Xie, xishengxie2023@163.com

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.