Development and External Validation of a Machine Learning Model for Predicting In-Hospital Mortality in ICU Patients with Diabetic Kidney Disease: A Study Utilizing the MIMIC Database and a Chinese Cohort

Han, YuNan; Mao, RuMeng; Xiong, ChengYue; Wang, YongXiang; Li, Lin; Wang, HongLian

doi:10.3389/fendo.2026.1699647

ORIGINAL RESEARCH article

Front. Endocrinol.

Sec. Clinical Diabetes

This article is part of the Research Topic: Highlights in Diabetes Nephropathy

Provisionally accepted

YuNan Han¹

RuMeng Mao¹

ChengYue Xiong²

YongXiang Wang²

Lin Li¹*

HongLian Wang¹*

¹The First Affiliated Hospital of Yangtze University, Jingzhou, China
²Yangtze University, Jingzhou, China

The final, formatted version of the article will be published soon.

Background: Patients with diabetic kidney disease (DKD) admitted to the intensive care unit (ICU) face an exceptionally high risk of in-hospital mortality. Currently, effective tools for their early risk stratification are critically lacking. Therefore, this study aimed to develop and externally validate an interpretable machine learning (ML) model for predicting in-hospital mortality in this high-risk ICU-DKD patient population.

Methods: This retrospective cohort study involved developing and evaluating eight ML algorithms. Model performance was rigorously assessed using receiver operating characteristic (ROC) curves, calibration curves, and decision curve analysis (DCA). SHapley Additive exPlanations (SHAP) provided model interpretability. Data from DKD patients with ≥24-hour ICU stays were extracted from the MIMIC-IV database (n=3,403) for model development. An independent external validation cohort (n=260) was collected from the First Affiliated Hospital of Yangtze University (YTU-ICU). The primary outcome was in-hospital mortality. Lasso regression identified key predictors. Model evaluation focused on the area under the ROC curve (AUROC), calibration, and net clinical benefit.

Results: Ten features were selected for model development. Among the tested algorithms, XGBoost demonstrated superior predictive performance, achieving an AUROC of 0.738 (internal validation) and 0.746 (external validation), with corresponding accuracies of 72.18% and 72.69%. SHAP analysis highlighted respiratory failure, lymphocyte count, SOFA score, RDW, age, and SAPS II as the six most important predictors.

Conclusions: The developed XGBoost model demonstrates good predictive performance for in-hospital mortality in ICU-DKD patients, exhibiting satisfactory generalizability and interpretability. This tool supports early risk stratification and facilitates personalized treatment strategies in critical care settings.
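For readers unfamiliar with two of the evaluation metrics named in the abstract, the following is a minimal sketch of how AUROC and the net benefit used in decision curve analysis can be computed from predicted risks. It is not the authors' code: the function names `auroc` and `net_benefit`, the variable names, and the eight-patient toy data are assumptions made purely for illustration, and the numbers bear no relation to the MIMIC-IV or YTU-ICU cohorts.

```python
# Illustrative sketch only: toy data and helper names are hypothetical,
# not taken from the study.

def auroc(y_true, scores):
    """AUROC as the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; ties count as half a win."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both outcome classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def net_benefit(y_true, scores, threshold):
    """Net benefit at a risk threshold pt, as used in decision curve
    analysis: TP/N - FP/N * pt / (1 - pt)."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1.0 - threshold)


# Toy example: 1 = died in hospital, 0 = survived (made-up values).
y_true = [0, 0, 1, 1, 0, 1, 0, 1]
risk = [0.10, 0.30, 0.35, 0.80, 0.20, 0.70, 0.60, 0.40]

toy_auroc = auroc(y_true, risk)
toy_net_benefit = net_benefit(y_true, risk, threshold=0.5)
```

A decision curve is obtained by evaluating `net_benefit` over a range of thresholds and comparing the result with the "treat all" and "treat none" strategies. The pairwise AUROC loop is quadratic in cohort size, so for real cohorts a library routine would normally be preferred.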

Keywords: All-cause mortality, Diabetes nephropathy, Machine learning, MIMIC-IV, SHAP

Received: 05 Sep 2025; Accepted: 12 Feb 2026.

Copyright: © 2026 Han, Mao, Xiong, Wang, Li and Wang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence:
Lin Li
HongLian Wang

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.