ORIGINAL RESEARCH article
Front. Endocrinol.
Sec. Clinical Diabetes
Volume 16 - 2025 | doi: 10.3389/fendo.2025.1657366
This article is part of the Research Topic: Digital Technology in the Management and Prevention of Diabetes: Volume III
Characterizing Clinical Risk Profiles of Major Complications in Type 2 Diabetes Mellitus Using Deep Learning Algorithms
Provisionally accepted
- 1Xi’an Jiaotong University Health Science Center, Xi'an, China
- 2Xijing Hospital, Xi’an, China
- 3Ninth Hospital of Xi'an, Xi'an, China
- 4Carnegie Mellon University, Pittsburgh, United States
Objective: To develop a self-reportable risk assessment tool for elderly patients with type 2 diabetes mellitus (T2DM) that uses machine learning to evaluate the risks of diabetic nephropathy (DN), diabetic retinopathy (DR), diabetic peripheral neuropathy (DPN), and diabetic foot (DF), thereby providing new insights and tools for the screening and intervention of these complications.
Materials and Methods: Data from 1,448 T2DM patients at Xi'an No.9 Hospital were used. After preprocessing, five machine learning algorithms (XGBoost, LightGBM, Random Forest, TabPFN, and CatBoost) were applied. Models were trained on 70% of the data and evaluated on the remaining 30%. Performance was assessed with multiple metrics, and SHAP analysis was used to estimate feature importance.
Results: The analysis identified 33 risk factors, comprising 6 shared risk factors (UACR for DN and DR; diabetes duration for DR, DPN, and DF; IBILI for DF and DPN; history of DN for DR and DF; U-Cr for DR and DF; MCHC for DN and DPN) and 27 unique risk factors. Performance varied by complication and model: for DN, TabPFN achieved an AUC of 0.905 and Random Forest an accuracy of 0.878; for DR, LightGBM attained an AUC of 0.794; for DPN, both TabPFN and CatBoost achieved a recall of 1.000 and an F1-score of 0.915; and for DF, LightGBM attained the highest AUC, 0.704. SHAP analysis highlighted key features for each complication, such as UACR and Y-protein for DN, diabetes duration and TPOAB for DR, history of DN and IBILI for DF, and diabetes duration and SBP for DPN.
Conclusion: This study employed interpretable machine learning to characterize risk factor profiles for multiple T2DM complications, identifying both common and distinct factors associated with the major complications. The findings provide a foundation for exploring personalised risk management strategies and highlight the potential of data-driven approaches to inform early intervention research in T2DM complications.
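The evaluation protocol summarised above (a 70%/30% split, then accuracy, recall, F1-score, and AUC) can be illustrated with a minimal, standard-library Python sketch. This is not the authors' code: every function name is hypothetical, the random seed and the rounding of the 70% cut-off are assumptions, and no study data are used. The last helper simply inverts the F1 formula, showing that a recall of 1.000 together with an F1-score of 0.915 implies a precision of roughly 0.843.

```python
import random


def split_70_30(n_rows, seed=0):
    """Shuffle row indices and split them 70% train / 30% test.

    The seed and the rounding rule are assumptions, not taken from the paper.
    """
    idx = list(range(n_rows))
    random.Random(seed).shuffle(idx)
    cut = int(round(0.7 * n_rows))
    return idx[:cut], idx[cut:]


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for binary labels (1 = complication)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive case is scored above
    a random negative case, counting ties as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def precision_from_recall_f1(recall, f1):
    """Invert F1 = 2PR / (P + R) to recover precision P."""
    return f1 * recall / (2 * recall - f1)


if __name__ == "__main__":
    train_idx, test_idx = split_70_30(1448)
    print(len(train_idx), len(test_idx))
    print(round(precision_from_recall_f1(1.000, 0.915), 3))
```

Under these assumptions, a cohort of 1,448 rows yields 1,014 training rows and 434 test rows; the exact counts used in the study are not stated in the abstract.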
Keywords: Type 2 diabetes mellitus (T2DM), diabetic complications, SHAP (SHapley Additive exPlanations), machine learning, risk factors
Received: 01 Jul 2025; Accepted: 25 Aug 2025.
Copyright: © 2025 Liu, Li, Shi, Lei, Wang, Gao, Liu, Zhu, Zhai, Zhang, Li, Wang, Niu, Ma and Zhang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence: Haochen Liu, Xi’an Jiaotong University Health Science Center, Xi'an, China
Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.