ORIGINAL RESEARCH article

Front. Endocrinol.

Sec. Clinical Diabetes

This article is part of the Research Topic: Prevention and Treatment Advancements in Diabetic Retinopathy

Development of machine learning Predictive Model for Type 2 Diabetic Retinopathy Using the Triglyceride-glucose index explained by SHAP method

Provisionally accepted
Xiaoqin Liu1, Shuying Wu1, Yue Yang1, Yang Li1, Xinting Zhang1, Ri-Hui Liu2, Ling Qin3*, Fei Li1*
  • 1The First Hospital of Jilin University, Changchun, China
  • 2Guangdong Academy of Medical Sciences, Guangdong Provincial People's Hospital, Guangzhou, China
  • 3Meihekou Central Hospital, Meihekou, China

The final, formatted version of the article will be published soon.

Introduction: This study aimed to develop a diabetic retinopathy (DR) prediction model using several machine learning algorithms that incorporate a novel predictor, the triglyceride-glucose (TyG) index, and to interpret the model using the SHapley Additive exPlanations (SHAP) method. Method: Real-world data were collected from a general hospital in a major city and from a county clinic, and patients were divided into a DR group (n = 1392) and a non-DR group (n = 2358). Baseline data were collected, and variables were selected using Recursive Feature Elimination with Cross-Validation (RFECV). The performance of five machine learning algorithms, Logistic Regression (LR), Decision Tree (DT), Random Forest (RF), Support Vector Machine (SVM), and XGBoost (XGB), was assessed on accuracy, sensitivity, specificity, and the area under the receiver operating characteristic curve (AUC). The optimal model was interpreted using SHAP. Result: SVM and LR demonstrated superior performance in both the test set and the training set (AUC, 0.85 and 0.82, respectively). The top five predictors identified by SHAP analysis were TyG, insulin therapy, HbA1c, diabetes course, and HDL. HDL was identified as a protective factor, while the remaining factors were associated with retinopathy. Conclusion: LR and SVM demonstrated the best performance. To our knowledge, this is the first machine learning-based DR prediction model to integrate the TyG index as a core predictor, overcoming the limitations of insulin resistance (IR) assessment in resource-limited settings. TyG provides a cost-effective alternative to conventional IR biomarkers (e.g., HOMA-IR), enabling practical DR risk stratification in primary care.
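The abstract does not spell out how the TyG index is computed. As a minimal sketch, the Python snippet below uses the commonly cited definition, TyG = ln(fasting triglycerides [mg/dL] × fasting glucose [mg/dL] / 2), which may differ from the exact variant the authors used. The function names and the mmol/L conversion factors are illustrative assumptions rather than details from the article.

```python
import math

# Conventional mmol/L -> mg/dL conversion factors (assumed here, not from the article).
TG_MMOL_TO_MGDL = 88.57    # triglycerides
GLU_MMOL_TO_MGDL = 18.016  # glucose


def tyg_index(triglycerides_mgdl: float, glucose_mgdl: float) -> float:
    """Return ln(TG [mg/dL] * glucose [mg/dL] / 2) for fasting values."""
    if triglycerides_mgdl <= 0 or glucose_mgdl <= 0:
        raise ValueError("lab values must be positive")
    return math.log(triglycerides_mgdl * glucose_mgdl / 2.0)


def tyg_index_from_mmol(triglycerides_mmol: float, glucose_mmol: float) -> float:
    """Same index, for labs reported in mmol/L (converted to mg/dL first)."""
    return tyg_index(
        triglycerides_mmol * TG_MMOL_TO_MGDL,
        glucose_mmol * GLU_MMOL_TO_MGDL,
    )


if __name__ == "__main__":
    # Hypothetical patient: fasting TG 150 mg/dL, fasting glucose 100 mg/dL.
    print(round(tyg_index(150.0, 100.0), 2))  # prints 8.92
```

Because both inputs are routine fasting laboratory values, the index needs no insulin assay, which is the practical advantage over HOMA-IR that the abstract points to.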

Keywords: TyG index, Diabetic Retinopathy, machine learning, predictive model, SHAP

Received: 20 May 2025; Accepted: 27 Oct 2025.

Copyright: © 2025 Liu, Wu, Yang, Li, Zhang, Liu, Qin and Li. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence:
Ling Qin, mhkjwdys@163.com
Fei Li, li_fei@jlu.edu.cn

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.