ORIGINAL RESEARCH article

Front. Endocrinol.

Sec. Clinical Diabetes

Volume 16 - 2025 | doi: 10.3389/fendo.2025.1649988

This article is part of the Research Topic: Digital Technology in the Management and Prevention of Diabetes: Volume III

Clinical study Development and internal validation of a machine learning algorithm for the risk of type 2 diabetes mellitus in children with obesity

Provisionally accepted
Jinxia Yang1,2*, Yue Liu1, Rong Huang3, Haiying Wu2, Ya-Yun Wang2, Su-Ying Cao2, Guo-Ying Wang2, Jianmin Zhang2, ZiSheng Ai1*, Hui-Min Zhou2*
  • 1Tongji University, Basic Medical Science, Shanghai, China
  • 2Children's Hospital of Soochow University, Suzhou, China
  • 3Shanghai First Maternity and Infant Hospital, School of Medicine, Tongji University, Shanghai, China

The final, formatted version of the article will be published soon.

Aim: We aimed to develop and internally validate a machine learning (ML)-based model for predicting the risk of type 2 diabetes mellitus (T2DM) in children with obesity.

Methods: In total, 292 children with obesity and T2DM were enrolled between July 2023 and February 2024 and followed for at least 1 year. Eight ML algorithms (Decision Tree, Logistic Regression, Support Vector Machine (SVM), Multilayer Perceptron, Adaptive Boosting, Random Forest, Gradient Boosting Decision Tree, and Extreme Gradient Boosting) were compared for their capacity to identify key clinical and laboratory characteristics of T2DM in children and to create a risk prediction model.

Results: Forty-nine children were diagnosed with T2DM during the follow-up period. The SVM algorithm was the best predictor of T2DM, with the largest area under the receiver operating characteristic curve (0.98) and the highest accuracy (93.2%). The SVM algorithm identified eight predictors: BMI, creatinine, prealbumin, glucose (180 min), glycosylated hemoglobin A1c, thyrotropin, total thyroxine (T4), and free T4 concentrations. Thus, an ML-based prediction model accurately identifies children with obesity at high risk of T2DM. If externally validated, this tool could facilitate early, personalized interventions aimed at preventing T2DM.

Discussion: The rising prevalence of obesity in childhood is associated with an increase in the risk of early-onset T2DM. Therefore, the early identification of individuals at high risk is crucial to prevent the development of this disease. In a comparative analysis of the performance of multiple ML algorithms, we found that the SVM algorithm was the best predictor of the development of T2DM.
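The abstract ranks the algorithms by area under the receiver operating characteristic curve (AUC) and by accuracy, but it does not state how these metrics were computed. As a minimal sketch of what such a comparison involves, the Python below computes both metrics from predicted risk scores and orders candidate models by AUC. The function names, the 0.5 threshold, the model names, and the toy data are all assumptions made for illustration; none of them come from the study.

```python
from typing import Dict, List, Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive case outranks a random negative.

    A tie between a positive and a negative counts as half a win
    (the Mann-Whitney formulation of the area under the ROC curve).
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def accuracy(labels: Sequence[int], scores: Sequence[float],
             threshold: float = 0.5) -> float:
    """Fraction of cases classified correctly at an assumed score threshold."""
    predicted = [1 if s >= threshold else 0 for s in scores]
    correct = sum(1 for p, y in zip(predicted, labels) if p == y)
    return correct / len(labels)


def rank_models(labels: Sequence[int],
                model_scores: Dict[str, Sequence[float]]) -> List[str]:
    """Return model names sorted by AUC, best first."""
    return sorted(model_scores,
                  key=lambda name: roc_auc(labels, model_scores[name]),
                  reverse=True)


# Toy data, NOT from the study: 1 = developed T2DM, 0 = did not.
toy_labels = [1, 1, 1, 0, 0, 0, 0, 0]
# Hypothetical risk scores from two placeholder models.
toy_scores = {
    "model_a": [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.1, 0.05],
    "model_b": [0.9, 0.7, 0.6, 0.3, 0.2, 0.4, 0.1, 0.05],
}

if __name__ == "__main__":
    for name in rank_models(toy_labels, toy_scores):
        auc = roc_auc(toy_labels, toy_scores[name])
        acc = accuracy(toy_labels, toy_scores[name])
        print(f"{name}: AUC={auc:.3f}, accuracy={acc:.3f}")
```

AUC does not depend on a decision threshold, whereas accuracy does, so two models can rank differently on the two metrics. In a real comparison, both would be estimated on held-out or cross-validated data rather than on the data used to fit the models.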

Keywords: obesity, children, risk prediction model, machine learning, type 2 diabetes mellitus

Received: 24 Jun 2025; Accepted: 22 Jul 2025.

Copyright: © 2025 Yang, Liu, Huang, Wu, Wang, Cao, Wang, Zhang, Ai and Zhou. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence:
Jinxia Yang, Tongji University, Basic Medical Science, Shanghai, China
ZiSheng Ai, Tongji University, Basic Medical Science, Shanghai, China
Hui-Min Zhou, Children's Hospital of Soochow University, Suzhou, China

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.