AUTHOR=Yang Jin-Xia, Liu Yue, Huang Rong, Wu Hai-ying, Wang Ya-yun, Cao Su-ying, Wang Guo-ying, Zhang Jian-Min, Ai Zi-Sheng, Zhou Hui-min
TITLE=Development and internal validation of a machine learning algorithm for the risk of type 2 diabetes mellitus in children with obesity
JOURNAL=Frontiers in Endocrinology
VOLUME=Volume 16 - 2025
YEAR=2025
URL=https://www.frontiersin.org/journals/endocrinology/articles/10.3389/fendo.2025.1649988
DOI=10.3389/fendo.2025.1649988
ISSN=1664-2392
ABSTRACT=Aim: We aimed to develop and internally validate a machine learning (ML)-based model for the prediction of the risk of type 2 diabetes mellitus (T2DM) in children with obesity. Methods: In total, 292 children with obesity and T2DM were enrolled between July 2023 and February 2024 and followed for at least 1 year. Eight ML algorithms (Decision Tree, Logistic Regression, Support Vector Machine (SVM), Multilayer Perceptron, Adaptive Boosting, Random Forest, Gradient Boosting Decision Tree, and Extreme Gradient Boosting) were compared for their capacity to identify key clinical and laboratory characteristics of T2DM in children and to create a risk prediction model. Results: Forty-nine children were diagnosed with T2DM during the follow-up period. The SVM algorithm was the best predictor of T2DM, with the largest area under the receiver operating characteristic curve (0.98) and accuracy (93.2%). The SVM algorithm identified eight predictors: BMI, creatinine, prealbumin, glucose (180 min), glycosylated hemoglobin A1c, thyrotropin, total thyroxine (T4), and free T4 concentrations. Thus, an ML-based prediction model accurately identifies children with obesity at high risk of T2DM. If externally validated, this tool could facilitate early, personalized interventions aimed at preventing T2DM. Discussion: The rising prevalence of obesity in childhood is associated with an increase in the risk of early-onset T2DM.
Therefore, the early identification of individuals at high risk is crucial to prevent the development of this disease. In a comparative analysis of the performance of multiple ML algorithms, we found that the SVM algorithm was the best predictor of the development of T2DM.
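
The abstract ranks the algorithms by area under the receiver operating characteristic curve (AUC) and by accuracy. As a rough illustration of what those two metrics measure, the sketch below computes both in plain Python and picks the model with the larger AUC. The labels, the risk scores, the model names `model_a` and `model_b`, and the 0.5 decision threshold are all hypothetical values chosen for illustration; none of them come from the study, and this is not the authors' pipeline.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the chance that a random positive case outscores a random negative one.

    A tie between a positive and a negative counts as half a win
    (the Mann-Whitney U formulation of the AUC).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(labels: Sequence[int], scores: Sequence[float],
             threshold: float = 0.5) -> float:
    """Fraction of cases classified correctly at an assumed score threshold."""
    predicted = [1 if s >= threshold else 0 for s in scores]
    correct = sum(int(p == y) for p, y in zip(predicted, labels))
    return correct / len(labels)


# Hypothetical toy outcomes: 1 = developed T2DM, 0 = did not.
toy_labels = [0, 0, 0, 0, 1, 0, 1, 1]

# Hypothetical risk scores from two made-up models (not study data).
toy_scores = {
    "model_a": [0.1, 0.2, 0.3, 0.4, 0.35, 0.6, 0.8, 0.9],
    "model_b": [0.1, 0.2, 0.3, 0.4, 0.7, 0.45, 0.8, 0.9],
}

# Map each model name to its (AUC, accuracy) pair.
results = {
    name: (roc_auc(toy_labels, s), accuracy(toy_labels, s))
    for name, s in toy_scores.items()
}

# Select the model with the largest AUC, mirroring the ranking criterion.
best_model = max(results, key=lambda name: results[name][0])
```

AUC depends only on how the scores rank cases, while accuracy depends on the chosen threshold, so the two metrics can disagree. Reporting both, as the abstract does, gives a fuller picture than either alone.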