ORIGINAL RESEARCH article

Front. Endocrinol.

Sec. Clinical Diabetes

Volume 16 - 2025 | doi: 10.3389/fendo.2025.1593068

This article is part of the Research Topic: Digital Technology in the Management and Prevention of Diabetes: Volume III

Optimized Prediction of Diabetes Complications Using Ensemble Learning with Bayesian Optimization: A Cost-Efficient Laboratory-Based Approach

Provisionally accepted
Xiaohan Li (1), Yifan Wang (1), Dapeng Yan (2)*, Yefei Zhu (1)
  • (1) Nanjing Medical University, Nanjing, Jiangsu Province, China
  • (2) Nanjing University of Posts and Telecommunications, Nanjing, Jiangsu Province, China

The final, formatted version of the article will be published soon.

Background and Objective: The increasing global prevalence of diabetes has led to a surge in complications, significantly burdening healthcare systems and affecting patient quality of life. Early prediction of these complications is critical for timely intervention, yet existing models often rely heavily on clinical indicators while underutilizing fundamental laboratory test parameters. This study aims to bridge this gap by leveraging the 12 most frequently tested laboratory indicators in diabetic patients to develop an optimized predictive model for diabetes complications.

Methods: A comprehensive dataset was established through meticulous data collection from a high-volume tertiary hospital, followed by extensive data cleaning and classification. Various machine learning classifiers, including Random Forest, XGBoost, Support Vector Machine (SVM), and Multilayer Perceptron (MLP), were trained on this dataset to evaluate their predictive performance. We further introduced an ensemble learning model with Bayesian optimization to enhance accuracy and cost-efficiency. Additionally, feature importance analysis was conducted to refine the model by reducing testing costs while maintaining high predictive accuracy.

Results: Our ensemble model with Bayesian optimization demonstrated superior performance, achieving over 90% accuracy in predicting various diabetic complications, with an outstanding 98.50% accuracy and 99.76% AUC for diabetic nephropathy. Feature correlation analysis enabled a refined model that not only improved predictive accuracy but also reduced overall medical costs by 2.5% through strategic feature elimination.

Conclusions: This study makes three key contributions: (1) development of a high-quality dataset based on fundamental laboratory indicators, (2) creation of a highly accurate predictive model using ensemble learning and Bayesian optimization, particularly excelling in diabetic nephropathy prediction, and (3) implementation of a cost-efficient diagnostic approach that reduces testing expenses without compromising accuracy. The proposed model provides a strong foundation for future research and practical clinical applications, demonstrating the potential of integrating machine learning with cost-conscious medical testing.
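The ensemble idea summarized in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how the base classifiers are combined or which quantities are tuned. The sketch assumes a soft-voting ensemble (a weighted average of each model's predicted probability) and, to stay dependency-free, uses plain random search as a stand-in for Bayesian optimization. The function names (`soft_vote`, `tune_weights`, `make_synthetic`) and the synthetic data are hypothetical and carry no clinical meaning.

```python
import random


def soft_vote(prob_lists, weights):
    """Weighted average of per-model positive-class probabilities."""
    total = sum(weights)
    n = len(prob_lists[0])
    return [
        sum(w * probs[i] for w, probs in zip(weights, prob_lists)) / total
        for i in range(n)
    ]


def accuracy(probs, labels, threshold=0.5):
    """Fraction of samples whose thresholded probability matches the label."""
    preds = [1 if p >= threshold else 0 for p in probs]
    return sum(int(a == b) for a, b in zip(preds, labels)) / len(labels)


def tune_weights(prob_lists, labels, n_trials=200, seed=0):
    """Search ensemble weights on a validation set.

    Assumption: random search replaces Bayesian optimization here. A real
    Bayesian optimizer would choose each trial from a surrogate model of
    earlier trials instead of sampling uniformly.
    """
    rng = random.Random(seed)
    k = len(prob_lists)
    best_w = [1.0] * k  # start from the equal-weight ensemble
    best_acc = accuracy(soft_vote(prob_lists, best_w), labels)
    for _ in range(n_trials):
        w = [rng.uniform(0.01, 1.0) for _ in range(k)]
        acc = accuracy(soft_vote(prob_lists, w), labels)
        if acc > best_acc:
            best_w, best_acc = w, acc
    return best_w, best_acc


def make_synthetic(n=200, seed=1):
    """Synthetic validation set: labels plus three noisy 'model' outputs.

    Purely illustrative; these are not laboratory data.
    """
    rng = random.Random(seed)
    labels = [rng.randint(0, 1) for _ in range(n)]
    prob_lists = []
    for noise in (0.20, 0.30, 0.40):  # three models of decreasing quality
        probs = []
        for y in labels:
            p = 0.5 + 0.3 * (2 * y - 1) + rng.gauss(0.0, noise)
            probs.append(min(1.0, max(0.0, p)))
        prob_lists.append(probs)
    return prob_lists, labels


if __name__ == "__main__":
    probs, labels = make_synthetic()
    weights, acc = tune_weights(probs, labels)
    print("tuned weights:", [round(w, 2) for w in weights])
    print("validation accuracy:", round(acc, 3))
```

Because the search starts from equal weights and only accepts improvements, the tuned ensemble is never worse than simple averaging on the validation set. Performance on held-out data would still need to be checked separately.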

Keywords: diabetes complications, predictive modeling, machine learning, Bayesian optimization, cost-efficient diagnosis, clinical laboratory indicators

Received: 13 Mar 2025; Accepted: 22 May 2025.

Copyright: © 2025 Li, Wang, Yan and Zhu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Dapeng Yan, Nanjing University of Posts and Telecommunications, Nanjing, 210003, Jiangsu Province, China

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.