METHODS article

Front. Med.

Sec. Precision Medicine

Volume 12 - 2025 | doi: 10.3389/fmed.2025.1657889

Integrating Convolutional Neural Networks with Ensemble Methods for Enhanced Diabetes Diagnosis: A Multi-Dataset Evaluation

Provisionally accepted
Kaibo Zhuang1, Chenyang Zhang2, Zhen Chen3, Tianyu She4*, Min Wang5*
  • 1The University of Manchester, Manchester, United Kingdom
  • 2University of Colorado Boulder, Boulder, United States
  • 3Soochow University, Suzhou, China
  • 4Xi'an Electric Power Central Hospital, Xi'an, China
  • 5Chengdu Medical College Second Affiliated Hospital, Chengdu, China

The final, formatted version of the article will be published soon.

Timely and accurate diagnosis of diabetes mellitus remains a challenge because of the diversity of patient data and the limitations of traditional screening methods.

Objective: To propose a hybrid prediction framework that combines Convolutional Neural Networks (CNNs) and ensemble learning with a soft voting strategy to improve the accuracy, robustness, and interpretability of diabetes diagnosis.

Methods: The model was evaluated on three publicly available datasets: the UCI Pima Indians Diabetes dataset (768 samples, 8 features), a larger Pima Indians dataset (2,000 samples, 8 features), and the Tianchi Medical dataset (5,642 samples, 41 features). After missing-value imputation, z-score standardization, and min–max normalization, CNNs were used for deep feature extraction. The extracted features were passed to multiple classifiers (Logistic Regression (LR), Support Vector Machines (SVM), Random Forest, AdaBoost, XGBoost, LightGBM, and CatBoost), whose outputs were combined through a weighted soft voting scheme. The data were split 75:25 into training and testing sets, and the hyperparameters of each classifier were tuned by grid search.

Results: The proposed CNN-Voting ensemble model consistently outperformed the individual models, achieving up to 98% accuracy, an F1 score of 0.99, and 99% recall on the largest dataset. Feature importance analysis identified blood glucose, body mass index (BMI), age, and urea as the most predictive features, which is consistent with established clinical knowledge.

Conclusion: This hybrid model improves predictive performance and generalisability, and provides a scalable, interpretable solution for clinical decision support in diabetes management.
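The weighted soft voting step named in the Methods can be illustrated with a minimal sketch. This is not the authors' implementation. The function names `soft_vote` and `predict_label`, the three probability vectors, and the weights are all hypothetical values chosen for illustration. The sketch only shows the general technique: each classifier outputs class probabilities, these are averaged using per-classifier weights, and the class with the highest averaged probability is predicted.

```python
from typing import List, Sequence


def soft_vote(probas: Sequence[Sequence[float]],
              weights: Sequence[float]) -> List[float]:
    """Return the weighted average of per-classifier class probabilities.

    probas[i] is classifier i's probability vector for one sample;
    weights[i] is the (hypothetical) weight assigned to classifier i.
    """
    if len(probas) != len(weights):
        raise ValueError("need exactly one weight per classifier")
    total = sum(weights)
    n_classes = len(probas[0])
    return [
        sum(w * p[c] for p, w in zip(probas, weights)) / total
        for c in range(n_classes)
    ]


def predict_label(probas: Sequence[Sequence[float]],
                  weights: Sequence[float]) -> int:
    """Predict the class index with the highest weighted-average probability."""
    avg = soft_vote(probas, weights)
    return max(range(len(avg)), key=avg.__getitem__)


# Hypothetical outputs [P(non-diabetic), P(diabetic)] for a single patient
# from three classifiers; these numbers are illustrative, not from the paper.
probas = [
    [0.55, 0.45],  # classifier A leans slightly non-diabetic
    [0.55, 0.45],  # classifier B leans slightly non-diabetic
    [0.10, 0.90],  # classifier C is confident the patient is diabetic
]
weights = [1.0, 1.0, 1.5]  # assumed weights, e.g. derived from validation scores

avg = soft_vote(probas, weights)
label = predict_label(probas, weights)
print(avg, label)
```

In this example a simple majority (hard) vote would predict class 0, because two of the three classifiers lean that way. Soft voting predicts class 1, because the third classifier's confident probability outweighs the other two once the probabilities are averaged.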

Keywords: Convolutional Neural Networks, diabetes, Soft voting, machine learning, feature extraction

Received: 02 Jul 2025; Accepted: 30 Aug 2025.

Copyright: © 2025 Zhuang, Zhang, Chen, She and Wang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence:
Tianyu She, Xi'an Electric Power Central Hospital, Xi'an, China
Min Wang, Chengdu Medical College Second Affiliated Hospital, Chengdu, China

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.