ORIGINAL RESEARCH article
Front. Med.
Sec. Hematology
Volume 12 - 2025 | doi: 10.3389/fmed.2025.1605868
Development and Application of Machine Learning Models for Hematological Disease Diagnosis Using Routine Laboratory Parameters: A User-Friendly Diagnostic Platform
Provisionally accepted
- 1Medical Center of Hematology, Xinqiao Hospital of Army Medical University, Chongqing, China
- 2State Key Laboratory of Trauma, Burns and Combined Injury, Jinfeng Laboratory, Chongqing, China
- 3Chongqing Key Laboratory of Hematology and Microenvironment, Chongqing, China
Aim: In recent years, the incidence and detection rate of hematological disease have been rising as the social environment changes. Early detection and diagnosis of hematological disease are important for improving patients' quality of life and prognosis. In this study, we used 54 clinical and conventional laboratory parameters and, by optimally combining multiple feature selection methods and machine learning algorithms, developed 7 machine learning models with different numbers of feature parameters. We comprehensively evaluated the performance of these models, analyzed the interpretability of the optimal and simplified models using SHapley Additive exPlanations (SHAP), and compared the diagnostic performance of these two models with that of hematologists. Finally, we developed a user-friendly diagnostic platform. The results show that ensemble model_1 with 46 feature parameters (EnMod1-46) and the simplified ensemble model_2 with 12 feature parameters (EnMod2-12) performed well in diagnosing 16 types of hematological disease. On the temporally distinct test set_1, EnMod1-46 achieved an accuracy of 0.804 and an area under the curve (AUC) of 0.964, while EnMod2-12 attained an accuracy of 0.784 and an AUC of 0.961. On the independent external test set_2, used to further validate generalization performance, EnMod1-46 achieved an accuracy of 0.738 and an AUC of 0.973, while EnMod2-12 yielded an accuracy of 0.705 and an AUC of 0.962. SHAP analysis showed that PLT, WBC, MCV, HGB, age and RBC were important feature parameters in both models.
Comparison with clinical diagnosis showed that both EnMod1-46 and EnMod2-12 outperformed junior hematologists, while EnMod1-46 was comparable to senior hematologists. Based on EnMod2-12, we developed a user-friendly diagnostic platform to facilitate risk assessment and improve access to accurate diagnosis. This study provides an efficient and accurate screening method for hematological disease, especially for resource-limited countries and regions.
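As a rough illustration of how the two reported metrics relate, the sketch below computes multiclass accuracy and a macro-averaged one-vs-rest AUC from predicted class probabilities. The abstract does not state how AUC was averaged across the 16 disease classes, so the macro one-vs-rest choice is an assumption. The function names, the three placeholder classes, and the toy probabilities are likewise hypothetical and are not study data.

```python
from typing import Dict, List, Sequence


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of samples whose predicted class equals the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def binary_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a positive outscores a negative (ties = 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def macro_ovr_auc(y_true: Sequence[str], proba: List[Dict[str, float]]) -> float:
    """Unweighted mean of one-vs-rest AUCs (assumed averaging scheme)."""
    classes = sorted(set(y_true))
    aucs = []
    for c in classes:
        labels = [1 if t == c else 0 for t in y_true]
        scores = [p[c] for p in proba]
        aucs.append(binary_auc(labels, scores))
    return sum(aucs) / len(aucs)


# Toy example with three placeholder classes (hypothetical, not study data).
y_true = ["A", "A", "B", "B", "C", "C"]
proba = [
    {"A": 0.7, "B": 0.2, "C": 0.1},
    {"A": 0.4, "B": 0.5, "C": 0.1},  # true class A, but B gets the top score
    {"A": 0.2, "B": 0.6, "C": 0.2},
    {"A": 0.3, "B": 0.4, "C": 0.3},
    {"A": 0.1, "B": 0.2, "C": 0.7},
    {"A": 0.2, "B": 0.3, "C": 0.5},
]
# The predicted class is the one with the highest probability.
y_pred = [max(p, key=p.get) for p in proba]

acc = accuracy(y_true, y_pred)
auc = macro_ovr_auc(y_true, proba)
print(f"accuracy={acc:.3f}, macro OvR AUC={auc:.3f}")
```

In this toy case one of six samples is misclassified, yet the true class still ranks above most other samples for that class, so the AUC stays higher than the accuracy. AUC rewards ranking quality, while accuracy counts only the top-scoring class.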
Keywords: Hematological disease, machine learning, prediction model, SHAP, 52 laboratory parameters
Received: 04 Apr 2025; Accepted: 07 Aug 2025.
Copyright: © 2025 Liu, Gou, Yang, Wang, Zhang, Wu, Liu, Tao, Tang, Yang, Chen, Wang, Feng, Zhang, Liu, Peng and Zhang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence:
Shuiqing Liu, Medical Center of Hematology, Xinqiao Hospital of Army Medical University, Chongqing, China
Xiangui Peng, Medical Center of Hematology, Xinqiao Hospital of Army Medical University, Chongqing, China
Xi Zhang, Medical Center of Hematology, Xinqiao Hospital of Army Medical University, Chongqing, China
Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.