ORIGINAL RESEARCH article
Front. Mol. Biosci.
Sec. Molecular Diagnostics and Therapeutics
This article is part of the Research Topic: Transforming Chronic Disease Treatment with AI and Big Data, Volume II
Disease classification via interpretable machine learning based on multi-center routine coagulation test
Provisionally accepted
- 1Beijing Jishuitan Hospital Affiliated to Capital Medical University, Beijing, China
- 2Taizhou Central Hospital, Taizhou, China
- 3Shenzhen People's Hospital, Shenzhen, China
- 4People's Hospital of Xinjiang Uygur Autonomous Region, Urumqi, China
- 5Second Affiliated Hospital of Xi'an Jiaotong University, Xi'an, China
- 6Hainan General Hospital, Haikou, China
- 7The Affiliated Yongchuan Hospital of Chongqing Medical University, Chongqing, China
- 8The Second Affiliated Hospital and Yuying Children's Hospital of Wenzhou Medical University, Wenzhou, China
- 9Beijing Cancer Hospital, Beijing, China
- 10Shenzhen Mindray Bio-Medical Electronics Co Ltd, Shenzhen, China
- 11Beijing Jiaotong University, Beijing, China
- 12Luoyang Central Hospital Affiliated to Zhengzhou University, Luoyang, China
- 13Beijing Hospital, Beijing, China
Background: This study aims to establish an interpretable machine learning model for disease classification based on multi-center CX9000 routine coagulation tests, and to identify key disease-related features that can assist clinical diagnosis. Methods: Data from 11 hospitals were collected. An unsupervised clustering model was used to extract classification patterns, and clinical experts assigned disease labels. Multiple machine learning models, including Random Forest, SVM, Decision Tree, Naive Bayes, MLP, XGBoost, and LightGBM, were trained. Ten-fold cross-validation and external validation were performed. For external validation, models were trained with data from 8 hospitals (~90%) and tested on the remaining 2 hospitals (~10%). SHAP and Decision Tree analysis were used for interpretability. Results: Clear clustering patterns were observed for valvular heart disease (VHD) and pulmonary infection (PI). LightGBM achieved the best performance in both tasks. In cross-validation, the mean F1-scores were 0.8890 and 0.7233, and the mean AUCs were 0.9500 and 0.8023. External validation showed strong generalization, with mean F1-scores of 0.9259 and 0.7464 and mean AUCs of 0.9493 and 0.8297. Sample visualization by t-SNE and interpretability analysis by SHAP and Decision Trees identified key classification features, namely international normalized ratio (INR) for VHD and age for PI. Conclusion: Machine learning models based on multi-center coagulation tests provide effective and interpretable disease classification, supporting clinical diagnostic automation.
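The external validation described in the Methods holds out entire hospitals rather than random samples, so no site contributes to both training and testing. The following is a minimal sketch of that idea in plain Python, not the authors' pipeline: the records are synthetic toy values, the hospital codes (`H1` to `H4`) are hypothetical, and the INR cutoff of 1.5 is an arbitrary stand-in for a trained classifier with no clinical meaning.

```python
def split_by_hospital(records, held_out):
    """Hold out whole hospitals so no site appears in both sets."""
    held_out = set(held_out)
    train = [r for r in records if r["hospital"] not in held_out]
    test = [r for r in records if r["hospital"] in held_out]
    return train, test


def f1_score(y_true, y_pred, positive=1):
    """F1 = harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Synthetic toy records (NOT study data): label 1 = disease, 0 = control.
records = [
    {"hospital": "H1", "inr": 2.6, "label": 1},
    {"hospital": "H1", "inr": 1.0, "label": 0},
    {"hospital": "H2", "inr": 3.1, "label": 1},
    {"hospital": "H2", "inr": 1.1, "label": 0},
    {"hospital": "H3", "inr": 0.9, "label": 0},
    {"hospital": "H3", "inr": 2.4, "label": 1},
    {"hospital": "H4", "inr": 1.2, "label": 1},
    {"hospital": "H4", "inr": 1.0, "label": 0},
    {"hospital": "H4", "inr": 2.8, "label": 1},
    {"hospital": "H4", "inr": 1.9, "label": 0},
]

# Hold out one hypothetical hospital as the external test set.
train, test = split_by_hospital(records, held_out=["H4"])

# Stand-in "model": an arbitrary INR threshold (assumption, not clinical).
INR_CUTOFF = 1.5
y_true = [r["label"] for r in test]
y_pred = [1 if r["inr"] > INR_CUTOFF else 0 for r in test]

external_f1 = f1_score(y_true, y_pred)
print(f"train={len(train)} test={len(test)} external F1={external_f1:.2f}")
```

Splitting by site rather than by sample is what makes the score an estimate of generalization to unseen hospitals; a random split would let site-specific patterns leak into the test set.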
Keywords: Disease classification, Interpretability analysis, machine learning, multi-center coagulation test, SHapley Additive exPlanations
Received: 15 Jan 2026; Accepted: 11 Feb 2026.
Copyright: © 2026 Dong, Zhang, Chen, Wang, Zhang, Gao, Zhang, Jiang, Xu, Yang, Hou, Ma, Li and Wu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence: Yutong Hou
Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.
