Disease classification via interpretable machine learning based on multi-center routine coagulation test

Dong, Feng; Zhang, Yaqiong; Chen, Weibu; Wang, Changmin; Zhang, Lei; Gao, Xiaoling; Zhang, Xiaoli; Jiang, Minghua; Xu, Guobin; Yang, Ruichuang; Hou, Yutong; Ma, Jiandang; Li, Zhuanbao; Wu, Jun

doi:10.3389/fmolb.2026.1788536

ORIGINAL RESEARCH article

Front. Mol. Biosci.

Sec. Molecular Diagnostics and Therapeutics

This article is part of the Research Topic: Transforming Chronic Disease Treatment with AI and Big Data, Volume II

Provisionally accepted

Feng Dong¹

Yaqiong Zhang²

Weibu Chen³

Changmin Wang⁴

Lei Zhang⁵

Xiaoling Gao⁶

Xiaoli Zhang⁷

Minghua Jiang⁸

Guobin Xu⁹

Ruichuang Yang¹⁰

Yutong Hou¹¹*

Jiandang Ma¹²

Zhuanbao Li¹³

Jun Wu¹

¹Beijing Jishuitan Hospital Affiliated to Capital Medical University, Beijing, China
²Taizhou Central Hospital, Taizhou, China
³Shenzhen People's Hospital, Shenzhen, China
⁴People's Hospital of Xinjiang Uygur Autonomous Region, Urumqi, China
⁵Second Affiliated Hospital of Xi'an Jiaotong University, Xi'an, China
⁶Hainan General Hospital, Haikou, China
⁷The Affiliated Yongchuan Hospital of Chongqing Medical University, Chongqing, China
⁸The Second Affiliated Hospital and Yuying Children's Hospital of Wenzhou Medical University, Wenzhou, China
⁹Beijing Cancer Hospital, Beijing, China
¹⁰Shenzhen Mindray Bio-Medical Electronics Co Ltd, Shenzhen, China
¹¹Beijing Jiaotong University, Beijing, China
¹²Luoyang Central Hospital Affiliated to Zhengzhou University, Luoyang, China
¹³Beijing Hospital, Beijing, China

The final, formatted version of the article will be published soon.

Background: This study aims to build an interpretable machine learning model for disease classification, and to identify key disease-related features, from multi-center CX9000 routine coagulation tests in order to assist clinical diagnosis.

Methods: Data were collected from 11 hospitals. An unsupervised clustering model was used to extract classification patterns, and clinical experts assigned disease labels. Multiple machine learning models, including Random Forest, SVM, Decision Tree, Naive Bayes, MLP, XGBoost, and LightGBM, were trained. Ten-fold cross-validation and external validation were performed. For external validation, models were trained on data from 8 hospitals (~90%) and tested on the remaining 2 hospitals (~10%). SHAP and Decision Tree analysis were used for interpretability.

Results: Clear clustering patterns were observed for valvular heart disease (VHD) and pulmonary infection (PI). LightGBM achieved the best performance in both tasks. In cross-validation, the mean F1-scores were 0.8890 and 0.7233, and the mean AUCs were 0.9500 and 0.8023. External validation showed strong generalization, with mean F1-scores of 0.9259 and 0.7464 and mean AUCs of 0.9493 and 0.8297. Sample visualization by t-SNE, together with interpretability analysis by SHAP and Decision Trees, identified key classification features: international normalized ratio (INR) for VHD and age for PI.

Conclusion: Machine learning models based on multi-center coagulation tests provide effective and interpretable disease classification, supporting clinical diagnostic automation.
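The external validation described in the Methods holds out whole hospitals rather than random samples, so no center contributes to both training and testing. The sketch below illustrates that idea, plus the F1-score used to report results, in plain Python. It is not the authors' code: the record fields (`hospital`, `feature`, `label`), the hospital identifiers, the synthetic toy data, and the threshold rule standing in for a trained classifier are all assumptions made for illustration.

```python
def split_by_hospital(records, test_hospitals):
    """Hold out every record from the listed hospitals as the external test set."""
    held_out = set(test_hospitals)
    train = [r for r in records if r["hospital"] not in held_out]
    test = [r for r in records if r["hospital"] in held_out]
    return train, test


def f1_score(y_true, y_pred, positive=1):
    """F1 = harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Synthetic toy data (hypothetical): ten centers with two records each.
records = [
    {"hospital": f"H{i:02d}", "feature": value, "label": label}
    for i in range(1, 11)
    for value, label in [(0.9, 1), (0.2, 0)]
]

# Train on eight centers, test on the two held-out centers.
train, test = split_by_hospital(records, ["H09", "H10"])

# Stand-in for a trained model (hypothetical): a fixed threshold on one feature.
def predict(record):
    return int(record["feature"] > 0.5)

y_true = [r["label"] for r in test]
y_pred = [predict(r) for r in test]
external_f1 = f1_score(y_true, y_pred)
```

Splitting by center rather than by sample is what lets the held-out score say something about performance at a hospital the model has never seen. In practice the threshold rule would be replaced by a fitted classifier such as LightGBM.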

Keywords: Disease classification, Interpretability analysis, machine learning, multi-center coagulation test, SHapley Additive exPlanations

Received: 15 Jan 2026; Accepted: 11 Feb 2026.

Copyright: © 2026 Dong, Zhang, Chen, Wang, Zhang, Gao, Zhang, Jiang, Xu, Yang, Hou, Ma, Li and Wu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Yutong Hou

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.