
ORIGINAL RESEARCH article

Front. Mol. Biosci.

Sec. Molecular Diagnostics and Therapeutics

This article is part of the Research Topic: Transforming Chronic Disease Treatment with AI and Big Data, Volume II

Disease classification via interpretable machine learning based on multi-center routine coagulation test

Provisionally accepted
Feng Dong1, Yaqiong Zhang2, Weibu Chen3, Changmin Wang4, Lei Zhang5, Xiaoling Gao6, Xiaoli Zhang7, Minghua Jiang8, Guobin Xu9, Ruichuang Yang10, Yutong Hou11*, Jiandang Ma12, Zhuanbao Li13, Jun Wu1
  • 1Beijing Jishuitan Hospital Affiliated to Capital Medical University, Beijing, China
  • 2Taizhou Central Hospital, Taizhou, China
  • 3Shenzhen People's Hospital, Shenzhen, China
  • 4People's Hospital of Xinjiang Uygur Autonomous Region, Urumqi, China
  • 5Second Affiliated Hospital of Xi'an Jiaotong University, Xi'an, China
  • 6Hainan General Hospital, Haikou, China
  • 7The Affiliated Yongchuan Hospital of Chongqing Medical University, Chongqing, China
  • 8The Second Affiliated Hospital and Yuying Children's Hospital of Wenzhou Medical University, Wenzhou, China
  • 9Beijing Cancer Hospital, Beijing, China
  • 10Shenzhen Mindray Bio-Medical Electronics Co Ltd, Shenzhen, China
  • 11Beijing Jiaotong University, Beijing, China
  • 12Luoyang Central Hospital Affiliated to Zhengzhou University, Luoyang, China
  • 13Beijing Hospital, Beijing, China

The final, formatted version of the article will be published soon.

Background: This study aims to build an interpretable machine learning model for disease classification from multi-center CX9000 routine coagulation tests, and to identify the key disease-related features, in order to assist clinical diagnosis.

Methods: Data from 11 hospitals were collected. An unsupervised clustering model was used to extract classification patterns, and clinical experts assigned disease labels. Multiple machine learning models, including Random Forest, SVM, Decision Tree, Naive Bayes, MLP, XGBoost, and LightGBM, were trained. Ten-fold cross validation and external validation were performed. For external validation, models were trained with data from 8 hospitals (~90%) and tested on the remaining 2 hospitals (~10%). SHAP and Decision Tree analysis were used for interpretability.

Results: Clear clustering patterns were observed for valvular heart disease (VHD) and pulmonary infection (PI). LightGBM achieved the best performance in both tasks. In cross validation, the mean F1-scores were 0.8890 and 0.7233, and the mean AUCs were 0.9500 and 0.8023. External validation showed strong generalization, with mean F1-scores of 0.9259 and 0.7464 and mean AUCs of 0.9493 and 0.8297. Sample visualization by t-SNE and interpretability analysis by SHAP and Decision Trees identified key classification features, namely international normalized ratio (INR) for VHD and age for PI.

Conclusion: Machine learning models based on multi-center coagulation tests provide effective and interpretable disease classification, supporting clinical diagnostic automation.
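The hospital-level external validation described in the Methods can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names `split_by_hospital` and `f1_score`, the record layout, and the hospital identifiers `H0` to `H9` are all hypothetical, and the toy records carry no real measurements. The sketch only shows the idea of holding out whole hospitals, so that no site contributes to both training and testing, and how an F1-score is computed from binary predictions.

```python
import random


def split_by_hospital(records, n_test_hospitals=2, seed=0):
    """Hold out whole hospitals so train and test share no site.

    `records` is a list of dicts with a "hospital" key (assumed layout).
    Returns (train_records, test_records, held_out_hospitals).
    """
    hospitals = sorted({r["hospital"] for r in records})
    rng = random.Random(seed)
    held_out = set(rng.sample(hospitals, n_test_hospitals))
    train = [r for r in records if r["hospital"] not in held_out]
    test = [r for r in records if r["hospital"] in held_out]
    return train, test, held_out


def f1_score(y_true, y_pred):
    """Binary F1: harmonic mean of precision and recall for class 1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: define F1 as 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy usage: 10 hypothetical hospitals with 3 placeholder records each.
records = [
    {"hospital": f"H{i}", "label": i % 2}
    for i in range(10)
    for _ in range(3)
]
train, test, held_out = split_by_hospital(records, n_test_hospitals=2, seed=1)
print(len(train), len(test), sorted(held_out))
```

Splitting by site rather than by individual sample is what makes the test set "external": a model is then scored only on hospitals it never saw during training. In practice a trained classifier's predictions on `test` would be passed to `f1_score`; library implementations of grouped splitting and of F1 exist and would normally be preferred over these hand-written helpers.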

Keywords: Disease classification, Interpretability analysis, machine learning, multi-center coagulation test, SHapley Additive exPlanations

Received: 15 Jan 2026; Accepted: 11 Feb 2026.

Copyright: © 2026 Dong, Zhang, Chen, Wang, Zhang, Gao, Zhang, Jiang, Xu, Yang, Hou, Ma, Li and Wu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Yutong Hou

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.