ORIGINAL RESEARCH article

Front. Big Data

Sec. Machine Learning and Artificial Intelligence

Volume 8 - 2025 | doi: 10.3389/fdata.2025.1605258

Predicting Deep Vein Thrombosis Using Machine Learning and Blood Routine Analysis

Provisionally accepted
Jie Su1, Yuechao Tang2, Yanan Wang3, Chao Chen3, Biao Song3*
  • 1Inner Mongolia Medical University, Hohhot, China
  • 2Baoding Second Central Hospital, Zhuozhou, Hebei Province, China
  • 3Medical Intelligent Diagnostics Big Data Research Institute, Hohhot, China

The final, formatted version of the article will be published soon.

Objective: Lower limb deep vein thrombosis (DVT) is a serious health problem that causes local discomfort and hinders walking. It can lead to severe complications, including pulmonary embolism, chronic post-thrombotic syndrome, and limb amputation, posing risks of death or severe disability. This study aims to develop a diagnostic model for DVT using routine blood analysis and to evaluate its effectiveness in early diagnosis.

Methods: This study retrospectively analyzed patient medical records from January 2022 to June 2023, including 658 DVT patients (case group) and 1418 healthy subjects (control group). SHAP (SHapley Additive exPlanations) analysis was employed for feature selection to identify the key blood indices with a significant impact on DVT risk prediction. Based on the selected features, six machine learning models were constructed: k-Nearest Neighbors (kNN), Logistic Regression (LR), Decision Tree (DT), Random Forest (RF), Support Vector Machine (SVM), and Artificial Neural Network (ANN). Model performance was assessed using the area under the curve (AUC).

Results: SHAP analysis identified ten key routine blood indices. The six models constructed using these indices demonstrated strong predictive performance, with AUC values exceeding 0.8 and accuracy, sensitivity, and specificity all above 70%. Notably, the RF model exhibited superior performance in assessing the risk of DVT.

Conclusions: Our study developed machine learning models for predicting DVT risk using routine blood tests. These models achieved high predictive performance, suggesting their potential for early DVT diagnosis without additional medical burden on patients. Future research will focus on further validation and refinement of these models to enhance their clinical applicability.
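The evaluation metrics named in the abstract can be illustrated with a minimal sketch. It assumes that AUC refers to the area under the receiver operating characteristic curve and that class predictions come from thresholding a model's risk score at 0.5. The abstract confirms neither detail. The function names `roc_auc` and `confusion_metrics` and the toy labels and scores are hypothetical and are not taken from the study.

```python
from typing import Dict, Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC computed as the fraction of (positive, negative) pairs in which
    the positive case receives the higher score; ties count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def confusion_metrics(labels: Sequence[int], scores: Sequence[float],
                      threshold: float = 0.5) -> Dict[str, float]:
    """Accuracy, sensitivity, and specificity at a fixed score threshold.
    The 0.5 default is an assumption, not a value reported in the abstract."""
    tp = fp = tn = fn = 0
    for y, s in zip(labels, scores):
        predicted = 1 if s >= threshold else 0
        if y == 1 and predicted == 1:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted == 1:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative label")
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),  # true positive rate (DVT cases found)
        "specificity": tn / (tn + fp),  # true negative rate (controls cleared)
    }


# Hypothetical toy data: 1 = DVT case, 0 = control; scores are model outputs.
labels = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]

auc = roc_auc(labels, scores)
metrics = confusion_metrics(labels, scores)
print(f"AUC={auc:.4f}", metrics)
```

AUC is independent of any threshold, whereas accuracy, sensitivity, and specificity all change with the chosen cut-off, which is why the two kinds of metric are usually reported together.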

Keywords: deep vein thrombosis, machine learning, blood routine, prediction model, SHAP analysis

Received: 09 Apr 2025; Accepted: 18 Sep 2025.

Copyright: © 2025 Su, Tang, Wang, Chen and Song. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Biao Song, songbiao_511@163.com

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.