AUTHOR=Yu Tao, Shen Runnan, You Guochang, Lv Lin, Kang Shimao, Wang Xiaoyan, Xu Jiatang, Zhu Dongxi, Xia Zuqi, Zheng Junmeng, Huang Kai

TITLE=Machine learning-based prediction of the post-thrombotic syndrome: Model development and validation study

JOURNAL=Frontiers in Cardiovascular Medicine

VOLUME=Volume 9 - 2022

YEAR=2022

URL=https://www.frontiersin.org/journals/cardiovascular-medicine/articles/10.3389/fcvm.2022.990788

DOI=10.3389/fcvm.2022.990788

ISSN=2297-055X

ABSTRACT=Background: Prevention plays a key role in reducing the incidence of post-thrombotic syndrome (PTS). We aimed to develop accurate models with machine learning (ML) algorithms to predict the occurrence of PTS within 24 months.
Methods: The clinical data used for model building were obtained from the ATTRACT study, and the external validation cohort was acquired from Sun Yat-sen Memorial Hospital in China. The main outcome was defined as the occurrence of a PTS event (Villalta score ≥5). Twenty-three clinical variables were included, and four ML algorithms were applied to build the models. Discrimination, calibration, and F score were used to evaluate the predictive ability of the models. The external validation cohort was divided into ten groups based on risk estimate deciles to identify the hazard threshold.
Results: A total of 555 patients with deep vein thrombosis (DVT) were included to build models with ML algorithms, and the models were further validated in a Chinese cohort of 117 patients. For predicting PTS within 2 years after acute DVT, logistic regression (LR) based on gradient descent and L1 regularization achieved the highest area under the curve (AUC) in external validation, 0.83 (95% CI: 0.76-0.89). Considering performance in both the training and external validation cohorts, the eXtreme gradient boosting (XGBoost) and gradient boosting decision tree (GBDT) models had similar results and showed better stability and generalization. The external validation cohort was divided into low-, intermediate-, and high-risk groups, with predicted probabilities of 0.3 and 0.4 as the cut-off points.
Conclusion: ML models built for PTS showed accurate predictive ability and stable generalization, which can facilitate clinical decision-making, with potentially important implications for selecting patients who will benefit from endovascular surgery.
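
Illustrative sketch (not the authors' code): the risk stratification described in the abstract can be pictured as follows. The cut-off values 0.3 and 0.4 come from the abstract; the function names, the handling of values exactly at a cut-off, and the rank-based way of forming ten groups are assumptions made only for this example.

```python
from typing import List

# Cut-off points reported in the abstract.
LOW_CUTOFF = 0.3
HIGH_CUTOFF = 0.4


def risk_group(probability: float) -> str:
    """Map a predicted PTS probability to a risk group.

    Assumption: a probability exactly at a cut-off falls into the
    higher group; the abstract does not specify boundary handling.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    if probability < LOW_CUTOFF:
        return "low"
    if probability < HIGH_CUTOFF:
        return "intermediate"
    return "high"


def decile_groups(probabilities: List[float]) -> List[int]:
    """Assign each patient to one of ten rank-based groups.

    Group 1 holds the lowest predicted risks and group 10 the highest.
    Assumption: groups are formed by rank, so sizes differ by at most
    one patient when the cohort size is not a multiple of ten.
    """
    n = len(probabilities)
    order = sorted(range(n), key=lambda i: probabilities[i])
    groups = [0] * n
    for rank, idx in enumerate(order):
        groups[idx] = rank * 10 // n + 1
    return groups


if __name__ == "__main__":
    # Hypothetical predicted probabilities, not data from the study.
    example = [0.12, 0.31, 0.45, 0.29, 0.40]
    print([risk_group(p) for p in example])
```

In such a sketch, inspecting the observed PTS rate within each decile group is one plausible way to choose where the cut-offs between low, intermediate, and high risk should fall.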