AUTHOR=Sun Zhoujian, Dong Wei, Shi Hanrui, Ma Hong, Cheng Lechao, Huang Zhengxing

TITLE=Comparing Machine Learning Models and Statistical Models for Predicting Heart Failure Events: A Systematic Review and Meta-Analysis

JOURNAL=Frontiers in Cardiovascular Medicine

VOLUME=Volume 9 - 2022

YEAR=2022

URL=https://www.frontiersin.org/journals/cardiovascular-medicine/articles/10.3389/fcvm.2022.812276

DOI=10.3389/fcvm.2022.812276

ISSN=2297-055X

ABSTRACT=Objective: To compare the performance, clinical feasibility, and reliability of statistical and machine learning (ML) models in predicting heart failure (HF) events.
Background: Although ML models have been proposed to revolutionize medicine, their promise in predicting HF events has not been investigated in detail.
Methods: A systematic search was performed on Medline, Web of Science, and IEEE Xplore for studies published between January 1, 2011 and July 14, 2021 that developed or validated at least one statistical or ML model for predicting all-cause mortality or all-cause readmission of HF patients. The Prediction Model Risk of Bias Assessment Tool (PROBAST) was used to assess the risk of bias, and a random-effects model was used to pool the c-statistics of the included models.
Results: Two hundred and two statistical model studies and 78 ML model studies were included from the retrieved papers. The pooled c-statistics were 0.733 (95% confidence interval 0.724-0.742) for statistical models predicting all-cause mortality, 0.777 (0.752-0.803) for ML models predicting all-cause mortality, 0.678 (0.651-0.706) for statistical models predicting all-cause readmission, and 0.660 (0.633-0.686) for ML models predicting all-cause readmission, indicating that ML models were not consistently superior to statistical models. The head-to-head comparison gave similar results. Meanwhile, the excessive number of predictors used by ML models limited their clinical feasibility. The risk of bias analysis indicated that technical pitfalls were more serious in ML models than in statistical models. Furthermore, the efficacy of ML models in different HF subgroups remains unclear.
Conclusions: ML models did not achieve a significant advantage over statistical models in predicting HF events, and their clinical feasibility and reliability were worse.
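
The abstract states that a random-effects model was used to pool c-statistics but does not say which estimator was applied. As an illustration only, the sketch below assumes the DerSimonian-Laird estimator and pools on the raw c-statistic scale (many meta-analyses pool on the logit scale instead). The function names `se_from_ci` and `pool_random_effects` are hypothetical, and any inputs passed to them are placeholders rather than data from the review.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% interval


def se_from_ci(lower, upper):
    """Recover a standard error from a reported 95% confidence interval."""
    return (upper - lower) / (2 * Z_95)


def pool_random_effects(estimates, std_errors):
    """Pool per-study estimates with a DerSimonian-Laird random-effects model.

    Returns (pooled, ci_lower, ci_upper, tau2). The estimator choice is an
    assumption; the source does not name one.
    """
    k = len(estimates)
    variances = [se ** 2 for se in std_errors]

    # Fixed-effect (inverse-variance) weights and pooled mean.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum_w

    # Cochran's Q and the method-of-moments between-study variance.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0

    # Random-effects weights add tau2 to every within-study variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    sum_w_star = sum(w_star)
    pooled = sum(wi * yi for wi, yi in zip(w_star, estimates)) / sum_w_star
    se = math.sqrt(1.0 / sum_w_star)
    return pooled, pooled - Z_95 * se, pooled + Z_95 * se, tau2
```

When the studies disagree more than their standard errors would predict, `tau2` becomes positive, the weights even out across studies, and the pooled interval widens.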