AUTHOR=Mohammadi Ida , Farahani Setayesh , Karimi Asal , Jahanian Saina , Firouzabadi Shahryar Rajai , Alinejadfard Mohammadreza , Fatemi Alireza , Hajikarimloo Bardia , Akhlaghpasand Mohammadhosein TITLE=Mortality prediction of heart transplantation using machine learning models: a systematic review and meta-analysis JOURNAL=Frontiers in Artificial Intelligence VOLUME=Volume 8 - 2025 YEAR=2025 URL=https://www.frontiersin.org/journals/artificial-intelligence/articles/10.3389/frai.2025.1551959 DOI=10.3389/frai.2025.1551959 ISSN=2624-8212 ABSTRACT=Introduction: Machine learning (ML) models have been increasingly applied to predict post-heart transplantation (HT) mortality, aiming to improve decision-making and optimize outcomes. This systematic review and meta-analysis evaluates the performance of ML algorithms in predicting mortality and explores factors contributing to model accuracy. Method: A systematic search of PubMed, Scopus, Web of Science, and Embase identified relevant studies, with 17 studies included in the review and 12 in the meta-analysis. The algorithms assessed included random forests, CatBoost, neural networks, and others. Model performance was evaluated using pooled area under the curve (AUC) values, with subgroup analyses for algorithm type, validation methods, and prediction timeframes. The risk of bias was assessed using the QUADAS-2 tool. Results: The pooled AUC of all ML algorithms was 0.65 (95% CI: 0.64, 0.67), with no significant difference between machine learning and deep learning models (p = 0.67). Among the algorithms, CatBoost demonstrated the highest accuracy (AUC 0.80, 95% CI: 0.74, 0.86), while K-nearest neighbor had the lowest accuracy (AUC 0.53, 95% CI: 0.50, 0.55). A meta-regression indicated improved model performance with longer post-transplant periods (p = 0.008). When pooling only the best-performing models, the AUC improved to 0.73 (95% CI: 0.68, 0.78).
The risk of bias was high in eight studies, with the flow and timing domains most commonly contributing to bias. Conclusion: ML models demonstrate moderate accuracy in predicting post-HT mortality, with CatBoost achieving the best performance. While ML shows potential for improving predictive precision, significant heterogeneity and biases highlight the need for standardized methods and further external validations to enhance clinical applicability. Systematic review registration: https://www.crd.york.ac.uk/PROSPERO/view/CRD42024509630, CRD42024509630