AUTHOR=Yordanov Tsvetan R., Ravelli Anita C. J., Amiri Saba, Vis Marije, Houterman Saskia, Van der Voort Sebastian R., Abu-Hanna Ameen

TITLE=Performance of federated learning-based models in the Dutch TAVI population was comparable to central strategies and outperformed local strategies

JOURNAL=Frontiers in Cardiovascular Medicine

VOLUME=Volume 11 - 2024

YEAR=2024

URL=https://www.frontiersin.org/journals/cardiovascular-medicine/articles/10.3389/fcvm.2024.1399138

DOI=10.3389/fcvm.2024.1399138

ISSN=2297-055X

ABSTRACT=Background: Federated Learning (FL) is a technique for learning prediction models without sharing records between hospitals. Compared to centralized training approaches, adoption of FL could negatively impact model performance. Aim: Evaluate four types of multicenter model development strategies in predicting 30-day mortality for Transcatheter Aortic Valve Implantation (TAVI) patients: 1) central: learning one model from a centralized dataset of all hospitals, 2) local: learning one model per hospital, 3) FedAvg: averaging of local model coefficients, and 4) ensemble: aggregating local-model predictions. Methods: Data were used from all 16 Dutch TAVI hospitals from 2013-2021 in the Netherlands Heart Registration (NHR). All approaches were internally validated. For the central and federated approaches, external geographic validation was also performed. Predictive performance was measured in terms of discrimination (AUC) and calibration (intercept, slope, and calibration graph). Results: The dataset comprised 16,661 TAVI records with a 30-day mortality rate of 3.4%. In internal validation, AUCs of the central, local, FedAvg, and ensemble models were 0.68, 0.65, 0.67, and 0.67, respectively. Central and local models were miscalibrated by slope, while FedAvg and ensemble models were miscalibrated by intercept. During external geographic validation, central, FedAvg, and ensemble all achieved a mean AUC of 0.68. Miscalibration was observed for central, FedAvg, and ensemble models in 44%, 44%, and 38% of the hospitals, respectively. Conclusion: Compared to centralized training approaches, FL techniques like FedAvg and ensemble demonstrated comparable AUC and calibration. Use of FL techniques should be considered as a viable option for clinical prediction model development.
Model strategy (FedAvg and Ensemble are the Federated Learning strategies)

Aspect | Central | Local | FedAvg, no recalibration | FedAvg, recalibration | Ensemble, no recalibration | Ensemble, recalibration
Sharing: predictor data | Yes (by design) | No | No | No | No | No
Sharing: outcome data | Yes (by design) | No | No | Yes | No | Yes
Sharing: model parameters | Yes (by design) | No | Yes | Yes | No | No
Sharing: predictions | Yes (by design) | No | No | Yes | Yes | Yes
Optional: other models parameters | Central imputation | No | Local imputation | Local imputation; Central recalibration | Local imputation | Local imputation; Central recalibration
Calibration | Yes (by design) | Yes (by design, per center) | | | |
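
The two federated strategies named in the abstract, FedAvg (averaging of local model coefficients) and ensemble (aggregating local-model predictions), can be illustrated with a minimal sketch. It assumes logistic-regression models whose first coefficient is the intercept, and it assumes that both averages are weighted by the number of records per hospital; the abstract does not state the model family or the weighting scheme, so both are illustrative choices. All function names and the coefficient values are hypothetical.

```python
import math


def predict(coefs, x):
    """Predicted probability from a logistic model; coefs[0] is the intercept."""
    linear = coefs[0] + sum(b * v for b, v in zip(coefs[1:], x))
    return 1.0 / (1.0 + math.exp(-linear))


def fedavg(local_coefs, n_records):
    """Average local coefficients, weighted by hospital size (assumed weighting)."""
    total = sum(n_records)
    n_params = len(local_coefs[0])
    return [
        sum(c[j] * n for c, n in zip(local_coefs, n_records)) / total
        for j in range(n_params)
    ]


def ensemble_predict(local_coefs, n_records, x):
    """Average local-model predictions, weighted by hospital size (assumed weighting)."""
    total = sum(n_records)
    return sum(predict(c, x) * n for c, n in zip(local_coefs, n_records)) / total


if __name__ == "__main__":
    # Hypothetical coefficients [intercept, one predictor] from two hospitals.
    hospital_coefs = [[-3.0, 0.5], [-3.4, 0.3]]
    hospital_sizes = [1000, 3000]
    patient = [1.0]

    global_coefs = fedavg(hospital_coefs, hospital_sizes)
    print("FedAvg coefficients:", global_coefs)
    print("FedAvg prediction:", predict(global_coefs, patient))
    print("Ensemble prediction:", ensemble_predict(hospital_coefs, hospital_sizes, patient))
```

In this sketch FedAvg exchanges only model parameters, while the ensemble exchanges only predictions, which matches the "no recalibration" columns of the sharing table. The two strategies generally give different probabilities because the logistic function is nonlinear.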