AUTHOR=Dykstra Steven, Satriano Alessandro, Cornhill Aidan K., Lei Lucy Y., Labib Dina, Mikami Yoko, Flewitt Jacqueline, Rivest Sandra, Sandonato Rosa, Feuchter Patricia, Howarth Andrew G., Lydell Carmen P., Fine Nowell M., Exner Derek V., Morillo Carlos A., Wilton Stephen B., Gavrilova Marina L., White James A.
TITLE=Machine learning prediction of atrial fibrillation in cardiovascular patients using cardiac magnetic resonance and electronic health information
JOURNAL=Frontiers in Cardiovascular Medicine
VOLUME=Volume 9 - 2022
YEAR=2022
URL=https://www.frontiersin.org/journals/cardiovascular-medicine/articles/10.3389/fcvm.2022.998558
DOI=10.3389/fcvm.2022.998558
ISSN=2297-055X
ABSTRACT=Background: Atrial fibrillation (AF) is a commonly encountered cardiac arrhythmia associated with morbidity and substantial healthcare costs. While patients with cardiovascular disease experience the greatest risk of new-onset AF, no risk model has been developed to predict AF occurrence in this population. We hypothesized that a patient-specific model could be delivered using cardiovascular magnetic resonance (CMR) disease phenotyping, contextual patient health information, and machine learning.
Methods: A total of 9,448 patients referred for CMR imaging were enrolled and followed over a 5-year period. Of these, 7,639 had no prior history of AF and were eligible to train and validate machine learning algorithms. Random survival forests (RSFs) were used to predict new-onset AF and were compared to Cox proportional-hazard (CPH) models. The best-performing features were identified from 115 variables sourced from three data domains: i) CMR-based disease phenotype, ii) patient health questionnaire, and iii) electronic health records. We evaluated the discriminative performance of optimized models using the C-index and time-dependent AUC (tAUC).
Results: An RSF-based model of 20 variables (CIROC-AF-20) delivered an overall C-index of 0.78 for the prediction of new-onset AF, with respective tAUCs of 0.80, 0.79, and 0.78 at 1, 2, and 3 years. This outperformed a novel CPH-based model and historic AF risk scores. At 1 year of follow-up, validation cohort patients classified by CIROC-AF-20 as high risk for future AF went on to experience a 17.3% incidence of new-onset AF, a 24.7-fold higher risk than low-risk patients.
Conclusions: Using phenotypic data available at the time of CMR imaging, we developed and validated the first described risk model for the prediction of new-onset AF in patients with cardiovascular disease. Complementary value was provided by variables from patient-reported measures of health and the electronic health record, illustrating the value of multi-domain phenotypic data for the prediction of AF.
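
Illustrative note: the C-index reported above measures how often a model ranks patients in the correct order of risk. The sketch below shows one common definition (Harrell's C) in plain Python. It is not the authors' implementation; the function name `harrell_c_index` and the toy data are hypothetical, and pairs with tied event times are simply skipped, which is one of several possible conventions.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored survival data.

    times       -- follow-up time for each patient
    events      -- 1 if the event (e.g. new-onset AF) was observed, 0 if censored
    risk_scores -- model output, where a higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a pair needs an observed event at the earlier time
        for j in range(n):
            # Patient i had the event strictly before patient j's last follow-up,
            # so the pair can be ordered (j may be an event or censored later).
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0   # earlier event was given the higher risk
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5   # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs in the data")
    return concordant / comparable


# Hypothetical toy cohort of five patients (not data from the study).
toy_times = [1, 2, 3, 4, 5]
toy_events = [1, 0, 1, 1, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.1]
toy_c = harrell_c_index(toy_times, toy_events, toy_scores)
print(round(toy_c, 3))  # 6 of 7 comparable pairs are concordant -> 0.857
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so a C-index of 0.78 means that roughly 78% of comparable patient pairs were ordered correctly.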