Electronic health record-wide association study for atrial fibrillation in a British cohort

Background Atrial fibrillation (AF) confers a major healthcare burden from hospitalisations and AF-related complications, such as stroke and heart failure. We performed an electronic health records-wide association study to identify the most frequent reasons for healthcare utilization, pre and post new-onset AF. Methods Prospective cohort study with the linked electronic health records of 5.6 million patients in the United Kingdom Clinical Practice Research Datalink (1998–2016). A cohort study with AF patients and their age- and sex-matched controls was implemented to compare the top 100 reasons for frequent hospitalisation and primary care consultation. Results Of the 199,433 patients who developed AF, we found the most frequent healthcare interactions to be cardiac, cerebrovascular and peripheral-vascular conditions, both prior to AF diagnosis (41/100 conditions in secondary care, such as cerebral infarction and valve diseases; and 33/100 conditions in primary care), and subsequently (47/100 conditions in hospital care and 48/100 conditions in primary care). There was a high representation of repeated visits for cancer and infection affecting multiple organ systems. We identified 10 novel conditions which have not yet been associated with AF: folic acid deficiency, pancytopenia, idiopathic thrombocytopenic purpura, seborrheic dermatitis, lymphoedema, angioedema, laryngopharyngeal reflux, rib fracture, haemorrhagic gastritis, inflammatory polyneuropathies. Conclusion Our nationwide data provide knowledge and better understanding of the clinical needs of AF patients suggesting: (i) groups at higher risk of AF, where screening may be more cost-effective, and (ii) potential complications developing following new-onset AF that can be prevented through implementation of comprehensive integrated care management and more personalised, tailored treatment. Clinical trial registration NCT04786366


Background
Atrial fibrillation (AF) is the most frequent sustained cardiac arrhythmia worldwide (1). AF is a clinically heterogeneous condition that can have multiple distinct presentations, ranging from an absence of symptoms to palpitations, and from the development of heart failure or stroke to other cardiovascular and non-cardiovascular complications (2). Identification of disease associations or risk factors can be done following a hypothesis-driven approach based on the understanding of the disease pathophysiology, or through a "hypothesis-free" method, which may be advantageous in cases where there is an incomplete understanding of the pathophysiology, for example, genome-wide association studies (GWAS) to identify novel associations and involved pathways (3).
Similarly to GWAS, electronic health records (EHR) have been previously used for such an approach (4), and could be used for further characterizing the clinical heterogeneity of AF in the UK population through the identification of disease associations. This is important given that AF confers a major healthcare burden from hospitalisations and AF-related complications, such as stroke and heart failure.
We therefore conducted an EHR-wide association study to investigate the most frequent reasons for healthcare interactions pre- and post-AF diagnosis, as compared to individuals without AF, in both primary and secondary care (i.e., primary care consultations and hospitalizations). The findings of the study would provide a better understanding of additional healthcare utilization required in individuals prone to AF and may offer opportunities for screening and potentially preventing or delaying the development of the arrhythmia. Meanwhile, frequent clinical visits following an AF diagnosis help identify the progression of the disease and may be used to define AF clinical sub-phenotypes in routine care, whereas groups of AF patients with specific health service interaction patterns may benefit from tailored holistic or integrated care management approaches.

Methods
The Clinical Practice Research Datalink (CPRD) was established in 1987 (5) and as of 2018 includes 7,998,501 patients in the UK with linked data of primary care consultation, hospital data (Hospital Episodes Statistics, HES), national cancer registry (National Cancer Intelligence Network) and death registry data (Office for National Statistics, ONS) (6, 7). The data are generally representative of the age, gender and geographic distribution of the UK population (5), and showed high quality and completeness of clinical information recorded (6–8). The present study was approved by the Medicines and Healthcare products Regulatory Agency Independent Scientific Advisory Committee [17_205].
Our cohort was composed of individuals aged 18 years or older registered in the current primary care practice for at least one year. The study period was between 1 January 1998 and 31 May 2016, and individuals were excluded if they had a prior history of AF before study entry. AF was defined from the International Classification of Diseases (ICD), tenth revision as I48 from HES and Read codes G573400, G573500, 3272.00, G573000, G573300, G573.00, G573z00 from CPRD.
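As an illustration only, the case definition above can be written as a small lookup. The sketch below is in Python rather than the SAS and R used for the study; the function name `is_af_code`, the source labels "HES" and "CPRD", and the prefix matching of ICD-10 subcodes (such as I48.0) are assumptions and not the study's implementation.

```python
# Hypothetical sketch: flag AF records using the code lists quoted in the Methods.
ICD10_AF_PREFIX = "I48"  # ICD-10 code for AF, as recorded in HES
READ_AF_CODES = frozenset({
    "G573400", "G573500", "3272.00", "G573000",
    "G573300", "G573.00", "G573z00",
})


def is_af_code(code: str, source: str) -> bool:
    """Return True if `code` denotes AF in the given source ("HES" or "CPRD")."""
    code = code.strip()
    if source == "HES":
        # Assumption: any subcode under I48 (e.g. I48.0) counts as AF.
        return code.upper().startswith(ICD10_AF_PREFIX)
    if source == "CPRD":
        # Read codes are case-sensitive (e.g. G573z00), so match exactly.
        return code in READ_AF_CODES
    raise ValueError(f"unknown source: {source!r}")
```

Exact matching is used for Read codes because the listed codes mix upper and lower case; prefix matching is used only for ICD-10, where the text names the three-character category.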
We implemented a matched case-control study for investigating the most common problems and comorbidities of AF patients when compared to controls. The primary diagnoses at general practice (GP) consultation and hospitalisation within five years before and after the diagnosis date in individuals with AF were compared with those of their age- and sex-matched controls. For each AF patient, the most frequent GP visits recorded in CPRD and primary diagnosis for hospital admissions documented in HES were identified within five years before and after the initial AF diagnosis. Similarly, we summarised the top conditions for most frequent clinical visits pre and post-index date in matched controls.
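The windowing step described above can be sketched as follows. This is a minimal Python illustration under stated assumptions: the function names are hypothetical, five years is approximated as 1,826 days, and visits on the index date itself are excluded; the text does not specify either choice.

```python
from collections import Counter
from datetime import date, timedelta

# Assumption: a five-year window is approximated as 1,826 days.
WINDOW = timedelta(days=1826)


def _top(counter):
    """Most frequent key in a Counter, or None if it is empty."""
    return counter.most_common(1)[0][0] if counter else None


def most_frequent_reasons(records, index_date):
    """Return (pre, post): the most frequent diagnosis within five years
    before and within five years after `index_date`.

    `records` is an iterable of (visit_date, diagnosis) tuples.
    Visits on the index date itself are ignored (an assumption).
    """
    pre, post = Counter(), Counter()
    for visit_date, diagnosis in records:
        if index_date - WINDOW <= visit_date < index_date:
            pre[diagnosis] += 1
        elif index_date < visit_date <= index_date + WINDOW:
            post[diagnosis] += 1
    return _top(pre), _top(post)
```

The same function would be applied to each matched control using that control's index date, so that cases and controls are summarised over comparable windows.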
We reported the frequencies (%) of the conditions as the most frequent reasons for GP visits and hospital admissions in AF patients and their matched controls. We then summarised the differences between individuals with and without AF by relative frequency (frequency ratio) and reported the leading 100 conditions with descending frequency ratios. The uncertainty of the ratios was estimated with bootstrap distributions based on 2,000 samples with the Balanced Bootstrap Resampling method. We reported the clinical conditions requiring hospitalisation by disease groups (Table 1). We performed the analyses in Statistical Analysis System (version 9.4) and R (version 3.6.1).
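To make the frequency ratio and its bootstrap uncertainty concrete, the following Python sketch computes the ratio of a condition's frequency in cases to its frequency in controls, and draws balanced bootstrap resamples, in which every observation appears exactly the same number of times across all resamples. It is not the study's SAS or R code. Resampling matched pairs, the percentile interval, and all function names are assumptions made for illustration.

```python
import random


def frequency_ratio(pairs):
    """Condition frequency in cases divided by frequency in controls.

    `pairs` holds one (case_flag, control_flag) tuple of 0/1 values
    per matched pair; both groups therefore share the same denominator.
    """
    cases = sum(c for c, _ in pairs)
    controls = sum(k for _, k in pairs)
    return cases / controls if controls else float("nan")


def balanced_bootstrap_indices(n, n_boot, rng):
    """Yield n_boot index samples of size n such that each index
    occurs exactly n_boot times over all samples combined."""
    pool = list(range(n)) * n_boot  # n_boot copies of every index
    rng.shuffle(pool)               # one global permutation
    for b in range(n_boot):
        yield pool[b * n:(b + 1) * n]


def bootstrap_interval(pairs, n_boot=2000, alpha=0.05, seed=0):
    """Percentile interval for the frequency ratio (assumed interval type)."""
    rng = random.Random(seed)
    stats = []
    for idx in balanced_bootstrap_indices(len(pairs), n_boot, rng):
        ratio = frequency_ratio([pairs[i] for i in idx])
        if ratio == ratio:  # drop NaN from resamples with no control events
            stats.append(ratio)
    if not stats:
        raise ValueError("no resample had a control event")
    stats.sort()
    lower = stats[int((alpha / 2) * (len(stats) - 1))]
    upper = stats[int((1 - alpha / 2) * (len(stats) - 1))]
    return lower, upper
```

This version holds all `n * n_boot` indices in memory, so it suits small illustrative inputs; a cohort-scale run would need a more memory-efficient scheme.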

Results
Between 1 January 1998 and 31 May 2016, 7,998,501 participants in CPRD were eligible for linkage. After excluding individuals aged <18 at study entry (n = 1,061,689), not registered with primary care during the study period (n = 407,430), with invalid entry and exit dates into the study (n = 927,579) and with a previous diagnosis of AF (n = 44,238) we obtained a study cohort of 5,557,405 individuals. Over a median of 10.3 years of follow-up (interquartile range: 4.8-15.0) there were 199,433 (3.6%) patients with a diagnosis of new-onset AF, who constituted our cases. These were matched to an equal number of age- and sex-matched non-AF controls. The mean age was 75.8 years (SD 12.7) in patients and 75.7 years (SD 12.7) in matched controls.
We observed a predominance of repeated hospitalisation or GP consultation owing to cardiac, cerebrovascular and peripheral vascular conditions prior to AF diagnosis (33 of the top 100 causes in primary care and 41 of the top 100 causes in secondary care) and also subsequently (47 of the top 100 causes in primary care and 48 of the top 100 causes in secondary care). A synthesis of our findings by disease groups is presented in Tables 1 and 2.
For frequent GP consultations, compared with matched controls, chronic obstructive pulmonary disease, primary pulmonary hypertension, and acute non-ST myocardial infarction were the leading conditions among AF patients prior to diagnosis (frequency ratios: 76.1, 35.1 and 28.9, respectively; Supplementary Table S4). Individuals with AF were also more likely to visit their GP frequently for conditions that seemed unclear at the time (uncertain diagnosis, frequency ratio: 48). After diagnosis, cardiomyopathy, conjunctivitis and multiple organ failure were the leading reasons for frequent GP consultations among AF patients vs. matched controls (Supplementary Table S5).
The four iris-plots in Figure 1 represent the top reasons for healthcare utilization (primary and secondary care) before and after new-onset AF, displayed as relative risk vs. controls (the length of each bar; based on Supplementary Tables S2-S5). The 14 diagnostic groups are colour-coded and arranged clockwise. The upper-left panel shows an over-representation of cardiac and cancer-related hospitalizations prior to new-onset AF, and the upper-right panel shows an over-representation of cardiac and cerebrovascular hospitalizations following new-onset AF. In primary care, we observed an over-representation of cardiac-related visits both before and after new-onset AF. In addition, the representation of consultations due to infectious disease diagnoses seems to increase following new-onset AF.

Discussion
We report, using a hypothesis-free approach, the most frequent reasons for repeated hospitalisations or primary care consultations by comparing patients with AF and their non-AF-matched peers. We found a predominance of repeated clinical visits due to cardiac, cerebrovascular and peripheral vascular conditions in AF patients vs. matched controls, suggesting the development of arrhythmia may contribute to the worsening of cardiovascular or cerebrovascular health. Our findings are in agreement with previous literature providing evidence on important AF complications and comorbidities, such as stroke (9), vascular dementia (10), valvular heart disease (11), myocardial infarction (12), hypertrophic cardiomyopathy (13), and heart failure (14), but also raise hypotheses towards some other unknown or less reported associations with other vascular disorders, such as peripheral artery disease and aortic aneurysms, which may require validation in a different cohort. Some of these associations, like peripheral artery disease (15), aortic aneurysms (16), and ventricular arrhythmias (17) have been suggested in the literature. Also, we have observed other less frequent causes that have been episodically reported (summarised in Supplementary Table S1).
Our data show excess repeated clinical visits affecting multiple organ systems, including for cancer (especially prior to an AF diagnosis) and frequent visits due to infection (mainly after new-onset AF). The association between AF and cancer had previously been reported (18, 19). Similarly, sepsis (20) and other forms of infection (21) have been reported as important complications associated with new-onset AF.
Our findings further reinforce the importance and high prevalence of comorbidities such as obesity (22) and associated conditions like obstructive sleep apnoea (23), and dysglycaemia in the AF population (24). Some of the reasons for repeated clinical visits occurring prior to an AF diagnosis are potentially reversible, with folic acid or vitamin D deficiency, hyponatraemia, hypomagnesaemia, hyperkalaemia, iron deficiency, impaired fasting glycaemia and glucose tolerance, being a few examples. Other reasons can have their onset delayed with proper intervention, such as type 2 diabetes mellitus (25) and hypertension. Our results suggest the timely detection and treatment or prevention of some of these conditions may potentially prevent or defer the development of AF. For example, the reduction of AF progression with weight loss has been previously demonstrated in the REVERSE-AF study (26). Some of the reasons for clinical visits we observed may be the consequence of AF-related interventions. Following the diagnosis of AF, we observed an increase in repeated clinical visits for bleeding events, potentially resulting from the increase in the uptake of anticoagulants. Similarly, we also observed visits due to thyrotoxicosis after the diagnosis of AF. The association of hyperthyroidism with AF is well known (27), and with the frequent use of amiodarone in our cohort (28), thyrotoxicosis could partly be due to its concomitant use.
To the best of our knowledge, some of the reasons for frequent clinical visits experienced by AF patients in this study (folic acid deficiency, pancytopenia, idiopathic thrombocytopenic purpura, seborrheic dermatitis, lymphoedema, angioedema, laryngopharyngeal reflux, rib fracture, haemorrhagic gastritis and inflammatory polyneuropathies) have not yet been reported in the literature, and further studies would be recommended.
We report for the first time the four iris-plots of healthcare utilization for AF patients. Together, these four iris-plots provide an "iris pattern" similar to the one observed in the human eye. The "iris pattern" is considered unique to each individual, is used as a biometric measure for authentication, and is known to be more accurate than a fingerprint (29).

Practice implications and future research
Current AF practice emphasises more comprehensive management, as highlighted in the ABC (Atrial fibrillation Better Care) pathway (30), which promotes a holistic or integrated care approach that has been associated with improved clinical outcomes (31, 32), leading to its recommendation in guidelines (33, 34). There is also a greater focus on comorbidities, as in the recent HEAD-2-TOES approach (35), which goes beyond the arrhythmia to associated cardiovascular and non-cardiovascular health conditions. We utilised nationwide data to provide knowledge and a better understanding of how AF patients frequently interact with the health system. Insights from healthcare contacts may facilitate improvements in preventive and treatment strategies for better management and prevention of subsequent outcomes. The shifting patterns of reasons for excess clinical visits in AF patients may contribute to the elucidation of the clinical heterogeneity of AF. Grouping the reasons AF patients seek medical attention by organ system or disease type, we propose the following AF clinical sub-phenotypes: (1) Vascular: associated with atherosclerotic, cerebrovascular and peripheral artery disease; (2) Myopathic: associated with heart failure, cardiomyopathies; (3) Valvular: associated with heart valve involvement; (4) Neoplastic: associated with cancer, potentially as part of inflammation or paraneoplastic syndrome; (5) Infectious: arising as a result of infection and in patients more prone to develop infection; (6) Endocrine or Metabolic: associated with obesity and impaired glucose tolerance or diabetes; (7) Senile: occurring as a result of aging and associated decline and comorbidities. Due to their nature and associated comorbidities, partial overlap is likely, and multiple clinical sub-phenotypes and treatment needs can coexist in the same individual.
As an example, one patient may have AF with both Vascular and Metabolic clinical sub-phenotypes, whilst a different patient can present with an overlap of Infectious and Senile AF clinical sub-phenotypes. Further studies are required for understanding the risk factors and prognosis of the proposed AF clinical phenotypes, and methods like cluster analyses or equivalent may be applicable. An in-depth understanding of these clinical phenotypes could not only improve the knowledge of the pathways involved and their interplay but also optimise AF prevention and treatment.
A better understanding of the conditions and causes for frequent clinical visits preceding the diagnosis of AF may help identify higher-risk subgroups of patients, such as patients with cancer, cardiovascular, cerebrovascular, peripheral vascular disease, respiratory or gastrointestinal disease, for whom a targeted screening strategy may derive more pronounced benefits, including savings in healthcare costs (36). This is important given that systematic screening of the general population or of certain population subgroups has yielded low or negative net benefits (37).

Research implications
An electronic health record (EHR)-wide association study has been previously conducted for better characterizing COVID-19 outcomes (4) but has not been utilized in the field of cardiovascular disease. In this first AF EHR-wide association study, we found that the excess clinical visits associated with AF differed between primary and secondary care and between the periods before and after the diagnosis date. Our experience indicates the importance of considering the type of healthcare and the timing of individuals' interactions with the care system when applying the EHR-wide association study method.
In this paper we present the "iris pattern" for AF as a disease interacting with primary and secondary care. Even though "iris patterns" have not yet been reported for other diseases, we can hypothesize that as risk factors and their combination differ across different diseases, different "iris patterns" of healthcare utilization, possibly unique to each disease entity, may also exist. This hypothesis requires further clarification, but if proven may help tackle disease-specific aspects at primary and secondary prevention level (i.e., prior and post disease onset).

Strengths and limitations
The strengths of our research include the power of electronic health records, which enables high-resolution analyses of detailed clinical conditions. Because our findings are based on population electronic health records from routine care, they are applicable to general clinical practice in comparable populations. The comprehensive scope of conditions that occur in the AF population and their non-AF peers, as documented in our analyses, raises the possibility of and need for a more individualized system that considers the needs of specific patients or groups of patients (AF clinical sub-phenotypes), some of whom represent clinically complex patients (38). There were also some limitations, including the absence of complete information on AF types (paroxysmal, persistent, permanent), although the clinical utility of such a temporal pattern/episode-based classification remains inconclusive (39). Additionally, even though CPRD patients are thought to be broadly representative of the UK general population in terms of age, sex and ethnicity (40), and a recent study suggests that CPRD populations may be representative of the general UK population in terms of socioeconomic status and rural-urban classification (41), we lack the evidence to say whether this population is truly representative of the total AF population in the UK. An ongoing study to develop an artificial intelligence model to identify AF patients in the UK may provide an answer to this important area of uncertainty (42). The study will compare two CPRD datasets: CPRD-Gold (which we have utilized to address our research protocol aiming at clarifying the natural history of AF in the UK) and CPRD-Aurum (established in 2017 and comprising 26.9 million additional individuals). Finally, our analyses did not utilize a competing risk framework and have not accounted for this potential source of bias. Mortality is known to be higher in AF patients when compared to peers (12). Therefore, it is expected that mortality will be a competing event in relation to the evaluated diagnoses in the post-AF comparisons performed in our analyses. The earlier mortality of AF patients may have precluded the occurrence of the events of interest being assessed, potentially leading to under-diagnosis.

Conclusion
In this EHR-wide study for AF, we found that cardiac, cerebrovascular and other vascular problems are still amongst the most frequent comorbidities in this population, but other disease groups, such as infection and cancer, were also frequent. The need for early detection of AF and management of comorbidities may inform targeted early diagnosis and optimal care strategies for AF.

FIGURE 1
Top 100 reasons for hospitalisation (secondary care) and GP consultations (primary care) in atrial fibrillation patients, compared to controls, within the 5 years before and the 5 years after new-onset AF diagnosis. The four iris-plots represent the top reasons (displayed as relative risk vs. controls) for healthcare utilization (primary and secondary care) before and after new-onset AF. The 14 diagnostic groups are colour-coded (legend) and arranged clockwise as follows, as per the upper-left iris-plot: bleeding/haemorrhagic (12 o'clock), cancer (1 to 2 o'clock), cardiac (3 to 4 o'clock), cerebrovascular (5 o'clock), "endocrine, nutritional or metabolic" and "frailty or multimorbidity" (6 o'clock), gastrointestinal (7 o'clock), haematological and infectious (8 o'clock), osteoarticular and other (9 o'clock), "peripheral other vascular" (10 o'clock), renal (11 o'clock), and respiratory (11 to 12 o'clock). Different scales were used in the four iris-plots to accommodate the different ranges of relative risks.

TABLE 1
Leading 100 frequent causes for hospitalisation or general practice consultation prior to the index date comparing AF and matched controls.

TABLE 2
Leading 100 frequent causes for hospitalisation or general practice consultation following the index date comparing AF and matched controls.