Comorbidity Trajectories Associated With Alzheimer’s Disease: A Matched Case-Control Study in a United States Claims Database

Background: Trajectories of comorbidities among individuals at risk of Alzheimer’s disease (AD) may differ from those aging without AD clinical syndrome. Therefore, characterizing the comorbidity burden and pattern associated with AD risk may facilitate earlier detection, enable timely intervention, and help slow the rate of cognitive and functional decline in AD. This case-control study was performed to compare the prevalence of comorbidities between AD cases and controls during the 5 years prior to diagnosis (or index date for controls), and to identify comorbidities with a differential time-dependent prevalence trajectory during the 5 years prior to AD diagnosis.
Methods: Incident AD cases and individually matched controls were identified in a United States claims database between January 1, 2000 and December 31, 2016. AD status and comorbidities were defined based on the presence of diagnosis codes in administrative claims records. Generalized estimating equations were used to assess evidence of changes over time and of differences between AD cases and controls. A principal component analysis and hierarchical clustering were performed to identify groups of AD-related comorbidities with respect to prevalence changes over time (or trajectory) and differences between AD cases and controls.
Results: Data from 186,064 individuals in the IBM MarketScan Commercial Claims and Medicare Supplementary databases were analyzed (93,032 AD cases and 93,032 non-AD controls). In total, there were 177 comorbidities with a ≥ 5% prevalence. Five main clusters of comorbidities were identified. Clusters differed between AD cases and controls in the overall magnitude of association with AD, in their diverging time trajectories, and in comorbidity prevalence. Three clusters contained comorbidities that notably increased in frequency over time in AD cases but not in controls during the 5-year period before AD diagnosis. Comorbidities in these clusters were related to the early signs and/or symptoms of AD, psychiatric and mood disorders, cerebrovascular disease, history of hazard and injuries, and metabolic, cardiovascular, and respiratory complaints.
Conclusion: We demonstrated a greater comorbidity burden among those who later developed AD vs. controls, and identified comorbidity clusters that could distinguish these two groups. Further investigation of comorbidity burden is warranted to facilitate early detection of individuals at risk of developing AD.


INTRODUCTION
Alzheimer's disease (AD) is the most common cause of dementia, accounting for 60-70% of cases. The prevalence of AD increases with age, with a global prevalence of 5-8% in people 60 years and older [World Health Organization [WHO], 2021]. While AD has previously been considered to have discrete and clearly defined clinical stages, it is now more usually considered to be a seamless continuum from an asymptomatic phase through a long preclinical period, to a symptomatic phase in which cognitive and then functional impairment become increasingly evident (Dubois et al., 2016;Aisen et al., 2017;Jack et al., 2018). Furthermore, while the terms "mild cognitive impairment (MCI)" or "prodromal AD (pAD)" and "mild AD" have traditionally been used in clinical trials to describe the early stages of AD, these are often studied together and referred to as "early AD" patients (Siemers, 2021).
Evidence suggests that treatment earlier in the disease continuum is likely to achieve greater disease modification and slow the rate of cognitive and functional decline (Dubois et al., 2016;Aisen et al., 2017;Jack et al., 2018). However, AD is only usually diagnosed once clinical symptoms become apparent, which may be as long as 15 years after the first pathological changes occur, leading to delays in treatment and potentially lost clinical benefit (Dubois et al., 2016;Aisen et al., 2017). Even after symptoms of AD become clinically evident, there exists a large population living with dementia who remain undiagnosed (Lang et al., 2017;Amjad et al., 2018;Genovese et al., 2018;Grandal Leiros et al., 2018). It is thought that among older adults with probable dementia (including AD), most (58.7%) were either undiagnosed (39.5%) or unaware of the diagnosis (19.2%) (Amjad et al., 2018). A meta-analysis of 23 studies conducted between 1988 and 2015 in community and residential settings reported a 61.7% pooled rate of undetected dementia (Lang et al., 2017).
This underdiagnosis may be due in part to a low dementia diagnosis rate in primary care (Boise et al., 1999;Geldmacher and Kerwin, 2013;Jørgensen et al., 2015;Lang et al., 2017;Small, 2017). Currently in the United States (US), a diagnosis of dementia in primary care is largely reliant on the self-presentation of a patient on the basis of symptoms or caregiver concerns (Iliffe et al., 1991;McCormick et al., 1994;Bradford et al., 2009), such that many cases go undiagnosed until late in the disease (McCormick et al., 1994). For those patients who do present, referral to a specialist then requires the primary care physician to act on a clinical suspicion (Brayne et al., 2007), which is itself prone to being missed or delayed (Iliffe et al., 1991;Callahan et al., 1995;Bradford et al., 2009). The availability of specialist tools to help evaluate whether a patient needs to be referred for specialist care may save time and expedite decision-making, potentially increasing the rate of diagnosis of AD.
Certain chronic medical conditions, including type 2 diabetes (T2DM), hypertension, coronary artery disease, and depression, are established risk factors for cognitive decline (Artero et al., 2008;Vicini Chilovi et al., 2009;Li et al., 2012;Roberts and Knopman, 2013;Imtiaz et al., 2014;Johnson et al., 2015;Vassilaki et al., 2015;Fan et al., 2017). These conditions are also common in multimorbidity (defined as at least two comorbid conditions) in older adults, which may also be associated with biomarkers of the preclinical AD stages (Sperling et al., 2011;Jack et al., 2014;Sperling et al., 2014) and suspected non-amyloid pathophysiology (Jack et al., 2016;Vassilaki et al., 2019), even before clinically detectable cognitive decline becomes apparent. Not only is there an increase in the prevalence of comorbidities among patients at risk of AD, but multimorbidity, a distinctive hallmark of aging and potentially a clinical marker of accelerated aging (Fabbri et al., 2015), is also associated with increased risk of cognitive impairment (Palmer et al., 2007;Vassilaki et al., 2015;Santiago and Potashkin, 2021). There is also evidence suggesting that the trajectories of comorbidities among individuals at risk of AD differ from those who are simply undergoing the normal process of aging (Oveisgharan and Hachinski, 2010;Velayudhan et al., 2010;Xu et al., 2010). Therefore, an evaluation of comorbidities and their trajectories during the early stage of disease is highly relevant in characterizing the natural history of AD dementia. In this way, identifying distinctive patterns of comorbidities, including signs and symptoms of early AD, may enable more timely cognitive assessment and specialist referral for an evaluation of AD diagnosis.
A data-driven approach was used in this analysis to identify comorbidities that occur before AD diagnosis that are associated with the development of AD. Incident AD cases and matched non-AD controls from the general population were identified in a US claims database and used to investigate comorbid diagnoses that occurred during the 5 years prior to a first diagnosis of AD. A window of 5 years to capture patients with early AD was set on the basis that the median duration between the onset of dementia-related symptoms and assessment or diagnosis is typically up to 3 years, according to literature reports (Boise et al., 1999;Knopman et al., 2000;Wackerbarth and Johnson, 2002;Wilkinson et al., 2004;Fiske et al., 2005;Speechly et al., 2008;Carpentier et al., 2010;van Vliet et al., 2013;Zhao et al., 2016). The methodology used in this analysis is a new application of a standard method used to identify patterns inherent in data.
This analysis has two primary objectives: (1) to compare the prevalence of comorbidities between AD cases and non-AD controls during the 5 years prior to diagnosis; and (2) to identify comorbidities with a time-dependent prevalence trajectory during the 5 years prior to AD diagnosis that is differential among cases, compared with controls.
FIGURE 1 | Flowchart of case inclusion process. *Cases were required to have at least two claims for AD on separate days at age ≥ 50 years (ICD-9-CM: 331.0 or ICD-10-CM: G30.x) and a 5-year period of continuous enrollment prior to first AD diagnosis, while eligible controls for each case were required to have no claims for AD (ICD-9-CM: 331.0 or ICD-10-CM: G30.x) during the 5-year window prior to the respective case's index date. Matching was based on sex; year of birth; insurance plan type at index; relationship to insurance plan holder; (previous) employment industry; and US region. AD, Alzheimer's disease; ICD, International Classification of Diseases.

MATERIALS AND METHODS
Study Design and Setting
This was a retrospective, observational, case-control study conducted in the US using data from the IBM MarketScan® Commercial Claims and Medicare Supplementary databases.

Study Population
The study population consisted of individuals with AD ("cases") and a matched group of individuals without AD ("controls") (Figure 1). Cases were required to have at least two claims for AD on separate days at age ≥ 50 years [International Classification of Diseases (ICD)-9-CM: 331.0 or ICD-10-CM: G30.x] and a 5-year period of continuous enrollment prior to first AD diagnosis, while eligible controls for each case were required to have no claims for AD (ICD-9-CM: 331.0 or ICD-10-CM: G30.x) during the 5-year window prior to the respective case's index date. All eligible cases from the database were included in the analysis. The index date for cases was the first AD diagnosis date. For controls, the index date was set to the same date as the individually matched case.

Matching
For each AD case, a control (1:1) was selected randomly and without replacement from the pool of eligible controls, as defined above. Matching was based on sex; year of birth (hence also age, given the same index date); insurance plan type at index [e.g., Health Maintenance Organization (HMO); Preferred Provider Organization (PPO); Point of Service (POS); comprehensive]; relationship to insurance plan holder (employee or spouse/other); employment industry; and US region (West, Northeast, Midwest, or South).
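The matching step described above can be sketched as exact matching on a tuple of keys, followed by a random draw without replacement. This is a minimal illustration, not the authors' code: the field names in `MATCH_KEYS` and the record layout are hypothetical, and the sketch assumes the control pool has already been restricted to individuals eligible for the given case (no AD claims and continuous enrollment in the relevant 5-year window).

```python
import random
from collections import defaultdict

# Hypothetical field names for the six matching factors (not MarketScan column names).
MATCH_KEYS = ("sex", "birth_year", "plan_type", "relationship", "industry", "region")


def match_controls(cases, controls, seed=0):
    """Pair each case with one random, not-yet-used control that shares all keys."""
    rng = random.Random(seed)
    pool = defaultdict(list)
    for person in controls:
        pool[tuple(person[k] for k in MATCH_KEYS)].append(person)
    for candidates in pool.values():
        rng.shuffle(candidates)  # shuffled once, so pop() below is a random draw

    pairs = []
    for case in cases:
        candidates = pool.get(tuple(case[k] for k in MATCH_KEYS))
        if candidates:
            # pop() removes the control from the pool: sampling without replacement.
            pairs.append((case["id"], candidates.pop()["id"]))
    return pairs
```

Cases with no remaining eligible control are simply left unmatched in this sketch; how such cases were handled in the study is not described in the text.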

Data Source
Data for this analysis were extracted from the IBM MarketScan® Commercial Claims ("Commercial") database and the Medicare Supplementary ("Medicare") database. The Commercial database contains active employees, early retirees, and dependents insured by employer-sponsored plans, while the Medicare database covers Medicare-eligible retirees (≥ 65 years) with employer-sponsored Medicare Supplementary plans. Both data sets were analyzed together in order to allow patients to be tracked from employment through into retirement.
Because the database is based on insurance claims, individuals are able to drop in and out of enrollment in the database. The "continuous enrollment period" was therefore defined on an individual-person level, based upon medical insurance coverage. Gaps in enrollment of up to 62 days (2 months) were allowed so long as this gap was contained by periods of documented enrollment before and afterward. Continuous enrollment periods had maximum boundaries of the study period (January 1, 2000 to December 31, 2016).
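As an illustration of the continuous enrollment rule, the sketch below merges enrollment spans separated by gaps of at most 62 days and then checks whether a single merged period covers the 5 years before an index date. It is a simplified reading of the definition above: the function names are hypothetical, span end dates are assumed to be inclusive, and the study-period boundaries are not applied.

```python
from datetime import date

MAX_GAP_DAYS = 62  # allowed gap between spans, per the definition above


def continuous_periods(spans, max_gap_days=MAX_GAP_DAYS):
    """Merge (start, end) spans whose uncovered gap is at most max_gap_days."""
    merged = []
    for start, end in sorted(spans):
        # Uncovered days between the previous end (inclusive) and this start.
        if merged and (start - merged[-1][1]).days - 1 <= max_gap_days:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def has_lookback(periods, index_date, years=5):
    """True if one merged period covers the `years` before index_date."""
    try:
        lookback_start = index_date.replace(year=index_date.year - years)
    except ValueError:  # February 29 mapped onto a non-leap year
        lookback_start = index_date.replace(year=index_date.year - years, day=28)
    return any(s <= lookback_start and e >= index_date for s, e in periods)
```

Because a gap is only bridged when enrollment spans exist on both sides of it, the requirement that gaps be contained by documented enrollment before and afterward is met by construction.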

Comorbidity Definitions
The presence of individual comorbidities was evaluated in each of the five yearly intervals prior to AD diagnosis, based upon the occurrence of at least one diagnosis claim (code) in the relevant time period. Diagnoses recorded in the database were based upon the International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) up until September 30, 2015, and thereafter were based on ICD-10-CM criteria. For the purposes of this study, all ICD-10-CM codes were converted to ICD-9-CM prior to further grouping.
Comorbidities were grouped into approximately 1,200 categories, based upon the first three digits of each ICD-9-CM code. The descriptions for these three-digit sub-chapters can be found in the "icd" R package (Wasey et al., 2021). Comorbidities with a total 5-year occurrence of ≥5% (across both AD cases and controls, combined) were kept.
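The grouping and filtering rule can be sketched as follows. This is an illustrative reading of the text rather than the authors' pipeline: the function names and the `(person_id, code)` claim layout are hypothetical, and the sketch takes "first three digits" literally as the first three characters of the code.

```python
from collections import defaultdict


def three_digit_group(icd9_code):
    """Return the first three characters of an ICD-9-CM code, e.g. '331.0' -> '331'."""
    # Literal reading of the rule above; special handling of E-codes is not attempted.
    return icd9_code.strip().upper().replace(".", "")[:3]


def prevalent_groups(claims, n_people, threshold=0.05):
    """Keep groups claimed by at least `threshold` of all people (cases and controls combined).

    claims: iterable of (person_id, icd9_code) pairs within the 5-year window.
    """
    people_by_group = defaultdict(set)
    for person_id, code in claims:
        # A set counts each person once, however many claims they have.
        people_by_group[three_digit_group(code)].add(person_id)
    prevalence = {g: len(ids) / n_people for g, ids in people_by_group.items()}
    return {g: p for g, p in prevalence.items() if p >= threshold}
```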

Ethical Considerations
This was a retrospective, observational study of secondary use data. All personal information used was de-identified (with no possibility of linkage back to individual identified patients) and is compliant with the US Health Insurance Portability and Accountability Act (HIPAA).

Descriptive Analyses
Demographic/personal characteristics of AD cases and non-AD controls from both cohorts were summarized using means and standard deviations (SDs) for continuous variables and frequencies and percentages for categorical variables. For both AD cases and non-AD controls, the proportion of patients with each comorbidity with ≥ 5% prevalence over the 5 years prior to index date was reported.

Associations Between Alzheimer's Disease and Comorbidities
Generalized estimating equations (GEE) were used to estimate the odds of each comorbidity as a function of time (yearly intervals prior to index), AD diagnosis status, and their interaction. Odds ratios (ORs) were estimated, and hypothesis tests were conducted on the following model:
$$\operatorname{logit}(p_{ijk}) = \log\left(\frac{p_{ijk}}{1 - p_{ijk}}\right) = \eta + \alpha x_j + \beta t_k + \gamma x_j t_k$$
where $p_{ijk}$ = proportion of subjects who reported a claim for comorbidity $i$ at year $k$ prior to index date ($k = 1, \ldots, 5$), having AD diagnosis status $j$ ($j = 1, 2$); $x_j$ is an indicator of AD status (1 for AD cases, 0 for controls) and $t_k$ is time in years. $\eta$ is the logit's general mean, $\alpha$ is the log odds of AD vs. control, $\beta$ is the change in the log odds by change in 1 year, and $\gamma$ is the difference of log odds changes per year between AD cases and controls. Standard errors were computed using the sandwich robust variance estimator (Liang and Zeger, 1986), assuming an unstructured within-subject covariance matrix. A False Discovery Rate correction was applied to account for multiple testing across comorbidities.
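To make the interpretation of the coefficients concrete, the sketch below assumes the model has the usual logit-linear form with a group term, a time term, and their interaction, and computes the prevalence and the AD vs. control odds ratio it implies. The coding of the time covariate `t` and of the group indicator `ad`, as well as any coefficient values passed in, are assumptions for illustration; the functions do not fit a GEE and do not reproduce any estimate from the study.

```python
import math


def prevalence(eta, alpha, beta, gamma, ad, t):
    """Model-implied prevalence for group `ad` (1 = AD case, 0 = control) at time t."""
    logit = eta + alpha * ad + beta * t + gamma * ad * t
    return 1.0 / (1.0 + math.exp(-logit))


def odds_ratio_ad_vs_control(alpha, gamma, t):
    """Odds ratio of AD vs. control at time t under the assumed model: exp(alpha + gamma * t)."""
    return math.exp(alpha + gamma * t)
```

Under this form, a non-zero interaction coefficient means the odds ratio between cases and controls changes with time, which is the sense in which a comorbidity has a differential trajectory; centering `t` at the mean follow-up time makes `alpha` the log odds ratio at the midpoint of the window.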

Multivariate Analysis: Principal Component Analysis and Hierarchical Cluster Analysis
In order to identify groups of comorbidities associated with AD that varied in terms of frequency changes over time and between AD and non-AD control groups, four metrics from the GEE models were considered: two denoting the magnitude of the differences between AD vs. controls (#1 and #3) and two related to the level of evidence of such differences (#2 and #4): (1) the difference in log odds of AD vs. non-AD controls (i.e., coefficient α centered at the mean follow-up time prior to AD diagnosis); (2) the log10-scaled p-value associated with the hypothesis test of the centered α; (3) the interaction term, γ, which denotes the difference of slopes between changes over time of AD patients vs. controls; (4) the log10-scaled p-value associated with the hypothesis test of γ. Metrics 1 and 2 assessed the overall difference in AD vs. controls comorbidities over the period prior to index date, whereas metrics 3 and 4 assessed the difference in the comorbidity trajectory over time between AD and controls, during the 5-year period prior to index date.
A data matrix of dimensions n (number of comorbidities) and p = 4 (the four metrics selected) was created and submitted to principal component analysis (PCA), from which a new set of orthogonal variables was obtained. The new data matrix was analyzed by hierarchical clustering with Ward's grouping algorithm along with Euclidean distances in order to identify AD comorbidities by their changes over time and between groups.
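The clustering step can be illustrated with a naive, pure-Python version of agglomerative clustering under Ward's criterion. Several assumptions are made here and are not taken from the source: the four metrics are standardized before clustering, the PCA rotation is skipped (when all components are retained and the scores are not rescaled, an orthogonal rotation leaves Euclidean distances unchanged), and the tree is cut at a fixed number of clusters `k`. Function names are hypothetical.

```python
import math


def standardize(rows):
    """Center each column and scale it to unit (population) standard deviation."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
           for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]


def ward_clusters(points, k):
    """Naive O(n^3) agglomerative clustering; returns k lists of row indices."""
    clusters = [[i] for i in range(len(points))]

    def centroid(members):
        dims = range(len(points[0]))
        return [sum(points[i][d] for i in members) / len(members) for d in dims]

    def merge_cost(a, b):
        # Ward's criterion: increase in within-cluster sum of squares after merging.
        dist2 = sum((x - y) ** 2 for x, y in zip(centroid(a), centroid(b)))
        return len(a) * len(b) / (len(a) + len(b)) * dist2

    while len(clusters) > k:
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda ij: merge_cost(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]  # j > i, so deleting j keeps i valid
        del clusters[j]
    return clusters
```

Each row of the input would hold the four metrics for one comorbidity; a production analysis would more likely rely on an established statistics library than on this quadratic pair search.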

RESULTS
Demographic Characteristics
Data from 186,064 individuals in the IBM MarketScan® Commercial Claims and Medicare Supplementary databases were analyzed (93,032 AD cases and 93,032 non-AD controls) (Table 1). Overall, 59% of the population was female. The study population was predominantly older adults, with an average age of 82 years. Seventeen percent of participants were aged 90 years or older. The majority of the included population (57%) had comprehensive insurance, followed by PPO (32%). Most participants resided in the North-Central US region (45%), followed by the Southern states (29%).

Prevalence and Association of Comorbidities, Signs, and Symptoms During 5 Years Prior to Alzheimer's Disease Diagnosis
In total, 177 comorbidities were identified with a prevalence of ≥5% (Supplementary Table 1). Of these, the individual comorbidities [ICD-9 code; prevalence (%)] with the highest 5-year pooled prevalence prior to index date across AD cases and controls were: essential hypertension (401; 74.5%), general symptoms (780; 66.9%), symptoms involving the respiratory system (786; 65.6%), disorders of lipoid metabolism (272; 54.8%), and other and unspecified disorders of joint (719; 53.3%) (Table 2). However, the comorbidities with the highest ORs in AD cases compared with controls in the period prior to index date were: persistent mental disorders due to conditions classified elsewhere (294), other non-organic psychoses (298), other cerebral degenerations (331), transient mental disorders due to conditions classified elsewhere (293), general symptoms (780), other conditions of the brain (348), episodic mood disorders (296), and depressive disorders not elsewhere classified (311), among others (Supplementary Figure 1).

Multivariate Analysis
Four principal components (PCs) were obtained from the PCA conducted on the four metrics used to differentiate comorbidities. The first, second, third, and fourth PCs explained 70.3, 15.4, 11.0, and 3.3% of variance in the data, respectively. Supplementary Figures 2A,B display the distribution of the five clusters of comorbidities in biplots for the first and second PCs and for the first and third PCs, respectively. Five main clusters of comorbidities were found from the hierarchical cluster analysis conducted on the four PCs (Figure 2 and Supplementary Figure 3). Clusters 1, 3, and 5 consisted of comorbidities with higher ORs and smaller p-values for the comparison of AD vs. controls mid-term prior to index date (Figures 2C,D). Although clusters 1, 2, and 3 contained comorbidities with higher ORs for the differential time trajectories between AD vs. controls, cluster 1 stood out as the collection of comorbidities with the largest differential time-dependent trajectories among AD vs. controls and smaller p-values for the interaction terms (Figures 2A,B). [...] Figure 4.
Cluster 1 contained 18 comorbidities with a large difference in trajectories, increasing rapidly in the AD group prior to diagnosis, but not in the controls (Supplementary Figure 4A). Comorbidities were quite prevalent overall (generally >15% in AD and non-AD combined groups). This cluster included terms such as persistent mental disorder (294), non-organic [...] Table 1).
Cluster 2 contained 19 comorbidities with marked differences in time trajectories between AD vs. non-AD (Supplementary Figure 4B), lower OR for the overall AD vs. non-AD comparison in the period prior to AD diagnosis (Figure 3A), and with lower overall prevalence (generally <15% in AD and non-AD combined, Figure 3B). This cluster included diseases of the kidneys, such as chronic (585), acute (584), and other (593) kidney disease; and of the lungs, such as emphysema (492), pneumonia (486), and chronic bronchitis (491). Other comorbidities not related to lung or kidney were also included, such as bacterial infections (041), septicemia (038), vertebral fractures (805), and diseases of white blood cells (288) (Supplementary Figure 4B and Supplementary Table 1).
Cluster 3 contained 22 comorbidities, also with large trajectory differences between AD cases and controls (Supplementary Figure 4C). The difference from cluster 1 was that the statistical significance of the trajectory differences was not as strong and that comorbidities in this group had lower prevalence (Figure 3B). Comorbidities included psychiatric comorbidities such as depression (311) and anxiety (300) but also included terms related to care and rehabilitation procedures (V57), personal history of hazard to health (V15), nutrition, metabolism and development (783) [...] Table 1).
Cluster 4 was the largest group with 65 comorbidities. This cluster included comorbidities with the most similar trajectories over time (Figure 3A and Supplementary Figure 4D), and a more similar prevalence between cases and controls. In addition, some comorbidities became less prevalent in both groups over time (e.g., benign neoplasm of skin; 216).
Cluster 5 comorbidities had both trajectory differences and overall AD vs. non-AD prevalence differences that were more subtle than shown for comorbidities in cluster 3 (Figure 3A and Supplementary Figure 4E), despite the relatively high prevalence of some symptoms such as symptoms of the digestive system (787), osteoarthritis (715), disorders of lipoid metabolism (272), and cataract (366). This cluster was also large (51 comorbidities), and contained a more heterogeneous group of comorbidities, both in terms of prevalence and the affected system and/or organ (Figure 3B and Supplementary Table 1).

DISCUSSION
This project used a novel application of longitudinal data modeling and multivariate analysis to identify clusters of comorbidities that occur in patients' records prior to diagnosis with AD. We identified clusters of AD comorbidities that may diverge in terms of frequency changes over time and between AD and non-AD control groups.
FIGURE 3 | (A) Odds ratios for the overall AD cases vs. non-AD controls comparison of comorbidities and for the time trajectories of comorbidities during the 5-year period before AD diagnosis. The colored circled numbers refer to the cluster that the comorbidity belongs to (i.e., red-colored 1 is cluster 1, etc.). (B) Distribution of the average comorbidity prevalence during the 5-year period before AD diagnosis by cluster. AD, Alzheimer's disease; Ctr, Control.
Cluster 1 offered the largest overall differences between AD cases and controls prior to diagnosis and, moreover, differential trajectories over time in AD cases compared with controls. The comorbidities in cluster 1 were fairly prevalent overall (>15% in AD and non-AD combined). Some comorbidities could be related to the early signs/symptoms of AD, such as persistent mental disorders and other cerebral degenerations (including MCI). However, cluster 1 also contained, with similar prevalence and trajectory differences prior to AD diagnosis, serious comorbidities of other organ classes such as the respiratory and cardiovascular systems. These comorbidity classes have previously been shown to be associated with progression (Jutkowitz et al., 2017a,b;Koskas et al., 2017) and risk of AD and dementia (Bauer et al., 2014;Ruthirakuhan et al., 2019).
Similarly, other comorbidities in clusters 2, 3, and 5 showed differences between cases and controls, even if not directly related to AD. This suggests that the comorbidity burden starts years prior to AD diagnosis, during the early AD or MCI phase of progression or even earlier. For example, not only were relatively less frequent comorbidities of the kidneys and lungs (among others) more prevalent prior to AD diagnosis (cluster 2), but so were depression, anxiety, and comorbidities related to accidents or injuries (for example, history of personal hazards, open head wounds, and contusion to various body parts; cluster 3). These comorbidities have been associated with lower health-related quality of life in patients with AD (Barbe et al., 2018), and have been reported as risk factors for cognitive impairment (Krell-Roesch et al., 2021). Falls are considered a marker of cognitive impairment, and an increased risk of falls has been reported among adults within the early, preclinical stage of AD (Stark et al., 2013). Together with our findings, these reports reflect the need for additional measures to ensure patient safety around the home and in care facilities.
A potential limitation of this study is that comorbidities may be recorded more frequently in some patients' records simply because of more encounters with the healthcare system during the work-up of an AD diagnosis. However, cluster 4 was the largest group (65 comorbidities) and contained comorbidities without large differences between cases and controls. The small imbalances that do remain in cluster 4 are much lower in magnitude than the differences between cases and controls in the other clusters, thus providing evidence of a true elevated comorbidity burden in early AD, above any systematic differences due to reporting bias.
There is a real need to better characterize patients either at risk of developing AD or with AD early in the course of their disease, to allow early intervention that could slow the rate of cognitive and functional decline (Dubois et al., 2016;Aisen et al., 2017;Jack et al., 2018). However, AD still tends to be diagnosed at a relatively advanced stage, meaning that the opportunity for early intervention is lost (Bature et al., 2018;Barnes et al., 2020).
AD usually has a slow progression (No authors listed, 2020), and although what defines AD as a unique neurodegenerative disease (among other conditions that could lead to dementia) are the β-amyloid plaques and the neurofibrillary tau deposits (Jack et al., 2018), there is still a lot of work to be done to delineate the causal pathways of AD pathogenesis. Although not all current findings are amenable to an easy interpretation, novel research (Wang et al., 2021) suggests that multiple pathological pathways could be involved in AD pathogenesis, such as unresolved neuroinflammation, abnormal glucose metabolism, vascular alterations, and mitochondrial dysfunction; these pathological processes are present in many of the comorbidities (e.g., vascular conditions, diabetes, infections) identified in the present study.
Considerable research has been devoted over recent years to the use of biomarkers to diagnose AD early, and this approach has shown promising results (Frisoni et al., 2017;Blennow and Zetterberg, 2018;Giorgio et al., 2020). However, as most cases of AD are diagnosed in primary care, it is important to have a simple and convenient tool that is readily available to GPs to identify patients who may be at risk of progressing to AD dementia (Iliffe et al., 1991;McCormick et al., 1994;Bradford et al., 2009). It may be possible in a primary care setting to flag a patient's chart if a pattern of comorbidities is observed within a short period of time. This would then prompt healthcare professionals to inquire about memory concerns and possibly refer the patient to a specialist for cognitive testing and/or any imaging or fluid biomarkers available, including, but not limited to, magnetic resonance imaging (MRI), positron emission tomography (PET), and/or cerebrospinal fluid (CSF) Aβ and tau tests (Leocadi et al., 2020;Leuzy et al., 2021). The results of the current study add to and expand previous work toward the development of such a tool.
The study findings need to be viewed in light of the following limitations. There was an opportunity for misclassification of AD status among cases in this study because status was defined only by the presence of a diagnosis code for AD and not any biomarker or pathology data. Thus, the AD cases in this study are likely to have "Alzheimer's clinical syndrome," or what has been previously referred to as "clinically probable AD" (Jack et al., 2018). Another potential reason for misclassification of AD status is that the study cohort is skewed toward older age, with 17% aged 90 years or older. It is possible that the older AD cases actually have other forms of dementia and/or neurodegenerative disorders with similar clinical presentation to AD and that are more common in older adults, such as limbic-predominant age-related TDP-43 encephalopathy (Nelson et al., 2019). Other forms of neurodegenerative disorders may be associated with a unique set of comorbidities, confounding interpretations. The statistical analysis also has a number of limitations, including the hierarchical clustering used to define groups. Although we used robust distance metrics on orthogonal data coordinates, alternative grouping algorithms might have achieved different groupings. In addition, comorbidities were grouped at the 3-digit ICD level as a short-hand for the more detailed patient histories collected. Although further refinement to lower-level codes may have yielded insights on more specific comorbidities, the scope of this study using an exploratory statistical approach was best suited for analyses with the higher-level groupings. Another potential limitation is that clusters were defined based on metrics related to overall differences in prevalence and trajectories; however, our model assumed linear trajectories over time, and p-values depend on the prevalence of comorbidities, not only on the effect size. Finally, all comorbidities with prevalence < 5% across cases and controls were excluded, and no terms that captured associations other than linear shapes (or differences) were included based on the four metrics described in this study.
Strengths of the study include the large study size and follow-up, taken from objectively and systematically collected data sources. Cases and controls were matched based on a number of factors including sex; year of birth; insurance plan type at index date; relationship to insurance plan holder; (previous) employment industry; and US region.

CONCLUSION
Although we demonstrated a greater comorbidity burden among those who later developed AD (vs. those who did not), it cannot be ruled out that the observed relationship between comorbidity burden and AD was due in part to residual confounding by underlying factors and/or mechanisms related to aging, given that multimorbidity is associated with accelerated aging (Fabbri et al., 2015). We also identified clusters of comorbidities that could distinguish AD cases and non-cases. Further investigation of comorbidity clusters is warranted to facilitate early detection of individuals at risk of developing AD.

DATA AVAILABILITY STATEMENT
The data analyzed in this study is subject to the following licenses/restrictions: Commercial claims databases are available with subscription payment. Requests to access these datasets should be directed to RH, richard.houghton@roche.com.

AUTHOR CONTRIBUTIONS
LB, RH, and GD-P developed the study concept, protocol, and statistical analysis plan, and wrote the manuscript. LB, RH, AA, GD-P, and MV assisted in the interpretation of the findings and provided critical revision to the manuscript. GD-P and AA performed the statistical analyses. All authors read and approved the final manuscript.

FUNDING
This study received funding from F. Hoffmann-La Roche AG. Editorial assistance for the manuscript was funded by F. Hoffmann-La Roche AG. The funder was not involved in the study design, collection, analysis, interpretation of data, the writing of this article or the decision to submit it for publication.