Genetic Overlap Between Alzheimer’s Disease and Depression Mapped Onto the Brain

Background: Alzheimer’s disease (AD) and depression are debilitating brain disorders that are often comorbid. Shared brain mechanisms have been implicated, yet findings are inconsistent, reflecting the complexity of the underlying pathophysiology. As both disorders are (partly) heritable, characterising their genetic overlap may provide aetiological clues. While previous studies have indicated negligible genetic correlations, this study aims to expose the genetic overlap that may remain hidden due to mixed directions of effects. Methods: We applied Gaussian mixture modelling, through MiXeR, and conjunctional false discovery rate (cFDR) analysis, through pleioFDR, to genome-wide association study (GWAS) summary statistics of AD (n = 79,145) and depression (n = 450,619). The effects of identified overlapping loci on AD and depression were tested in 403,029 participants of the UK Biobank (UKB) (mean age 57.21, 52.0% female), and mapped onto brain morphology in 30,699 individuals with brain MRI data. Results: MiXeR estimated 98 causal genetic variants overlapping between the 2 disorders, of which a fraction of 0.44 had concordant directions of effects. Through pleioFDR, we identified a SNP in the TMEM106B gene, which was significantly associated with AD (B = −0.002, p = 9.1 × 10−4) and depression (B = 0.007, p = 3.2 × 10−9) in the UKB. This SNP was also associated with the volume of several regions of the corpus callosum (B > 0.024, p < 8.6 × 10−4), third ventricle volume (B = −0.025, p = 5.0 × 10−6), and inferior temporal gyrus surface area (B = 0.017, p = 5.3 × 10−4). Discussion: Our results indicate there is substantial genetic overlap, with mixed directions of effects, between AD and depression. These findings illustrate the value of biostatistical tools that capture such overlap, providing insight into the genetic architectures of these disorders.


INTRODUCTION
Alzheimer's disease (AD) is a highly disabling neurodegenerative disease characterised by memory loss and a gradual cognitive, functional, and behavioural decline (Reitz and Mayeux, 2014). Its prevalence increases rapidly with age, affecting 13% of the population at age 80, and 37% of the population at age 90 (Von Strauss et al., 1999). Individuals with AD often have comorbid major depressive disorder (MDD), present in 22-59% of cases (Zubenko et al., 2003;Starkstein et al., 2005), while MDD has an estimated lifetime prevalence of 11-15% in the general population (Bromet et al., 2011). MDD is a heterogeneous disorder; in addition to the core symptoms of low mood, anhedonia, and loss of energy, it comprises behavioural, physiological, and psychological signs and symptoms that include changes in appetite, sleeping, and psychomotor patterns, fatigue, lack of concentration, feelings of worthlessness or guilt, and suicidal ideation (American Psychiatric Association, 2013).
It has long been debated whether a history of depressive symptoms is a risk factor for later development of AD, or rather an early prodromal manifestation of AD (Devanand et al., 1996;Chen et al., 1999). While bidirectional effects between the two disorders are likely, there is more evidence that midlife onset depressive symptoms and/or MDD are a risk factor for AD than vice versa (Jorm, 2001;Wilson et al., 2002;Green et al., 2003;Ownby et al., 2006;Barnes et al., 2012;Gracia-García et al., 2015). Furthermore, AD patients with depressive symptoms show accelerated cognitive decline and neurodegeneration, with significantly more plaques and tangles in the hippocampus than non-depressed individuals with AD (Rapp et al., 2006), while AD symptom count (Verkaik et al., 2007) or tau pathology (Kramberger et al., 2012) does not appear to contribute to the incidence or severity of depressive disorders.
Neuroimaging studies have provided scattered evidence that AD and depressive disorders share neurobiological pathways. Early stage AD is associated with atrophy of the hippocampus, parahippocampal regions (Jack et al., 1999), and temporoparietal cortex (Acharya et al., 2019), with atrophy becoming generalised in later stages of the disease, including cortical thinning in primary motor and sensory regions (Fox et al., 2001;Sabuncu et al., 2011). Similarly, MDD and recurrent major depression (MD) are related to smaller volumes of the hippocampus (Sheline et al., 1999;Bremner et al., 2000;Mervaala et al., 2000), amygdala, and parahippocampal areas (Andreescu et al., 2008), as well as lower cortical thickness in the medial orbitofrontal cortex, fusiform gyrus, insula, rostral and caudal anterior cingulate cortex, posterior cingulate cortex, and temporal lobe (Schmaal et al., 2020), with many of these changes correlating positively with the duration of the disease (Andreescu et al., 2008). In AD patients with comorbid symptoms of depression, MRI studies have specifically shown thinner cortex in temporal and parietal areas when compared to non-depressed AD patients (Lebedeva et al., 2014). Conjunction analysis on the brain morphological changes that overlap between AD and late-life onset depression has shown that, in addition to the previously mentioned structures, both conditions are associated with hippocampal atrophy (Boccia et al., 2015). Yet, the risk of developing AD in MDD does not seem to be mediated by hippocampal or amygdala volumes (Geerlings et al., 2008).
Both AD and depressive disorders are heritable, with twin studies indicating 37% broad heritability for MDD (Sullivan et al., 2000) and 74% for AD (Gatz et al., 1997). Molecular genetics studies show that both disorders have complex genetic architectures. AD has recently been characterised as oligogenic, with estimates indicating the involvement of relatively few genetic variants, in addition to the well-known, strong APOE-e4 risk variant (Holland et al., 2020;Zhang et al., 2020). MDD on the other hand has been estimated to be the most polygenic of all major brain disorders (Holland et al., 2020), involving many genetic variants with small effects that explain a small amount of its heritability (Wray et al., 2018). Regardless, given the high comorbidity and indications of shared neurobiological pathways, substantial genetic overlap is to be expected, which may be leveraged to better understand these disorders. Indeed, several candidate gene studies have identified shared genetic risk factors (Ye et al., 2016) that implicate hypothesised shared mechanisms, such as chronic neuroinflammatory changes in the brain (Leszek et al., 2018). While negligible genetic overlap between AD and MDD has been reported (Gibson et al., 2017;Lutz et al., 2020), substantial genetic overlap may remain hidden from measures of global genetic correlation due to mixed directions of effects. Here, we assess the genetic overlap between AD and depression across the genome through tools that capture the extent of overlap or specific loci, regardless of directions of effect. This was followed up by analyses of the associations between shared loci and regional brain morphology in the UK Biobank (UKB) population study, providing valuable insights into their shared neurobiology.

GWAS Summary Statistics
To investigate the genetics of AD, we made use of the phase 1 summary statistics from a recent genome-wide association study (GWAS) that combined samples from the Psychiatric Genomics Consortium (PGC), the International Genomics of Alzheimer's Project (IGAP), and the Alzheimer's Disease Sequencing Project (ADSP) (Jansen et al., 2019). The phase 1 sample of this GWAS was chosen as it did not include any UKB participants, thereby preventing sample overlap with our follow-up analyses in the UKB. The summary statistics contained 9,862,739 SNPs and were based on 24,087 late-onset AD cases and 55,058 controls with European ancestry.
For the depression phenotype of the GWAS data, we obtained the summary statistics from the PGC MDD GWAS from 2019, including the 23andMe cohort (Wray et al., 2018). The construct of depression here is based on data from cohorts with MDD as well as self-reported depression, thereby closely aligning to the measure of depression that we constructed from the UKB data. We used a version of the meta-analysed summary statistics where the UKB sample was left out, to prevent sample overlap in downstream analyses. This version contained 15,507,882 SNPs for 121,198 individuals with depression and 329,421 controls.
For the post-GWAS analyses, we excluded the major histocompatibility complex (MHC) region (chr6: 26-34 MB) from both summary statistics, as well as the APOE locus (chr19: 45-45.8 MB) from the AD GWAS, in accordance with recommendations.
To estimate the genetic correlation, rg, we applied cross-trait linkage disequilibrium score regression (LDSR) (Bulik-Sullivan et al., 2015). We further applied Gaussian mixture modelling, as implemented in the MiXeR tool, to the GWAS summary statistics, to estimate distributions of causal genetic variants, i.e., unobserved functional genetic variants that influence the phenotypes under investigation (Frei et al., 2019). MiXeR achieves this by fitting Gaussian curves to the GWAS summary statistics to optimally model null and non-null effects. The shapes of these Gaussians are then used to estimate the polygenicity (the number of causal genetic variants involved) and discoverability (average effect size of the causal variants, as h2) of AD and depression. We further estimated the genetic overlap between AD and depression, as the number of causal variants shared regardless of direction of effects, through bivariate MiXeR. For the calculations of the MiXeR parameters, we made use of 9,997,231 SNPs from the 1000 Genomes Phase 3 data. Please see the MiXeR design paper for more details (Frei et al., 2019).
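As a compact reminder of what such a mixture model looks like, the block below writes out a univariate and a bivariate causal mixture in commonly used notation. It is a sketch only: the symbols (\pi, \sigma, \rho_{12}, M) are assumptions introduced here and do not appear in the text above, and the MiXeR design paper (Frei et al., 2019) remains the authoritative description of the fitted likelihood.

```latex
% Univariate: the effect of variant j is either null (point mass at zero)
% or drawn from a Gaussian with variance \sigma_\beta^2.
\beta_j \sim \pi_1\,\mathcal{N}(0,\sigma_\beta^2) + (1-\pi_1)\,\delta_0

% Bivariate: a variant is null for both traits, causal for trait 1 only,
% causal for trait 2 only, or causal for both (shared component).
(\beta_{1j},\beta_{2j}) \sim
    \pi_0\,\delta_{(0,0)}
  + \pi_1\,\mathcal{N}\!\left(0,\begin{pmatrix}\sigma_1^2 & 0\\ 0 & 0\end{pmatrix}\right)
  + \pi_2\,\mathcal{N}\!\left(0,\begin{pmatrix}0 & 0\\ 0 & \sigma_2^2\end{pmatrix}\right)
  + \pi_{12}\,\mathcal{N}\!\left(0,\begin{pmatrix}\sigma_1^2 & \rho_{12}\sigma_1\sigma_2\\ \rho_{12}\sigma_1\sigma_2 & \sigma_2^2\end{pmatrix}\right),
\qquad \pi_0+\pi_1+\pi_2+\pi_{12}=1

% With M reference variants, the number of shared causal variants is
% \pi_{12} M, and the polygenicity of trait 1 is (\pi_1+\pi_{12}) M.
```

Under this formulation the shared component is counted through \pi_{12} whatever the sign of \rho_{12}, which is why overlap can be estimated regardless of directions of effects.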
We conducted conjunctional false discovery rate (cFDR) analysis through the pleioFDR tool (https://github.com/precimed/pleiofdr) using default settings. We set an FDR threshold of 0.05 as whole-genome significance, in accordance with recommendations.

Participants
We made use of data from participants of the UKB population cohort, under accession number 27412. The composition, setup, and data gathering protocols of UKB have been extensively described elsewhere (Sudlow et al., 2015). We selected all individuals with White European ancestry, as determined by a combination of self-identification as "White British" and similar genetic ancestry based on genetic principal components (UKB field code 22006), with good quality genetic data.
We constructed a proxy measure of AD case-control status, combining information on International Classification of Disease, version 10 (ICD-10) diagnoses of dementia of the participants together with parental age and parental AD status, as described previously (Jansen et al., 2019). Based on lifetime hospital inpatient records linked to the UKB data, we made use of the ICD-10 to assign a score of 2 to any participants with a diagnosis of AD (F00 and/or G30; n = 782). All other participants received a one-unit increase for each biological parent reported to have (had) AD. Further, the contribution of each unaffected parent to the score was inversely weighted by the parent's age or age at death, namely (100 − age)/100, giving an approximate score between 0 and 2. This approach was taken in order to account for possible late-life onset AD, i.e., to minimise the labelling of individuals that will develop AD as controls. This proxy measure has been shown to be highly genetically correlated to AD status (rg = 0.81) (Jansen et al., 2019). Participants with missing data on any of the relevant questions were excluded from these analyses (n = 19,332). The final sample size was n = 390,284, with a mean age of 57.33 years (SD = 7.49), and 52.02% was female.
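The scoring rule described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code used in the study (whose analyses were run in R): the function names and the input layout are hypothetical, and clamping the weight of an unaffected parent to the range 0 to 1 for ages above 100 is an assumption that the text does not specify.

```python
from typing import Sequence, Tuple


def parent_contribution(has_ad: bool, age: float) -> float:
    """Contribution of one biological parent to the proxy AD score."""
    if has_ad:
        # A parent reported to have (had) AD adds one full unit.
        return 1.0
    # An unaffected parent adds (100 - age) / 100, so older unaffected
    # parents contribute less. Clamping to [0, 1] is an assumption here.
    return min(1.0, max(0.0, (100.0 - age) / 100.0))


def proxy_ad_score(diagnosed: bool,
                   parents: Sequence[Tuple[bool, float]]) -> float:
    """Proxy AD score between 0 and 2.

    diagnosed: participant has an ICD-10 AD diagnosis (F00 and/or G30).
    parents:   (has_ad, age or age at death) for each biological parent.
    """
    if diagnosed:
        return 2.0
    return sum(parent_contribution(has_ad, age) for has_ad, age in parents)
```

For example, an undiagnosed participant with one affected parent and one unaffected parent aged 75 would receive 1 + 0.25 = 1.25 under this sketch.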
The depression phenotype utilised in this study was constructed by assigning case status to any UKB participant with an ICD-10 diagnosis of depression (F32-34, F38-39), n = 15,238, as well as any additional participants that answered affirmatively to the question whether they had ever seen a general practitioner or psychiatrist for nerves, anxiety, tension, or depression (UKB field codes 2090 and 2010), during any UKB testing visit (n = 159,063). Control status was assigned to anyone who had answered "no" to these questions at all testing visits. This definition of depression is identical to the "broad depression phenotype" described by Howard et al. (2018), based on a GWAS of depression in the UKB, which reported that this definition led to the largest number of genome-wide significant hits, while still being highly genetically correlated with a GWAS using a strict clinical definition of MDD, rg = 0.85 (Howard et al., 2018). We excluded anyone with any missing data on these questions (n = 6,587). The final sample size was n = 403,029, with a mean age of 57.21 years (SD = 7.49), and 52.02% was female.
Our sample size for the neuroimaging analyses, following preprocessing as described below and excluding individuals with brain disorders, was n = 30,699. As the neuroimaging data collection took place several years after the initial data collection, this subsample had a mean age of 64.32 years (SD = 7.48), and 52.06% was female.

Genetic Data Pre-processing
We made use of the UKB v3 imputed data, which has undergone extensive quality control procedures as described by the UKB genetics team (Bycroft et al., 2018). After converting the BGEN format to PLINK binary format, we additionally carried out standard quality check procedures, including filtering out individuals with more than 10% missingness, SNPs with more than 5% missingness, SNPs with an INFO score below 0.8, and SNPs failing the Hardy-Weinberg equilibrium test at p = 1 × 10−9. We further set a minor allele frequency threshold of 0.001, leaving 12,245,112 SNPs.

Image Acquisition
For the analyses involving neuroimaging data, we made use of MRI data from UKB released up to March 2020. T1-weighted scans were collected from four scanning sites throughout the United Kingdom, all on identically configured Siemens Skyra 3T scanners, with 32-channel receive head coils. The UKB core neuroimaging team has published extensive information on the applied scanning protocols and procedures, which we refer to for more details (Miller et al., 2016).
The T1-weighted scans were stored locally at the secure computing cluster of the University of Oslo. We applied the standard "recon-all -all" processing pipeline of Freesurfer v5.3, performing automated surface-based morphometry and subcortical segmentation (Fischl et al., 2002;Desikan et al., 2006). From the output, we extracted all commonly studied global, subcortical, and cortical morphology measures. We excluded individuals with bad structural scan quality, as indicated by an age- and sex-adjusted Euler number [a measure of segmentation quality based on surface reconstruction complexity (Rosen et al., 2018)] more than three standard deviations lower than the scanner site mean, or with a global brain measure more than five standard deviations from the sample mean (n = 717).
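The Euler-number criterion above amounts to a per-site outlier rule, which the following Python sketch illustrates. It is not the study's code: the function name and record layout are hypothetical, the Euler numbers are assumed to be already adjusted for age and sex, and the use of the sample standard deviation is an assumption.

```python
from collections import defaultdict
from statistics import mean, stdev


def flag_low_euler(records, n_sd=3.0):
    """Return the IDs of scans whose adjusted Euler number is too low.

    records: iterable of (subject_id, scanner_site, adjusted_euler).
    A scan is flagged when its value lies more than n_sd standard
    deviations below the mean of its own scanner site.
    """
    records = list(records)
    by_site = defaultdict(list)
    for _, site, value in records:
        by_site[site].append(value)

    # One cutoff per scanner site: site mean minus n_sd site SDs.
    cutoffs = {
        site: mean(values) - n_sd * stdev(values)
        for site, values in by_site.items()
    }
    return {sid for sid, site, value in records if value < cutoffs[site]}
```

Only the lower tail is tested, since lower Euler numbers indicate poorer surface reconstructions; the five-standard-deviation rule on global brain measures would be a separate, two-sided check.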

Statistical Analyses
All downstream analyses were carried out in R v3.6.1. In all follow-up analyses, involving UKB data, we adjusted for age, sex, and the first 20 genetic principal components to control for population stratification. For the neuroimaging analyses, we additionally adjusted for scanner site, Euler number (Rosen et al., 2018), and a measure-specific global estimate for the regional measures (total surface area, mean cortical thickness, or intracranial volume). The latter was done to ensure that we are studying associations with regional brain morphology rather than global effects.
To correct for multiple comparisons, we applied spectral decomposition to the Pearson's correlation matrix of the 96 regional brain measures (Nyholt, 2004). Based on the observed eigenvalues, we estimated the effective number of independent traits in our neuroimaging analyses to be 51. We therefore set an alpha of 0.001 for these analyses.
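A minimal sketch of this correction is given below, assuming the estimator proposed by Nyholt (2004), Meff = 1 + (M − 1)(1 − Var(λ)/M), where λ are the eigenvalues of the M × M correlation matrix. Whether the study used exactly this estimator is an assumption, and the function names are hypothetical. To stay dependency-free, the sketch uses the fact that the eigenvalues of a correlation matrix sum to M and their squares sum to the sum of squared matrix entries, so their variance can be obtained without an explicit decomposition.

```python
def nyholt_meff(corr):
    """Effective number of independent traits from a correlation matrix.

    corr: square, symmetric correlation matrix as a list of lists.
    """
    m = len(corr)
    # Sum of squared eigenvalues equals the sum of squared entries.
    sum_sq = sum(r * r for row in corr for r in row)
    # Sample variance of the eigenvalues, whose mean is exactly 1.
    var_lambda = (sum_sq - m) / (m - 1)
    return 1 + (m - 1) * (1 - var_lambda / m)


def corrected_alpha(meff, family_alpha=0.05):
    """Bonferroni-style threshold over the effective number of tests."""
    return family_alpha / meff
```

With 51 effective traits, `corrected_alpha(51)` is 0.05/51, which rounds to the alpha of 0.001 used here. Uncorrelated measures give Meff = M, and perfectly correlated measures give Meff = 1.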

Global Genetic Overlap
Eighteen loci were genome-wide significant in the AD GWAS, which had an estimated SNP-based heritability, h2, of 0.05 (SE = 0.01). The depression GWAS summary statistics contained 33 significant loci, with an h2 of 0.05 (SE = 0.002), see Figure 1A. These numbers are in line with the results from the original GWAS (Wray et al., 2018;Jansen et al., 2019). Using LDSR, the two disorders showed a negligible genetic correlation of −0.03 (SE = 0.06, p = 0.60).
Through univariate mixture modelling, we found that AD has an estimated 261 causal genetic variants, with a discoverability of 2.1 × 10−4. Depression was estimated to involve 15,228 variants, with a discoverability of 6.8 × 10−6. In other words, depression was estimated to be over 50 times more polygenic than AD, and its genetic determinants approximately 30 times less discoverable. The expected sample size needed to explain half of the genetic variance was 0.5 million for AD and 10 million for depression, see Figure 1B.
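The two ratios quoted above follow directly from the reported estimates, as this short Python check shows. The variable names are illustrative; the numbers are those given in the text.

```python
# Reported univariate mixture-model estimates.
ad_causal_variants = 261
depression_causal_variants = 15228
ad_discoverability = 2.1e-4
depression_discoverability = 6.8e-6

# How many times more polygenic depression is than AD (about 58).
polygenicity_ratio = depression_causal_variants / ad_causal_variants

# How many times more discoverable AD variants are (about 31).
discoverability_ratio = ad_discoverability / depression_discoverability
```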
Bivariate mixture modelling indicated that there were 98 causal variants overlapping between the two traits, i.e., 38% of all variants for AD and 1% of all variants for depression, see Figure 1C. Given the size of the reference genome, we estimate that by chance the overlap would be approximately four variants. The fraction of concordant directions of effects for the shared variants was 0.44. The bivariate density plot, Figure 1D, illustrates the presence of mixed directions of associations for many SNPs; some SNPs have the same direction of association for both traits, while others are positively associated with AD and negatively associated with depression or vice versa. The net result of this is a negligible negative correlation, despite a large proportion of AD's causal variants overlapping with depression.
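To illustrate how sizeable overlap and a near-zero correlation can coexist, the toy simulation below draws effect sizes for the estimated numbers of causal variants and lets 44% of the shared variants have concordant signs. It is a sketch under strong simplifying assumptions (standard normal effects, equal effect variance for shared and trait-specific variants, no linkage disequilibrium) and is not the MiXeR procedure; the function name and defaults are hypothetical.

```python
import math
import random


def simulate_effect_correlation(n_trait1=261, n_trait2=15228, n_shared=98,
                                frac_concordant=0.44, seed=1):
    """Correlation of simulated causal effects between two traits."""
    rng = random.Random(seed)
    b1, b2 = [], []

    # Shared variants: non-zero effect on both traits, same sign with
    # probability frac_concordant, opposite sign otherwise.
    for _ in range(n_shared):
        effect1 = rng.gauss(0, 1)
        magnitude2 = abs(rng.gauss(0, 1))
        sign1 = math.copysign(1.0, effect1)
        sign2 = sign1 if rng.random() < frac_concordant else -sign1
        b1.append(effect1)
        b2.append(sign2 * magnitude2)

    # Trait-specific variants: zero effect on the other trait.
    for _ in range(n_trait1 - n_shared):
        b1.append(rng.gauss(0, 1))
        b2.append(0.0)
    for _ in range(n_trait2 - n_shared):
        b1.append(0.0)
        b2.append(rng.gauss(0, 1))

    # Uncentred correlation; effects have mean zero by construction.
    numerator = sum(x * y for x, y in zip(b1, b2))
    denominator = math.sqrt(sum(x * x for x in b1) * sum(y * y for y in b2))
    return numerator / denominator


share_of_ad = 98 / 261            # about 0.38 of AD's causal variants
share_of_depression = 98 / 15228  # under 0.01 of depression's
```

In this toy setting the correlation stays close to zero, because the shared variants are few relative to depression's polygenicity and their signs largely cancel, even though they make up more than a third of the AD variants.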

Locus Overlap
Through conjunctional FDR analysis, we discovered a SNP on chromosome 7, rs5011436, located in an intron of the TMEM106B gene, that was significantly associated with both traits. We replicated this association with both traits using UKB data; for AD, we found a negative relation with the number of copies of the C allele (B = −0.002, SE = 6.5 × 10−4, p = 9.1 × 10−4), whereas for depression we found a positive relation (B = 0.007, SE = 0.001, p = 3.2 × 10−9), in accordance with the directions of effects as reported in the two original GWAS.

DISCUSSION
Here we employed state-of-the-art biostatistical tools to improve our knowledge of the genetic underpinnings of the relation between AD and depression. In line with previous reports we have identified large differences in the genetic architecture of these disorders. We add new knowledge by revealing the presence of genetic overlap between them. We further illustrated how conjunctional analysis may be used to discover specific shared genetic loci, and substantially expanded on previous efforts by mapping the effects onto the brain in order to identify neurobiological mechanisms that contribute to the relation between these disorders.
We found that many of the causal variants for AD overlap with depression. This partly contradicts the previously reported negligible genetic correlation between AD and MD (Gibson et al., 2017), as well as the overall low genetic correlation reported between neurologic and psychiatric disorders (Anttila et al., 2018). However, whereas genetic correlations rely on globally consistent directions of effects between the two traits under investigation, bivariate Gaussian mixture modelling estimates the number of causal variants that have an effect on both, regardless of directions of effects. High levels of mixed directions of effects are likely to be commonplace for complex traits such as brain disorders. This can be seen in, for instance, another psychiatric disorder like schizophrenia, which has been estimated to share virtually all causal variants with educational attainment, despite a near-zero genetic correlation (Frei et al., 2019). The strong heterogeneity of depressive disorders is likely to contribute to the mixed directions of effects between the two traits, which appears in contradiction to the high levels of reported comorbidity with AD. The heterogeneity of the depression phenotype is evident from its wide range of signs and symptoms and the likely existence of several depression subtypes. This may explain how the extent of genetic overlap can be large for the less polygenic AD, yet very small for depression, which fits with numerous reports that depressive disorders are an important predictor of AD pathology while the opposite is less true (Rapp et al., 2006;Verkaik et al., 2007;Kramberger et al., 2012). We speculate that some depressive disorder subtypes will be shown to be genetically more concordant with AD than others, in line with indications that depressive disorder subtypes have significantly different genetic architectures (Jang et al., 2004;Milaneschi et al., 2016). The wide range of reported levels of comorbidity with AD across studies (Mega et al., 1999;Zubenko et al., 2003;Starkstein et al., 2005;Verkaik et al., 2007;Gracia-García et al., 2015) may also be due to this heterogeneity, as studies differ in how they define and subtype depression. A direct investigation of the relation between AD comorbidity and depressive disorder subtypes, coupled to neurobiological data, would be valuable. We postulate that studies using narrower depressive disorder subtypes would find lower polygenicity and more concordant directions of effects with AD for specific subtypes.

Figure 1. (A) GWAS results for AD and depression, with one trait shown in the top half and the other in the bottom half (red). The x-axis shows the relative genomic location, grouped by chromosome, and the red dashed lines indicate the whole-genome significance threshold of 5 × 10−8. (B) Estimated percent of genetic variance explained by SNPs surpassing the genome-wide significance threshold, on the y-axis, as a function of sample size, depicted on the x-axis on a log10 scale, for AD and depression. Current sample sizes and percentages of genetic variance explained by discovered SNPs are shown in parentheses. (C) Venn diagram depicting the estimated number of causal variants shared between AD and depression and unique to either of them. Below the diagram, we show the estimated genetic correlation. (D) Bivariate density plot, illustrating the relationship between the observed GWAS Z-values for AD (on the x-axis) and depression (on the y-axis).
Our use of cFDR to identify a specific locus shared by AD and depression is an example of how we may use genetics to improve our understanding of the neurobiology underlying the relation between these two disorders. The SNP rs5011436 is located in an intron of the gene TMEM106B, which encodes the transmembrane protein 106B. TMEM106B was the first genetic risk factor to be identified for fronto-temporal lobar degeneration (FTLD; Van Deerlin et al., 2010). Since then, it has also been reported in GWAS of both AD (Jun et al., 2016) and MD (Howard et al., 2019). The protein TMEM106B is thought to regulate lysosomal function, with a role in the clearance of TDP-43 (Nicholson and Rademakers, 2016). Both lysosomal function and specifically TDP-43 are closely related to the pathogenesis of AD (Nixon et al., 1992;Wilson et al., 2011) and MD (Modrego and Ferrández, 2004). TMEM106B expression has been shown to be downregulated in brains of individuals with AD (Satoh et al., 2014), while it has been found to be upregulated in individuals with MD (Dall'Aglio et al., 2020).
The evidence for involvement of TMEM106B in both disorders is further substantiated by our neuroimaging analyses, indicating effects on several brain regions that have been tied to both AD and depression. In particular, the corpus callosum was implicated by our analyses, in line with previous neuroimaging findings on TMEM106B (Adams et al., 2014), with higher volume of several callosal subregions for carriers of the rs5011436 C allele, the allele that we found to confer risk for depression and to be protective for AD. MDD is associated with abnormal cerebral lateralisation, and individuals with familial MDD have been found to have significantly larger callosal volume than individuals with non-familial forms of MDD (Lacerda et al., 2005), while AD has been repeatedly linked to degeneration of the corpus callosum (Di Paola et al., 2010). Thus, specific, genetically mediated forms of depression have been found to have effects on the corpus callosum in the opposite direction to those of other forms of depression and of AD.
Our estimates of heritability, polygenicity, and discoverability highlight the complexity of the genetic architecture of both disorders. While twin studies have indicated high broad heritability of AD (Gatz et al., 1997) and MD (Sullivan et al., 2000), we replicated previous findings of low SNP-based heritability, as captured by GWAS data (Lambert et al., 2013;Wray et al., 2018;Jansen et al., 2019). This possibly implicates an important role for rare variants, as well as a high degree of genetic and environmental interaction effects. Contrasting the two disorders, it is clear that AD polygenicity is relatively low, with a recent study even qualifying late-onset AD as oligogenic (Zhang et al., 2020). Depressive symptoms on the other hand are highly polygenic, partly reflecting the substantial clinical heterogeneity which may capture a broad range of conditions, each likely with partly distinct genetic determinants (Jang et al., 2004).
Regardless of these differences in genetic architectures, our analyses made clear that, with current approaches, GWAS sample sizes will need to reach millions of individuals to uncover a substantial fraction of the common genetic variance influencing both disorders.
Our findings once again underline the complexity of the genetic architectures of brain disorders, highlighting the limitations of the GWAS approach. Our power analyses suggest that, despite tremendous efforts from worldwide consortia to bring together large samples, we are only at the very beginning of uncovering the genetic determinants of AD and depression through the standard GWAS approach. Clearly, more powerful biostatistical tools are needed, ones that better match this complexity and that leverage genetic signal shared across traits of interest (Frei et al., 2019;Smeland et al., 2020;van der Meer et al., 2020) in order to lower the required sample sizes and provide more meaningful metrics. While approaches like Gaussian mixture modelling are a step in the right direction, the current implementation does still suffer from oversimplified assumptions about the nature of the genetic architecture of brain disorders; AD is enriched for rare variants, while the MiXeR analysis focuses on common variants only. Further, low polygenicity implies a handful of large genetic effects, and the distribution of those effect sizes is then more likely to deviate from a Gaussian distribution, violating model assumptions. We are developing extensions of this method that will handle such characteristics.
To conclude, in this study we provided further insights into the genetic relationship between AD and depression, providing evidence of significant genetic overlap, and neuropathological effects reflected in brain morphological changes, warranting further genetic research. However, it seems that the complex relation between AD and depression will require future research to employ larger sample sizes, cleaner phenotype definitions and further improvements of biostatistical tools. It will also be important to study interaction effects between genetic variants and between genetic and environmental factors, as well as the dynamic interplay between relevant factors over the lifespan. These all will influence the underlying biological mechanisms that account for the complex relationship between these disorders. Ultimately this knowledge may provide a path toward more effective treatments, thereby reducing the enormous burden that AD and depression place on patients and their care-givers.

DATA AVAILABILITY STATEMENT
Publicly available datasets were analysed in this study. These data can be found in the UK Biobank under accession number 27412, and can be accessed following a formal application at https://www.ukbiobank.ac.uk/enable-your-research.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by UK North West Multi-centre Research Ethics Committee. The patients/participants provided their written informed consent to participate in this study.

AUTHOR CONTRIBUTIONS
JM-S and DM conceived the study. DM and TK preprocessed the data. JM-S and DM performed all analyses, with conceptual input from MS and DL. All authors contributed to interpretation of results. JM-S and DM drafted the manuscript and all authors contributed to and approved the final manuscript.