The MAPT H1 Haplotype Is a Risk Factor for Alzheimer’s Disease in APOE ε4 Non-carriers

An ancestral inversion of 900 kb on chromosome 17q21, which includes the microtubule-associated protein tau (MAPT) gene, defines two haplotype clades in Caucasians (H1 and H2). The H1 haplotype has been linked inconsistently with AD. In a previous study, we showed that an SNP tagging this haplotype (rs1800547) was associated with AD risk in a large population from the Dementia Genetics Spanish Consortium (DEGESCO) including 4435 cases and 6147 controls. The association was mainly driven by individuals who were non-carriers of the APOE ε4 allele. Our aim was to replicate our previous findings in an independent sample of 4124 AD cases and 3290 controls from Spain (GR@ACE project) and to analyze the effect of the H1 sub-haplotype structure on the risk of AD. The H1 haplotype was associated with AD risk (OR = 1.12; p = 0.0025). Stratification analysis showed that this association was mainly driven by the APOE ε4 non-carriers (OR = 1.15; p = 0.0022). Pooled analysis of both Spanish datasets (n = 17,996) showed that the highest AD risk related to the MAPT H1/H2 haplotype was in those individuals who were the oldest [third tertile (>77 years)] and did not carry the APOE ε4 allele (p = 0.001). We did not find a significant association between H1 sub-haplotypes and AD. H1c was nominally associated but lost statistical significance after adjusting for population sub-structure. Our results are consistent with the hypothesis that genetic variants linked to the MAPT H1/H2 haplotype are tracking a genuine risk allele for AD. The fact that this association is stronger in APOE ε4 non-carriers partially explains previous controversial results and might be related to a slower alternative causal pathway less dependent on brain amyloid load.


INTRODUCTION
Dementia is related to many underlying pathologies, with Alzheimer's disease (AD) being the most common. AD is pathologically defined by the deposits of two proteins: tau, which accumulates intracellularly, and β-amyloid, which accumulates extracellularly and within the walls of the blood vessels of the central nervous system. Dementia of AD-type is a complex entity with a common clinical syndrome that is likely to be reached by different routes influenced by genetic and environmental factors. This complexity is likely to have contributed to the repeated failures of AD clinical trials. In particular, therapies based on the amyloid cascade hypothesis have not demonstrated any disease-modifying effect, even though some of them have proved effective in permanently removing brain β-amyloid plaques (Nicoll et al., 2019). This has led the pharmaceutical industry to focus on other therapeutic targets such as the tau protein (Corvol and Buée, 2019).
Neurofibrillary tangles composed of truncated and hyperphosphorylated tau proteins are hallmarks of AD pathology (Goedert, 2004). Tau protein plays an essential role in the central nervous system by promoting microtubule assembly and stability in neuronal cells. Emerging evidence supports that tau function is essential for normal synaptic mechanisms and it may be dysregulated in AD potentially through interaction with genetic risk factors in an Aβ-dependent or Aβ-independent manner (Dourlen et al., 2019). Additionally, tau has been proposed to spread through the brain from neuron to neuron by a "prion-like" mechanism (Clavaguera et al., 2015).
The H1 haplotype is further divided into sub-haplotypes, of which H1c has been associated with several neurodegenerative diseases (Heckman et al., 2019). H1c has been associated with higher levels of tau in plasma and CSF (Myers et al., 2007; Chen et al., 2017) and inconsistently with AD.
Although recent genetic studies show that several AD GWAS-associated genes, especially BIN1, are potentially involved in tau pathways (Dourlen et al., 2019), MAPT itself has not emerged until recently as a locus associated with AD. The IGAP consortium found a significant association between an SNP tagging the MAPT H1 haplotype (rs2732703) and AD in subjects not carrying APOE ε4; however, the authors concluded that their conditional analysis indicated that MAPT was probably not the causal gene (Jun et al., 2016).
The MAPT H1/H2 haplotype frequency varies across populations, with the H2 frequency being highest in the Mediterranean region and decreasing gradually with distance from that area (Donnelly et al., 2010). These differences might help to explain controversial results among studies, as genetic stratification in genetically heterogeneous populations might constitute an important confounder. It is therefore essential to study these variants in large genetically homogeneous populations. In a previous study, we showed that the MAPT H1 haplotype was strongly associated with risk of PSP, PD, and AD in 4435 cases and 6147 controls from Spain (Pastor et al., 2016). Therefore, it seems that MAPT H1/H2 haplotypes might play a relevant role within the genetic architecture of several neurodegenerative pathologies in our country. It is worth mentioning that the prevalence of the H2 haplotype in our control sample was one of the highest reported worldwide (29%) (Pastor et al., 2016).
In the present study, we aimed to replicate our previous findings in an independent sample. To do that, we assessed the association between the AD risk and the MAPT H1/H2 haplotype and H1 sub-haplotypes in 4,124 AD cases and 3,290 controls from Spain (GR@ACE/DEGESCO project).

MATERIALS AND METHODS
A detailed description of the methods and population of the GR@ACE study has been published elsewhere (Moreno-Grau et al., 2019).

Population
The GR@ACE study comprises 4,120 AD cases and 3,289 control individuals. Cases were recruited from Fundació ACE, Institut Català de Neurociències Aplicades (Barcelona, Spain). Diagnoses were established by a neurology working group according to the DSM-IV criteria for dementia and to the National Institute on Aging and Alzheimer's Association (NIA-AA) 2011 guidelines for defining AD. In the present study, we considered as AD cases those individuals with dementia who were diagnosed with probable or possible AD at any point in their clinical course.
Control individuals were recruited from three centers: Fundació ACE (Barcelona, Spain), Valme University Hospital (Seville, Spain) and the Spanish National DNA Bank Carlos III (University of Salamanca, Spain). Written informed consent was obtained from all participants. The Ethics and Scientific Committees have approved this research protocol (Acta 25/2016, Ethics Committee, Hospital Clinic I Provincial de Barcelona, Barcelona, Spain).

Genotyping, Quality Control, Imputation, and Statistical Analysis
Participants were genotyped using the Axiom 815K Spanish Biobank array (Thermo Fisher). Genotyping was performed in the Spanish National Center for Genotyping (CeGEN, Santiago de Compostela, Spain).
We removed samples with genotype call rates below 97%, samples with excess heterozygosity, duplicates, and samples that were genetically related to other individuals in the cohort or showed evidence of sample mix-up (PI_HAT > 0.1875). If a sex discrepancy was detected, the sample was removed unless the discrepancy was safely resolved. To detect population outliers of non-European ancestry (>6 SD from the European population mean), principal component analysis (PCA) was conducted using SMARTPCA from EIGENSOFT 6.1.4.
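The two threshold-based sample filters described above can be illustrated with a minimal sketch. It is not the SMARTPCA or PLINK implementation; the data layout (a dictionary of principal-component scores and a list of pairwise PI_HAT estimates) and the function names are hypothetical and introduced only for illustration. The PI_HAT cut-off of 0.1875 lies halfway between the values expected for second-degree (0.25) and third-degree (0.125) relatives.

```python
from statistics import mean, stdev


def ancestry_outliers(pc_scores, reference_ids, n_sd=6.0):
    """Flag samples lying more than n_sd standard deviations from the
    reference-population mean on any principal component.

    pc_scores: dict mapping sample ID -> list of PC scores (hypothetical layout).
    reference_ids: IDs of the samples defining the reference (European) cluster.
    """
    n_pcs = len(next(iter(pc_scores.values())))
    outliers = set()
    for k in range(n_pcs):
        ref_values = [pc_scores[s][k] for s in reference_ids]
        mu, sd = mean(ref_values), stdev(ref_values)
        for sample, scores in pc_scores.items():
            if abs(scores[k] - mu) > n_sd * sd:
                outliers.add(sample)
    return outliers


def related_pairs(pairwise_ibd, threshold=0.1875):
    """Return sample pairs whose estimated proportion of IBD sharing
    (PI_HAT) exceeds the threshold.

    pairwise_ibd: iterable of (sample_a, sample_b, pi_hat) tuples.
    """
    return [(a, b) for a, b, pi_hat in pairwise_ibd if pi_hat > threshold]
```

In practice one member of each flagged pair would be dropped, and flagged ancestry outliers would be excluded before the association analysis.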
We removed variants with a call rate < 95% or that grossly deviated from Hardy-Weinberg equilibrium in controls (P-value ≤ 1 × 10⁻⁴), markers with a different missing rate between cases and controls (P-value < 5 × 10⁻⁴ for the difference), and markers with a minor allele frequency (MAF) below 0.01. Imputation was carried out using the Haplotype Reference Consortium (HRC) panel on the Michigan Imputation Server (https://imputationserver.sph.umich.edu). Only common markers (MAF > 0.01) with a high imputation quality (R² > 0.30) were selected to conduct downstream association analyses.
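The variant-level filters above amount to a handful of threshold comparisons. The following sketch encodes them as two predicates, one for genotyped and one for imputed markers. The `VariantStats` record and its field names are assumptions made for illustration; in a real pipeline these statistics would come from the QC tool's output rather than from hand-built objects.

```python
from dataclasses import dataclass


@dataclass
class VariantStats:
    """Per-variant summary statistics (hypothetical record layout)."""
    variant_id: str
    call_rate: float           # fraction of samples with a called genotype
    hwe_p_controls: float      # Hardy-Weinberg P-value in controls
    missingness_diff_p: float  # P-value for case/control missing-rate difference
    maf: float                 # minor allele frequency


def passes_genotyping_qc(v: VariantStats) -> bool:
    """Keep a genotyped variant only if it clears every filter in the text."""
    return (
        v.call_rate >= 0.95                # removed if call rate < 95%
        and v.hwe_p_controls > 1e-4        # removed if HWE P <= 1e-4 in controls
        and v.missingness_diff_p >= 5e-4   # removed if differential missingness P < 5e-4
        and v.maf >= 0.01                  # removed if MAF below 0.01
    )


def passes_imputation_qc(maf: float, r2: float) -> bool:
    """Keep an imputed marker only if it is common and well imputed."""
    return maf > 0.01 and r2 > 0.30
```

For example, a marker with call rate 0.99, HWE P-value 0.3, differential-missingness P-value 0.2 and MAF 0.25 passes the genotyping filter, whereas the same marker with HWE P-value 1 × 10⁻⁵ does not.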

Statistical Analysis
Allelic and genotypic frequencies were compared using χ² statistics. Adjusted analyses were performed using multiple logistic regression. We used rs1800547 to tag the MAPT H2 haplotype. Additionally, we used six tagging variants (rs1467967, rs242557, rs3785883, rs2471738, rs8070723, rs7521) to construct the most common MAPT H1 sub-haplotypes, as previously described (Allen et al., 2014). To control for population sub-structure, results were adjusted for the four main principal components detected in this population (Moreno-Grau et al., 2019). All analyses were performed in PLINK 1.7.
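For readers less familiar with the allelic χ² comparison mentioned above, the sketch below computes a Pearson χ² statistic (1 degree of freedom, no continuity correction), its P-value, and the unadjusted odds ratio from a 2 × 2 table of allele counts. It is a generic textbook calculation, not a description of what PLINK does internally, and the function name and the counts in the usage note are purely illustrative.

```python
import math


def allelic_chi2(a, b, c, d):
    """Pearson chi-square test on a 2x2 allele-count table.

    a, b: counts of the tested allele / other allele in cases
    c, d: counts of the tested allele / other allele in controls
    Returns (chi2, p_value, odds_ratio).
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    odds_ratio = (a * d) / (b * c)
    return chi2, p_value, odds_ratio
```

With made-up counts of 60 and 40 alleles in cases versus 40 and 60 in controls, `allelic_chi2(60, 40, 40, 60)` gives a χ² of 8.0 and an odds ratio of 2.25. Adjusted analyses, such as the logistic regression with principal components used in this study, require a regression framework and are not covered by this sketch.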

RESULTS
We included 3290 controls with a mean age of 54.3 ± 14.4 years, of whom 48.9% were female, and 4124 AD cases with a mean age of 79.0 ± 7.5 years, of whom 69.6% were female. No gross deviation from Hardy-Weinberg equilibrium was found in controls for any of the MAPT variants studied (Table 1). Table 2 shows the allelic and genotypic frequency distribution of the SNP rs1800547 tagging the H1/H2 haplotype. We found a statistically significant overrepresentation of the MAPT H1 haplotype, present in 73.3% of AD cases compared to 71.1% of controls (p = 0.0025). When we stratified the sample by APOE ε4 status, the association of the H1 haplotype was driven by non-carriers of APOE ε4 (p = 0.0022) (Table 2). The association followed exactly the same pattern as in our previous study (Pastor et al., 2016). Pooling both Spanish populations confirmed that MAPT H1 was significantly more common in AD cases than in controls (73.5 versus 70.7%, respectively; p = 1.0 × 10⁻⁵), and this association was predominantly due to the APOE ε4 non-carriers (p = 8.0 × 10⁻⁵) (Table 2).
Table 3 shows the sub-haplotypes of MAPT in cases and controls. In addition to the protective effect of H2, only H1c was statistically significantly associated with AD. However, when we adjusted for the four main genetic principal components, H1c was no longer statistically significant. After stratifying by APOE ε4 these results did not change substantially, and in addition to H2, only two rare sub-haplotypes (H1u and H1v) were nominally associated with AD. After adjustment, none of the associations survived correction for multiple comparisons (Table 4).
Figure 1 shows the genotypic distribution of the SNP rs1800547 (MAPT H1/H2) stratified by APOE ε4 across age tertiles for the pooled population. In the entire available sample of 15,522 individuals, we observed that within the APOE ε4 non-carriers the association between MAPT H1/H2 and AD increased with age, and it was stronger in the oldest individuals.
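As a rough arithmetic check on the rs1800547 frequencies reported above, an unadjusted allelic odds ratio can be recovered directly from the H1 frequencies in cases and controls. The sketch below assumes only the rounded percentages quoted in the text for the GR@ACE sample (73.3% and 71.1%); the function name is illustrative, and the result is approximate because the inputs are rounded and no covariates are taken into account.

```python
def odds_ratio_from_frequencies(freq_cases, freq_controls):
    """Unadjusted odds ratio for an allele, given its frequency
    in cases and in controls (both as proportions between 0 and 1)."""
    odds_cases = freq_cases / (1.0 - freq_cases)
    odds_controls = freq_controls / (1.0 - freq_controls)
    return odds_cases / odds_controls


# H1 frequency in GR@ACE cases and controls, as quoted in the text.
or_h1 = odds_ratio_from_frequencies(0.733, 0.711)
print(f"Approximate unadjusted OR for H1: {or_h1:.2f}")
```

With these inputs the ratio is about 1.12, in line with the OR reported for the H1 haplotype in this sample.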

DISCUSSION
Our data, from a large and homogeneous single-country population, show that individuals carrying the H1 MAPT haplotype are at higher risk of developing AD dementia. The association is predominantly present in the APOE ε4 non-carriers and is stronger in the oldest individuals. These results replicate our previous findings (Pastor et al., 2016) and are concordant with the IGAP study (Jun et al., 2016), which showed an association of another SNP tagging H1 (rs2732703) with AD only in the APOE ε4 non-carriers.
Our pooled analysis, including 15,522 individuals, strongly supports an etiological role of the MAPT region in AD. This association has been difficult to replicate, and results of previous studies assessing the MAPT H1/H2 haplotype as a risk factor for AD have been controversial, although some of them were considerably under-powered (Russ et al., 2001; Mukherjee et al., 2007; Babić Leko et al., 2018). A robust statistical association only emerged in large sample studies and meta-GWAS, after stratifying by APOE ε4 (Jun et al., 2016; Pastor et al., 2016). An alternative interpretation of our results would be that the causal variant is in linkage disequilibrium with the MAPT H1 haplotype but lies outside the MAPT gene, as other authors have suggested (Jun et al., 2016). A less likely explanation might be contamination of the APOE ε4 non-carrier group by non-AD tauopathies.
There are several factors that might be related to our findings. Taken together, our two studies comprise one of the largest single-country populations assessing MAPT H1/H2 and AD risk to date. This is of special value, as the inversion haplotype frequency has been shown to differ significantly across populations, and it is estimated to be 20% in Europeans, 6% in Africans, and less than 1% in East Asians (Holzer et al., 2004). This ethnic variability might cause population sub-structure, biasing the results in countries with a high degree of population admixture. This is likely to be less problematic in our study, as our sample comes from a single country and we have tested population sub-structure in Spain, which does not represent a substantial problem for analyses of common genetic variants. Additionally, in our sub-haplotype analysis, we controlled for population sub-structure by adjusting for genetic principal components. It is worth mentioning that, although the inversion is commonly reported to be found at a frequency of around 20% throughout Europe, it shows a great range of frequencies within Europe (from 5 to 37.5%). The H2 haplotype, which identifies the inversion, is most frequent around the Mediterranean, decreasing outward in all directions (Donnelly et al., 2010). Our controls presented the inversion in nearly 30% of individuals; this high frequency of the H2 haplotype has increased our power to detect the association compared to other populations in which this variant is less prevalent.
A potential limitation of our study is the age difference between cases and controls, the latter being significantly younger. However, this could jeopardize the validity of our findings only in the case of survival bias and, to our knowledge, the MAPT H1/H2 haplotype has not been associated with mortality. The most likely consequence of our age imbalance is a decrease in our power to detect the association, as some of the controls carrying the H1 haplotype may still develop AD in the future. It is therefore likely that we are underestimating the true association between the H1/H2 haplotype and AD.
MAPT H1 has been associated with many neurodegenerative diseases: PSP, PD, CBD, FTD, and AD. It has been reported that MAPT H1 is more efficient at driving gene expression than the H2 haplotype (Kwok et al., 2004). This has been shown to be particularly true of the H1c sub-haplotype (Myers et al., 2007). However, the H1 sub-haplotype association with AD is controversial, and H1c findings have been difficult to replicate (Abraham et al., 2009; Allen et al., 2014). Population stratification may have played a role. In our data, H1c was nominally associated with AD; however, statistical significance was lost after adjusting for the principal components, supporting the notion that no H1 sub-haplotype specifically increases AD risk. Stratification by APOE ε4 did not substantially change these results.
Our results add further evidence for an etiological role of MAPT gene variants in clinical AD, supporting the role of the APOE ε4 allele as a modulator of this association. As seen in our previous study in a population from Spain (Pastor et al., 2016) and by the international consortium IGAP (Jun et al., 2016), this association is significantly stronger in APOE ε4 non-carriers. Recent studies with tau and amyloid PET support a view of AD as a tauopathy driven by amyloid, suggesting that tau pathology would appear in middle temporal lobe earlier than amyloid deposits, but the co-occurrence of both would be needed for tau pathology to expand beyond the temporal lobes (Schöll et al., 2016). This is also coherent with the new findings of AD meta-GWAS, which show the involvement of both amyloid and tau pathways (Dourlen et al., 2019; Kunkle et al., 2019). Therefore, it seems that tau and amyloid deposits might follow, at least initially, independent trajectories up to the point when they reach a threshold at which β-amyloid might accelerate tau pathology. A recent single-case publication showing that a patient with a presenilin 1 mutation was resistant to cognitive impairment, likely due to a homozygous mutation in APOE, supports the hypothesis that APOE might play an important role in this acceleration of tau pathology (Arboleda-Velasquez et al., 2019). On the other hand, APOE status is associated with the prevalence of brain amyloid pathology, as shown by a large PET and CSF study in a non-demented population, which found that APOE ε4 carriers had a two to three times higher prevalence than non-carriers (Jansen et al., 2015).
We hypothesize that the MAPT H1 variant might increase the risk of tau pathology, which might be related to different amyloid thresholds needed to trigger tau pathology. We speculate that the association is only significant in the APOE ε4 non-carriers because in the carriers the amyloid would mask the H1 MAPT effect, while in the non-carriers the "tau etiologic pathway" would play a more relevant role and might be able to increase AD risk with a lower amyloid involvement. This phenomenon probably takes longer to develop, which might explain why the association is not significant in the youngest patients and is stronger in the third tertile. This hypothesis could be tested by studying the trajectory of individuals classified according to APOE ε4 status and MAPT H1/H2 haplotype in prospective cohorts with sequential amyloid and tau PET assessments.
Our study highlights the complexity of AD and suggests the existence of different pathogenic routes influenced by the genetic background. In the search for successful therapeutic strategies, it will be very important to take this mechanistic diversity into account and to target specifically the pathology demonstrated in each affected individual.

DATA AVAILABILITY STATEMENT
The datasets generated for this study can be accessed at https://ega-archive.org/studies/EGAS00001003424.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by Acta 25/2016, Ethics Committee, Hospital Clinic I Provincial de Barcelona, Barcelona, Spain. The patients/participants provided their written informed consent to participate in this study.

AUTHOR CONTRIBUTIONS
PS-J and AR: data collection, data analysis, study design, manuscript drafting, and manuscript critical review. SM, IR, IH, SV, MA, LM, PG, CL, SL-G, ER-R, and AO: data collection, data analysis, and manuscript critical review. LT and MB: data collection, data analysis, manuscript critical review, and study design.

ACKNOWLEDGMENTS
The Genome Research at Fundacio ACE/Dementia Genetics Spanish Consortium (GR@ACE/DEGESCO) would like to thank the patients and controls who participated in this project. The GR@ACE/DEGESCO GWAS program was funded by Grifols SA, Fundacion Bancaria "La Caixa," and Fundació ACE, Institut Català de Neurociències Aplicades. PS-J and AR have also received support from grant PI16/01861, Accion Estrategica en Salud, integrated in the Spanish National I+D+i Plan and financed by Instituto de Salud Carlos III (ISCIII)-Subdireccion General de Evaluacion and the Fondo Europeo de Desarrollo Regional (FEDER, "Una Manera de Hacer Europa"). PS-J was supported by IDIVAL, Instituto de Salud Carlos III [Fondo de Investigacion Sanitario, PI08/0139, PI12/02288, PI16/01652, JPND (DEMTEST PI11/03028)], and the CIBERNED program. We thank Biobanco Valdecilla for their support. LM was supported by Consejería de Salud de la Junta de Andalucía (Grant PI-0001/2017). DEGESCO was also sponsored by the Centro de Investigación Biomédica en Red sobre Enfermedades Neurodegenerativas (CIBERNED, Spain). Control samples and data from patients included in this study were provided in part by the National DNA Bank Carlos III (www.bancoadn.org, University of Salamanca, Spain) and Hospital Universitario Virgen de Valme (Sevilla, Spain), and they were processed following standard operating procedures with the appropriate approval of the Ethical and Scientific Committee. The genotyping service to generate GR@ACE/DEGESCO GWAS data was carried out at CEGEN-PRB3-ISCIII; it was supported by grant PT17/0019 of the PE I+D+i 2013-2016, funded by ISCIII and ERDF. The GR@ACE/DEGESCO consortia would also like to thank all researchers contributing to this project. A complete list of collaborators involved in the GR@ACE/DEGESCO GWAS can be found at https://ciberned.es/proyectos/degesco.html.