Recent Consanguinity and Outbred Autozygosity Are Associated With Increased Risk of Late-Onset Alzheimer’s Disease

Prior work in late-onset Alzheimer’s disease (LOAD) has resulted in discrepant findings as to whether recent consanguinity and outbred autozygosity are associated with LOAD risk. In the current study, we tested the association of consanguinity and outbred autozygosity with LOAD in the largest such analysis to date, in which 20 LOAD GWAS datasets were retrieved through public databases. Our analyses were restricted to eight distinct ethnic groups: African–Caribbean, Ashkenazi–Jewish European, European–Caribbean, French–Canadian, Finnish European, North-Western European, South-Eastern European, and Yoruba African, for a total of 21,492 unrelated subjects (11,196 LOAD and 10,296 controls). Recent consanguinity determination was performed using FSuite v1.0.3, according to subjects’ ancestral background. The level of autozygosity in the outbred population was assessed by calculating inbreeding estimates based on the proportion (FROH) and the number (NROH) of runs of homozygosity (ROHs). We analyzed all eight ethnic groups using a fixed-effect meta-analysis, which showed a significant association of recent consanguinity with LOAD (N = 21,481; OR = 1.262, P = 3.6 × 10⁻⁴), independently of APOE*4 (N = 21,468; OR = 1.237, P = 0.002) and years of education (N = 9,257; OR = 1.274, P = 0.020). Autozygosity in the outbred population was also associated with an increased risk of LOAD, both for FROH (N = 20,237; OR = 1.204, P = 0.030) and NROH metrics (N = 20,237; OR = 1.019, P = 0.006), independently of APOE*4 (FROH: N = 20,225; OR = 1.222, P = 0.029; NROH: N = 20,225; OR = 1.019, P = 0.007). By leveraging the Alzheimer’s Disease Sequencing Project (ADSP) whole-exome sequencing (WES) data, we determined that LOAD subjects do not show an enrichment of rare, risk-enhancing minor homozygote variants compared to the control population.
A two-stage recessive GWAS using ADSP data from 201 consanguineous subjects in the discovery phase, followed by validation in 10,469 subjects, led to the identification of RPH3AL p.A303V (rs117190076) as a rare minor homozygote variant increasing the risk of LOAD [discovery: Genotype Relative Risk (GRR) = 46, P = 2.16 × 10⁻⁶; validation: GRR = 1.9, P = 8.0 × 10⁻⁴]. These results confirm that recent consanguinity and autozygosity in the outbred population increase risk for LOAD. Subsequent work, with increased sample sizes of consanguineous subjects, should accelerate the discovery of non-additive genetic effects in LOAD.


INTRODUCTION
The impact of consanguinity on reproduction and Mendelian disorders is well known and documented (Bittles and Black, 2010). In contrast, very little has been published on the effects of consanguinity on late-onset diseases, even though inbreeding may have a prominent influence on late-onset traits (Rudan et al., 2003). Recessive inheritance of complex phenotypes can be linked to long [≥1-megabase (Mb)] runs of homozygosity (ROHs), which are indicative of recent consanguinity (Ghani et al., 2013). Levels of homozygosity vary by population owing to the evolutionary distance of different populations from the ancient migration events that led to elevated homozygosity (Pemberton et al., 2012; Kang et al., 2016). Several studies have been carried out in late-onset Alzheimer's disease (LOAD) cohorts from different ethnicities, including Caribbean-Hispanics (Ghani et al., 2013), African Americans (Ghani et al., 2015), Wadi-Ara Arabs (Sherva et al., 2011), and Northern-Europeans (Nalls et al., 2009a; Sims et al., 2011), with the aim of determining the impact of ROHs on LOAD. The studies carried out in Caribbean-Hispanic (Ghani et al., 2013) and African American (Ghani et al., 2015) populations both demonstrated an association of long ROHs with LOAD, thus suggesting a link between recent consanguinity and LOAD. An association between consanguinity and LOAD was also demonstrated in a genealogical study of the Saguenay region in Québec (Vézina et al., 1999). Conversely, in the small ethnic isolate of Wadi-Ara Arabs (Sherva et al., 2011), the average degree of inbreeding was significantly higher in controls compared to cases.
Moreover, the two studies carried out in Caucasians (Nalls et al., 2009a; Sims et al., 2011) showed discordant findings: the British-Irish study (Sims et al., 2011) displayed no association of number of ROHs with LOAD, while the mainly North-Western European cohort of neuropathologically verified subjects from the TGenII cohort (Nalls et al., 2009a) showed a suggestive increased number of ROHs in LOAD cases compared to controls. In sum, the results vary considerably by ancestral background, thus failing to provide a clear picture of the overall impact of homozygosity on LOAD risk.
In the present work we tested the association of consanguinity and autozygosity with LOAD by leveraging a large collection of publicly available GWAS data. To this aim, we determined the individual ancestry of subjects belonging to 20 independent GWAS datasets and pooled the consanguineous subjects according to their respective ethnic group. This step was followed by an association analysis between consanguinity in LOAD cases against older, cognitively healthy controls. We also tested the overall impact of genome-wide autozygosity in the outbred population. Finally, we leveraged Whole-Exome Sequencing (WES) data from the Alzheimer's Disease Sequencing Project (ADSP) (Beecham et al., 2017) to test the global burden of rare minor homozygote variants in LOAD and to perform a two-stage recessive GWAS using 201 consanguineous subjects in the discovery phase followed by validation in 10,469 subjects.

MATERIALS AND METHODS

Subjects
Twenty LOAD GWAS datasets were obtained from publicly available data repositories (Supplementary Table S1). The 20 datasets have been described in previous studies (Li et al., 2008; Filippini et al., 2009; Lee et al., 2011; Naj et al., 2011; Zhang et al., 2013; Clark et al., 2014; Proitsi et al., 2014; Saykin et al., 2015). Details of the participating studies and genotyping platforms used are provided in Supplementary Table S1. The Alzheimer's Disease Neuroimaging Initiative (ADNI) (Saykin et al., 2015) was launched in 2003 as a public-private partnership, led by Principal Investigator Michael W. Weiner, MD. The primary goal of ADNI has been to test whether serial magnetic resonance imaging, positron emission tomography, other biological markers, and clinical and neuropsychological assessment can be combined to measure the progression of mild cognitive impairment and early Alzheimer's disease. Whole-exome sequencing from the discovery phase of the Alzheimer's Disease Sequencing Project (ADSP) (Beecham et al., 2017) was obtained through the National Institute on Aging Genetics of Alzheimer's Disease Data Storage Site (NIAGADS) and it includes 5,096 Alzheimer's disease (AD) cases and 4,965 controls, with an additional enriched sample set comprising 853 AD cases from multiple affected families and 171 Hispanic controls. This was a re-analysis of de-identified data available from shared data repositories. The study protocol was granted an exemption by the Stanford Institutional Review Board because the analyses were carried out on "de-identified, off-the-shelf" data.

Inclusion Criteria, Quality Control (QC) Pipeline, Ancestry Determination and Imputation
(Chang et al., 2015). A comprehensive flowchart of the data QC/harmonization/ancestry-determination steps applied to the full dataset is reported as Figure 1.
Subjects with autosome missingness (≥5%) and/or X-chromosome missingness (≥5%) within the same dataset, age below 60 years, age information missing, or phenotype inconsistency (missing phenotype, diagnosis of mild cognitive impairment or other neurodegenerative phenotype) were excluded from the analysis (Supplementary Table S2).
Individual ancestry was determined with SNPweights v.2.1 (Chen et al., 2013), using reference populations from the 1000 Genomes Project (1KGP) (1000 Genomes Project Consortium, Auton et al., 2015). By applying an ancestry percentage cut-off ≥ 80%, the samples were stratified into the five super populations, South-Asians (SAS), East-Asians (EAS), Americans (AMR), Africans (AFR), and Europeans (EUR) (Supplementary Table S2). Since most of the samples belonged to the European population, we also determined their ancestry percentage according to four major ethnicities, North-Western, South-Eastern, Ashkenazi-Jewish, and Finnish Europeans, using reference populations available both from SNPweights v.2.1 (Chen et al., 2013) and 1KGP (1000 Genomes Project Consortium, Auton et al., 2015). European subjects were assigned to one of the above-mentioned ethnicities when at least 50% of their ancestry was attributable to it (Supplementary Table S3).
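The cut-off rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `assign_group` and the dictionary input are hypothetical, the ancestry fractions are taken as already estimated (e.g., by SNPweights), and subjects who do not reach the cut-off are simply returned as unassigned.

```python
from typing import Dict, Optional


def assign_group(ancestry: Dict[str, float], cutoff: float) -> Optional[str]:
    """Return the label with the largest ancestry fraction if it meets the cut-off.

    `ancestry` maps a population label (e.g. "EUR") to an estimated fraction
    between 0 and 1. Returns None when no label reaches `cutoff`.
    Hypothetical helper, not part of SNPweights.
    """
    label, fraction = max(ancestry.items(), key=lambda item: item[1])
    return label if fraction >= cutoff else None


# Super-population assignment uses a cut-off of 0.80,
# European sub-ancestry assignment a cut-off of 0.50.
super_pop = assign_group({"EUR": 0.85, "AFR": 0.10, "AMR": 0.05}, cutoff=0.80)
sub_pop = assign_group({"NWE": 0.55, "SEE": 0.30, "AJE": 0.10, "FIN": 0.05}, cutoff=0.50)
```

With a 0.50 cut-off, two labels could in principle tie at exactly 50%; the sketch returns whichever comes first in the dictionary and does not attempt to resolve that case.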
We assigned French-Canadian (FCN) ancestry to subjects included in GenADA GWAS (Li et al., 2008) when they both reported Canada as country of origin for all the grandparents and French as their first spoken language.
Most subjects belonging to the Columbia University Study of Caribbean Hispanics with Familial and Sporadic Late Onset Alzheimer's disease (CIDR) (Lee et al., 2011) had admixed ancestry (Supplementary Table S2); therefore, we stratified the subjects into three groups according to their prevalent ancestral background. This stratification allowed the definition of one dataset composed of African-Caribbean subjects (African ancestry ≥ 50%), one dataset composed of European-Caribbean subjects (European ancestry ≥ 50%), and one dataset including highly admixed subjects (less than 50% of ancestry attributable to a single 1KGP super-population; 1000 Genomes Project Consortium, Auton et al., 2015). Only the African-Caribbean and the European-Caribbean datasets were considered for the analyses. Subjects with genetic ancestry estimates discordant from self-reported ancestry were excluded from the analyses. Next, datasets were tested for the presence of consanguineous subjects using FSuite v.1.0.3 (Gazal et al., 2014). Consanguineous female subjects were flagged to avoid their exclusion because of apparent sex-inconsistency (e.g., due to increased homozygosity at X-chromosome SNPs). Subjects showing sex-inconsistency were excluded along with the possibly contaminated samples (heterozygosity F ≤ −0.03; more than 25 relatives [identity-by-descent (IBD) ≥ 0.0625, equivalent to a 3rd-degree relative] within the same dataset).
With the aim of maximizing the efficiency of quality control procedures and harmonizing the GWAS results after imputation, we collapsed the genotyping data from the 20 GWAS (when needed), according to sample ethnicity and the number of SNPs shared across the SNP-array platforms. Thus, we defined five groups, reported in Supplementary Table S4, where the subjects from different GWAS were collapsed and further QCed to remove SNPs with a call rate ≤ 95%; SNPs with Minor Allele Frequency (MAF) ≤ 1%; SNPs with MAF deviating more than 10% from the MAF reported in 1KGP for the corresponding population; SNPs with differential missingness between cases and controls (P < 0.05); SNPs deviating from Hardy-Weinberg Equilibrium (HWE) in controls (P < 5 × 10⁻⁵); tri-allelic SNPs; and SNPs where the alleles are mismatched compared to the 1KGP reference sequence. A/T and C/G SNPs were removed prior to imputation.
All the datasets were phased and imputed using the Michigan Imputation Server, with the Haplotype Reference Consortium r1.1 2016 European panel (McCarthy et al., 2016) for Europeans, the 1KGP Phase 3 African panel (1000 Genomes Project Consortium, Auton et al., 2015) for African Indianapolis-Ibadan subjects, and the Consortium on Asthma among African-ancestry Populations in the Americas (CAAPA) panel (Mathias et al., 2016) for admixed Caribbean subjects. After imputation, SNPs with an r² quality score ≤ 0.7 and MAF ≤ 0.01 were excluded. Results of the imputation process are provided in Supplementary Table S5 and Supplementary Figure S1.
For the statistical analyses, inter-dataset duplicates (IBD ≥ 0.95) were removed from the dataset having the lowest SNP coverage, while, in case of relatedness (IBD ≥ 0.0625), the affected or older subject was kept, independently of SNP coverage (Supplementary Tables S6-S13).

Consanguinity Determination
Consanguinity determination was performed using FSuite v1.0.3 (Gazal et al., 2014), both at the pre- and post-imputation stage on QCed SNPs, according to each subject's ancestral background. Results were concordant in 98% of the subjects analyzed. Discordant subjects were kept in the subsequent analyses as outbred, since their ROHs may be somatic Copy Number Variations or linked to a specific ancestral background (e.g., ethnic minorities such as Acadians, Sardinians, etc.) not captured by our ancestry-determination pipeline.
Subjects showing a homozygous region over 10 Mb on only one chromosome were considered carriers of putative uniparental isodisomy (UPD), according to the homozygosity cut-off previously reported (Papenhausen et al., 2011). UPD carriers were excluded from the association testing of consanguinity with LOAD since they represent subjects affected by chromosomal alterations, not the result of consanguineous unions.
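As an illustration only, the putative-UPD rule above could be expressed as follows. The sketch assumes one reading of the criterion (a subject is flagged when homozygous regions longer than 10 Mb occur on exactly one chromosome); the function name, the input layout, and the strict inequality are assumptions and not part of FSuite or of the cited cut-off definition.

```python
from typing import Iterable, Tuple

# Length threshold for a putative UPD segment, in base pairs (10 Mb).
UPD_MIN_LENGTH_BP = 10_000_000


def is_putative_upd(rohs: Iterable[Tuple[str, int]]) -> bool:
    """Flag a subject whose long homozygous regions sit on a single chromosome.

    `rohs` is an iterable of (chromosome, length_in_bp) pairs for one subject.
    Hypothetical helper written for illustration.
    """
    long_chromosomes = {chrom for chrom, length in rohs if length > UPD_MIN_LENGTH_BP}
    return len(long_chromosomes) == 1


# One 12-Mb region on chromosome 9 plus a short ROH elsewhere: flagged.
flagged = is_putative_upd([("9", 12_000_000), ("3", 1_500_000)])
# Long regions on two different chromosomes: not flagged by this rule.
not_flagged = is_putative_upd([("9", 12_000_000), ("3", 11_000_000)])
```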

ROH Calling and Burden Analysis
Runs of homozygosity (≥1 Mb) were determined for each ethnic group separately using PLINK 1.9 (Chang et al., 2015), following previously reported guidelines (Keller et al., 2012). GWAS datasets were pruned for strong LD (MAF ≥ 0.01, r² ≤ 0.1) and ROHs were defined as being ≥ 65 consecutive homozygous SNPs with no heterozygote calls allowed and a density greater than 1 SNP per 200 kb. For each subject, we summed the total length of all their ROHs in the autosome and divided by the total SNP-mappable autosomal distance (2.77 × 10⁹ bases) to derive FROH, the proportion (between 0 and 1) of the autosome in ROHs, as previously described (Chang et al., 2015). FROH was used as the predictor of case-control status in ROH burden analyses.
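The FROH and NROH definitions above reduce to simple arithmetic once ROH segments have been called. The following minimal sketch assumes that ROH lengths per subject are already available in base pairs (for instance, parsed from ROH-calling output); the function name and input format are hypothetical.

```python
from typing import Iterable, Tuple

# Total SNP-mappable autosomal distance used as the denominator (bases).
AUTOSOME_MAPPABLE_BP = 2.77e9
# Minimum ROH length considered (1 Mb).
MIN_ROH_BP = 1_000_000


def froh_and_nroh(roh_lengths_bp: Iterable[int]) -> Tuple[float, int]:
    """Return (FROH, NROH) for one subject from autosomal ROH lengths in bp.

    ROHs shorter than 1 Mb are ignored. Hypothetical helper for illustration.
    """
    kept = [length for length in roh_lengths_bp if length >= MIN_ROH_BP]
    return sum(kept) / AUTOSOME_MAPPABLE_BP, len(kept)


# Two ROHs of at least 1 Mb are counted; the 0.5-Mb segment is dropped.
froh, nroh = froh_and_nroh([2_770_000, 1_000_000, 500_000])
```

On this scale, an FROH of 0.003 corresponds to roughly 8.3 Mb of the autosome lying in ROHs (0.003 × 2.77 × 10⁹ bases).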

Population Genetics, WES Variant Annotation and Statistical Analyses
Inter-population structure was examined using ADMIXTURE (Alexander et al., 2009). Intra-population structure for each ancestral group was determined using principal components (PCs) obtained from EIGENSOFT v.6.1 (Price et al., 2006) using pruned (r² ≤ 0.1), directly genotyped SNPs. The association of consanguinity/autozygosity with LOAD was tested using three different logistic regression models: (a) MODEL1: adjusting for each subject's age at LOAD onset (when not available, we used the subject's age at last visit or death), sex, the first three PC eigenvalues from population structure and GWAS imputation group (only for Ashkenazi-Jewish Europeans, Finnish Europeans, North-Western Europeans and South-Eastern Europeans, since those subjects were genotyped on multiple platforms); (b) MODEL2: adding APOE*4 dose to the list of covariates used in MODEL1; (c) MODEL3: adding "years of education" (EDU) to the list of covariates used in MODEL2.
Since EDU is available only for 43.1% of the full dataset (9,260 out of 21,492 subjects), we fitted MODEL1 and MODEL2 to this restricted subset of subjects as well, to allow an appropriate comparison of the effect estimates with MODEL3. The association of EDU with FROH was tested by linear regression, considering MODEL2 and adjusting for diagnosis status. The analyses were carried out for each ethnic group separately and combined using a fixed-effect meta-analysis implemented in GWAMA (Mägi and Morris, 2010).
GWAS participants were mapped to ADSP WES participants through IBD estimation using ∼30K common, overlapping SNPs (MAF > 0.01) between the two datasets. We considered matching samples as those with a pair-wise coefficient of relatedness above 99%. The burden of rare recessive variants was tested using a general linear model, adjusting for sex, age, APOE*4 dose and ethnicity. ADSP WES data were annotated using Variant Effect Predictor (VEP) (McLaren et al., 2016). Recessive variants having a CADD (Rentzsch et al., 2019) score ≥ 15 and a MAF < 10% were considered when testing the global burden of rare recessive variants in LOAD and in a two-stage GWAS by applying the Recessive-Allele Frequency Test (RAFT) statistic (Lim et al., 2014).
Statistical significance was set at P < 0.05 for all the association testing, while for the two-stage GWAS, we applied Bonferroni's correction according to the number of recessive variants tested in the discovery [P < 1.8 × 10⁻⁵ (0.05/2,767)] and replication [P < 0.007 (0.05/7)] phases.
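For readers unfamiliar with fixed-effect meta-analysis, the sketch below shows the standard inverse-variance weighting of per-group log odds ratios. It is a generic illustration, not a reproduction of the GWAMA implementation; the function name and the example inputs are hypothetical and do not correspond to any estimate reported in this study.

```python
import math
from statistics import NormalDist
from typing import Sequence, Tuple


def fixed_effect_meta(log_ors: Sequence[float], ses: Sequence[float]) -> Tuple[float, float, float]:
    """Inverse-variance fixed-effect pooling of per-group log odds ratios.

    Returns (pooled odds ratio, standard error of the pooled log OR,
    two-sided P value from a normal approximation).
    """
    weights = [1.0 / se ** 2 for se in ses]
    pooled_log_or = sum(w * b for w, b in zip(weights, log_ors)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    z = pooled_log_or / pooled_se
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return math.exp(pooled_log_or), pooled_se, p_value


# Hypothetical example: two groups with the same effect and precision.
pooled_or, pooled_se, p_value = fixed_effect_meta(
    [math.log(1.2), math.log(1.2)], [0.1, 0.1]
)
```

Groups with smaller standard errors receive larger weights, so the pooled estimate is dominated by the largest ethnic groups, while small groups contribute in proportion to their precision.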
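The two Bonferroni thresholds quoted above follow directly from dividing the nominal significance level by the number of variants tested in each phase, as this short check illustrates (the function name is hypothetical).

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under Bonferroni's correction."""
    return alpha / n_tests


# Discovery phase: 2,767 recessive variants tested.
discovery_threshold = bonferroni_threshold(0.05, 2767)
# Replication phase: 7 variants carried forward.
replication_threshold = bonferroni_threshold(0.05, 7)
```

The resulting values are approximately 1.8 × 10⁻⁵ and 0.007, matching the thresholds stated in the text.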

Ancestry Determination and Ethnic-Specific Differences
After quality control procedures, our analyses were restricted to eight distinct ethnic groups, namely African-Caribbeans from the Dominican Republic (ACD), Ashkenazi-Jewish Europeans (AJE), European-Caribbeans (ECD), French-Canadians (FCN), Finnish Europeans (FIN), North-Western Europeans (NWE), South-Eastern Europeans (SEE), and Yoruba Africans (YRI). Ethnic groups showed significant differences (P < 0.00001) in mean age, EDU and APOE allele frequency, independently of diagnostic status (Figure 3). Both consanguinity rates and consanguinity degree differed significantly (P < 0.00001) across the eight ethnic groups analyzed, with percentages of consanguineous subjects ranging from 1.2% in YRI to 31.7% in ECD (Table 1). As expected, consanguineous subjects displayed higher FROH estimates (0.024 ± 0.025 vs. 0.003 ± 0.002) and more ROHs (NROH, 9.4 ± 6.3 vs. 4.4 ± 2.5) compared to outbred subjects (P < 0.00001, Table 2). When considering exclusively the outbred population, both FROH estimates and NROH differed significantly across the eight ethnic groups analyzed (P < 0.00001, Figure 3), with ACD and ECD showing the lowest FROH (0.0007 ± 0.0012) and NROH (0.4 ± 0.8), respectively. Conversely, the highest FROH and NROH were found in FIN (0.0053 ± 0.0038) and FCN (8.4 ± 2.9), respectively (Figure 3).
No significant differences were found for mean age between inbred and outbred groups. Consanguinity rates reported for the eight analyzed ethnic groups largely fall within the ranges reported in the literature (Goldschmidt et al., 1960; Gazal et al., 2015; Vardarajan et al., 2015).
The distribution of APOE genotypes and alleles did not show deviations from HWE (Supplementary Table S14). In addition, APOE genotypes and allele counts differed significantly between outbred and consanguineous subjects only for AJE controls, FCN cases and FIN controls (Supplementary Table S14).
When testing the association of degree of consanguinity with LOAD by separating close (first cousin/double-first cousin/avuncular offspring) from distant (second cousin offspring) consanguinity, it became clear that the association was driven by close consanguinity (close: N = 19,227, OR = 1.713, P = 0.002; distant: N = 21,284, OR = 1.207, P = 0.007, Table 3). The association reported for close and distant consanguinity was independent of APOE*4 and EDU (Table 3). When considering the analyses carried out in the smaller EDU subset, the inclusion of APOE*4 does not reduce statistical estimates of the associations (Table 3). Conversely, the inclusion of EDU as a variable slightly decreases all the associations reported in MODEL3 compared to MODEL2, such that the association of distant consanguinity with LOAD in MODEL3 trends in the same direction but is no longer statistically significant (OR = 1.170, P = 0.158, Table 3).
Table 4 reports the results obtained from the meta-analysis of the association of genome-wide autozygosity, determined both by FROH estimates and by NROH, across the eight ethnic groups. When considering the full dataset, both FROH and NROH are significantly associated with LOAD, independently of APOE*4 (Table 4). However, when testing the association of FROH and NROH with LOAD in the subset with information on EDU, the meta-analysis results are not statistically significant (Table 4), likely reflecting a lack of power rather than an effect of education, given that MODELS 1 and 2 are also no longer significant with these sample sizes.

LOAD Genome Is Not Enriched in Rare Recessive Damaging Variants
Given the consistent association of LOAD with consanguinity and autozygosity, we leveraged ADSP WES data to establish whether LOAD subjects showed an enrichment of damaging recessive variants compared to the control population. After merging GWAS imputed data from NWE, SEE, AJE, FIN, and FCN groups with ADSP WES data, we determined that 4,969 subjects were overlapping between the two datasets (AJE = 287; FIN = 47; NWE = 4,424; SEE = 211). Previous studies have shown that long ROHs are enriched for damaging homozygous variants, with the majority having a MAF ≤ 5% (Pemberton et al., 2012; Pemberton and Szpiech, 2018). Thus, we determined the number of rare, deleterious minor homozygote variants (RMHV) for each GWAS subject that was also whole-exome sequenced through ADSP. The four ethnic groups differed significantly in the average individual number of RMHV in their respective outbred population (N = 4,753, P = 0.0002), independently of diagnostic status (AJE: 18.97 ± 0.31; FIN: 16.65 ± 0.74; NWE: 14.88 ± 0.08; SEE: 18.16 ± 0.38). As expected, consanguineous subjects displayed a significantly higher average individual number of RMHV compared to the outbred population, independently of their ethnicity and diagnostic status (23.31 ± 0.37 vs. 15.26 ± 0.08, P < 0.00001). Notably, the average RMHV in 12 subjects carrying a putative UPD was lower than the one reported for the outbred group, and significantly different compared to distant or close consanguineous subjects (UPD: 14.46 ± 1.43). When testing the burden of RMHV in LOAD vs. controls, no significant association was detected in the outbred (LOAD: 15.15 ± 0.10; Controls: 15.35 ± 1.14; P = 0.303) or consanguineous (LOAD: 23.97 ± 0.99; Controls: 24.48 ± 1.67; P = 0.805) group.

Identification of RPH3AL p.A303V (rs117190076) as RMHV Associated With LOAD
Despite the lack of association between the burden of RMHV and LOAD, we decided to leverage WES data to perform a two-stage recessive GWAS using the 201 consanguineous subjects identified in ADSP as the discovery phase, followed by validation in the remaining 10,469 ADSP subjects. To this aim, we applied the RAFT statistic (Lim et al., 2014) to the 2,767 RMHV detected in the discovery cohort composed exclusively of consanguineous subjects. Seven RMHV passed the Bonferroni-corrected significance threshold of P < 1.8 × 10⁻⁵ (Table 5). When applying the RAFT statistic to the seven variants in the validation group, only the RPH3AL missense variant (rs117190076, NP_001177340 p.A303V) successfully replicated (Genotype Relative Risk = 1.9, P = 8.0 × 10⁻⁴). However, we could not validate one of the seven variants passing the statistical threshold in the discovery phase (SCAPER on chr15q24.3, rs200719909, NP_001339938 p.A280V), because no minor homozygote was detected in the replication/validation phase conducted on outbred subjects (Table 5). Remarkably, 523 out of 2,767 RMHV tested in the discovery group (18.9%) did not have a minor homozygote counterpart in the validation group (Supplementary Table S15). Although those variants did not pass the statistical threshold, set up by applying the Bonferroni's correction in the discovery phase on outbred subjects, they may still have a functional/causal role in LOAD.

Putative Uniparental Disomy Does Not Associate With LOAD
During the inbreeding determination process, 56 subjects (6 ACD, 42 NWE, 8 SEE) were found to be potential cases of UPD (Figure 4). The presence of UPD did not show a significant association with an increased risk for LOAD in a logistic regression testing the presence of putative isodisomy compared to the rest of ACD, NWE, and SEE outbred populations (OR = 1.561, P = 0.158). The origin of UPD in our subjects is unknown due to the lack of genotype data from their parents. Nine putative UPD subjects had first-degree relatives genotyped, and in these nine cases none of the first-degree relatives showed any evidence of consanguinity or shared long ROHs, suggesting the presence of true isodisomy. Moreover, one of the 12 UPD subjects with ADSP WES data, showing a UPD on chromosome 9p (9p-UPD), was homozygous for the somatic JAK2 V617F (rs77375493) mutation. Since the co-occurrence of the JAK2 V617F mutation and 9p-UPD is very common in hematological malignancies (Wang et al., 2016), a somatic (hematologic) origin for some of the reported UPD is highly conceivable, especially since 9p-UPD is the most common UPD among our UPD subjects (8/56, 14%, Figure 4).

DISCUSSION
Our results clearly demonstrate the effect of recent consanguinity and outbred autozygosity in increasing the risk of LOAD consistently across the eight ethnic groups analyzed, independently of APOE*4 and EDU. Several important features separate our work from previous studies looking at the impact of consanguinity and autozygosity on LOAD. First, and most critically, this is the largest such study to date, combining 11,196 cases and 10,296 controls across eight ethnically distinct populations. Second, we went beyond standard super-population definitions of ethnicity and determined European sub-ancestry (AJE, FCN, FIN, NWE, and SEE), since it has previously been established that these ethnicities have different inbreeding rates (Pemberton et al., 2012; Kang et al., 2016), or are characterized by founder effects (AJE, FCN, FIN) (Jakkula et al., 2008; Roy-Gagnon et al., 2011). Third, rather than examining autozygosity across all subjects, we enriched our sample by identifying a consanguineous subset and analyzing them separately from the respective outbred population. This step allowed us to estimate the risk for LOAD attributable to the mating types of consanguineous subjects (first cousin/double-first cousin/avuncular vs. second cousin offspring), thereby providing a measure applicable to the clinical setting. Fourth, we leveraged the large amount of WES data from ADSP to determine the contribution of rare recessive damaging variants in LOAD. Lastly, we provided, for the first time, an estimate of the impact of putative isodisomy on LOAD; these subjects were also removed from our analysis of inbred subjects, thereby eliminating a source of noise, since these subjects are wrongly identified as consanguineous using standard measures. The overall results suggest the existence of inbreeding depression, which is a recognized phenomenon that is common to polygenic traits in all living organisms (Joshi et al., 2015).
Inbreeding depression is thought to result from increased homozygosity of multiple recessive alleles that act in the same direction of effect at loci that influence the phenotype of interest ("directional dominance") (Joshi et al., 2015). In a consanguineous individual, inbreeding depression is predicted to affect many polygenic endophenotypes which can be established risk factors for late-onset diseases, such as blood pressure, body mass index, cholesterol levels, glucose levels, and bone mineral density (Rudan and Campbell, 2004). The previous negative association of educational attainment and general cognitive abilities with genome-wide autozygosity (Joshi et al., 2015) suggests involvement of directional dominance at these two endophenotypes in increasing the risk for LOAD. Indeed, it has been widely reported that lower education is associated with a greater risk for dementia (Sharp and Gatz, 2011), while lower general cognitive abilities have been linked to an increased risk of dementia according to the cognitive reserve theory (Schmand et al., 1997). However, the present results show that the association of consanguinity with LOAD is independent of educational attainment. This evidence leads us to speculate on the involvement of other polygenic endophenotypes mediating the association of consanguinity with LOAD or on the direct effect of recessive loci in LOAD, yet to be discovered. In this context, we can mention that a recent study (Andrews et al., 2021), leveraging Polygenic Risk Score/Mendelian Randomization analyses on a large sample (N = 26,431 LOAD cases/controls tested for 22 LOAD risk factors/clinical biomarkers), strongly supported a causal role for blood pressure and cholesterol levels with LOAD phenome. Thus, it may be conceivable that directional dominance acting on blood pressure and cholesterol levels may be contributing to the association reported here. 
Future studies targeting a larger subset of consanguineous subjects, phenotypically characterized in a more homogeneous way, ideally including clinically relevant biomarkers such as blood pressure and cholesterol levels, will allow us to better determine the impact of directional dominance at those endophenotypes in LOAD.
Notably, despite significant differences in consanguinity rates, autozygosity level, mean age, mean EDU, and APOE frequencies, each of the ethnic groups individually showed significant association (or a non-significant trend in the same direction) for the association of close consanguinity or autozygosity with an increased risk of LOAD. Our results in ACD, ECD, and YRI are reassuringly and not surprisingly, in line with the previous studies (Ghani et al., 2013(Ghani et al., , 2015 since we used overlapping datasets. However, the results in the European groups (AJE, FCN, FIN, NWE, and SEE) are new and highlight interesting differences across the five ethnicities.
Previous studies carried out on Europeans of British/Irish descent (Nalls et al., 2009a;Sims et al., 2011) reported inconsistent results on the role of ROHs in Caucasians. However, neither study had sufficient power to detect significant results given the small sample size (N < 3,000). Indeed, given the small variation in genome wide F ROH in unselected samples (standard deviation in our analyses are on the order of 0.001), large sample sizes (e.g., >12,000) are necessary to detect inbreeding depression given the relatively small effect sizes in samples not selected for recent inbreeding (Keller et al., 2012). We also leveraged WES data from ADSP to determine the contribution of rare recessive variants in LOAD. The lack of association between the global burden of rare recessive variants and LOAD suggests either the involvement of increased homozygosity at common loci or the existence of specific recessive loci driving the association of consanguinity with LOAD. The two-stage recessive-GWAS we carried out using ADSP WES data showed the association of RPH3AL p.A303V (rs117190076) with LOAD. The RPH3AL gene (also known as NOC2), located on 17p13.3 (OMIM * 604881), encodes for the Rabphilin 3A-like (without C2 domains) protein which plays an essential role in endocrine and exocrine cells, ranging from the accumulation of secretory granules of increased size to impairments in the regulated release of their secretory products (Cheviet et al., 2004). In particular, RPH3AL has been shown to be a crucial effector for RAB3A and RAB27A in the regulation of secretory vesicle exocytosis (Fukuda et al., 2004). 
The dysregulation of RAB3A and RAB27A has already been linked to Alzheimer's and other neurodegenerative disorders (Davidsson et al., 2001;Ginsberg et al., 2011;Bereczki et al., 2016;Iguchi et al., 2016), while the ancestral RPH3A (Rabphilin 3A) gene (Craxton, 2010) was found to influence dementia severity, cholinergic deafferentation, and increased β-amyloid concentrations in postmortem neocortex of Alzheimer's disease patients (Tan et al., 2014). Moreover, other rs117190076unlinked variants at the RPH3AL locus have been associated with LOAD-related phenotypes, such as Alzheimer's age-atonset in PSEN1 E280A carriers (rs4341804, P = 7.10 × 10 −13 ) (Vélez et al., 2013) and cognitive performance scores in electronic health records (rs74192827, P = 5.02 × 10 −7 ) (McCoy et al., 2018). Thus, the overall evidence suggests a functional role of the RPH3AL locus in LOAD that clearly warrants further investigations.
Interestingly, our work highlighted the presence of potential UPD carriers in the population studied, with prevalence estimates of 0.25% in NWE, 0.74% in SEE, and 1.51% in ACD, showing a trend toward a significant association with increased risk of LOAD. Current estimates of UPD in the general population suggest a prevalence of 0.05% (1 in 2,000 births) (Nakka et al., 2019), lower than the estimates we reported. One explanation for this discrepancy could be that we were not able to determine whether the long, unique ROHs used to define the presence of UPD may turn out to be true long deleted genomic regions, of somatic or germline origin. Indeed, we did not have access to raw SNP-array intensity data from most of the cohorts included in our study, which may have led to an underestimation of large deletions and a consequent increase in the number of subjects classified as carrying a putative UPD. Moreover, our studied sample is mostly representative of the elderly population, where age-related somatic events, like Clonal Hematopoiesis of Indeterminate Potential (CHIP), already linked to cardiovascular disease (Jaiswal et al., 2017), may result in large somatic genomic aberrations, such as "pseudo" 9p-UPD (Wang et al., 2016), that can be misinterpreted as germline UPD. A deeper analysis of these phenomena is clearly warranted, since it may offer important insights into the missing heritability of several age-related diseases. In this context, it is remarkable that functional mutations of the TET2 gene, a main driver of CHIP (Jaiswal et al., 2017), have recently been found to be associated with multiple neurodegenerative disorders, including LOAD (Cochran et al., 2020). One important limitation is that the ethnic stratification, especially for the European groups, led to very small numbers of inbred subjects for some of the ancestral groups [e.g., FCN, N = 53; FIN, N = 31 (Supplementary Tables S6-S13)].
Nonetheless, the meta-analytic approach used can greatly mitigate potential biases due to the inclusion of small samples, while providing a better sense of the ethnic-related differences in consanguinity prevalence. Similarly, considering the important contribution of aging-related somatic genomic events such as CHIP, the heterogeneous nature of the specimens used for SNP-array genotyping across the different cohorts and samples (e.g., whole blood vs. post-mortem brain tissue) may have led to uncontrolled biases when determining the impact of true ROHs vs. large genomic deletions of somatic origin.
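To illustrate why a fixed-effect meta-analysis limits the influence of small ancestral groups, the following minimal sketch pools per-group log odds ratios. It assumes the standard inverse-variance weighting, which this section does not spell out, and all effect sizes and standard errors in the example are hypothetical, not values from the study. The function name `fixed_effect_meta` is likewise introduced here for illustration only.

```python
import math


def fixed_effect_meta(log_ors, std_errs):
    """Pool per-group log odds ratios with inverse-variance weights.

    Assumption: standard inverse-variance fixed-effect pooling; each
    group's weight is 1 / SE^2, so imprecise (small) groups count less.
    """
    weights = [1.0 / se ** 2 for se in std_errs]
    total = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, log_ors)) / total
    pooled_se = math.sqrt(1.0 / total)
    z = pooled / pooled_se
    # Two-sided p-value under a standard normal approximation.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return {
        "or": math.exp(pooled),
        "se": pooled_se,
        "z": z,
        "p": p,
        "rel_weights": [w / total for w in weights],
    }


# Hypothetical inputs: two larger groups and one very small group
# whose estimate points the other way but is highly uncertain.
example = fixed_effect_meta(
    log_ors=[math.log(1.25), math.log(1.30), math.log(0.80)],
    std_errs=[0.08, 0.15, 0.60],
)
print(example["or"], example["rel_weights"])
```

In this made-up example the small group, despite its discordant estimate, receives under 2% of the total weight, so the pooled odds ratio stays close to the estimates from the larger groups.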
Overall, these results provide substantial evidence that consanguinity increases risk for LOAD. One might anticipate a change in the genetic architecture of LOAD in the coming decades, when more recent cohorts composed of subjects born after World War II are analyzed. Panmixia and larger effective population sizes have resulted in decreasing autozygosity as the chronological age of a population decreases (Nalls et al., 2009b). Consistent with this pattern, mounting evidence suggests that dementia incidence rates are decreasing (Satizabal et al., 2016; Derby et al., 2017; Noble et al., 2017). Subsequent work with increased sample sizes of consanguineous subjects should accelerate the discovery of non-additive genetic effects in LOAD.

ALZHEIMER'S DISEASE NEUROIMAGING INITIATIVE
Data used in preparation of this article were obtained from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database (adni.loni.usc.edu). As such, the investigators within the ADNI contributed to the design and implementation of ADNI and/or provided data but did not participate in analysis or writing of this report. A complete listing of ADNI investigators can be found at: http://adni.loni.usc.edu/wp-content/uploads/how_to_apply/ADNI_Acknowledgement_List.pdf.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/Supplementary Material; further inquiries can be directed to the corresponding author/s.

ETHICS STATEMENT
This was a re-analysis of de-identified data available from shared data repositories. The study protocol was granted an exemption by the Stanford Institutional Review Board because the analyses were carried out on "de-identified, off-the-shelf" data. The patients/participants provided their written informed consent to participate in this study.

AUTHOR CONTRIBUTIONS
VN: conceptualization, data curation, formal analysis, investigation, methodology, and writing - original draft preparation. MS: formal analysis, validation, and writing - review and editing. RK: formal analysis, resources, and writing - review and editing. AA: writing - review and editing. MG: funding acquisition, supervision, and writing - review and editing. All authors contributed to the article and approved the submitted version.