A Comprehensive Analysis of Population Differences in LRRK2 Variant Distribution in Parkinson's Disease

Background: LRRK2 variants have been demonstrated to have distinct distributions in different populations. However, researchers have thus far chosen to focus on relatively few variants, such as R1628P, G2019S, and G2385R. We therefore investigated the relationship between common LRRK2 variants and PD risk in various populations. Methods: Using a set of strict inclusion criteria, six databases were searched, resulting in the selection of 94 articles covering 49,299 cases and 47,319 controls for final pooled analysis and frequency analysis. Subgroup analyses were performed for Africans, European/West Asians, Hispanics, East Asians, and mixed populations. Statistical analysis was carried out using the Mantel-Haenszel approach to determine the relationship between common LRRK2 variants and PD risk, with the significance level set at p < 0.05. Results: In the absence of obvious heterogeneity and publication bias among the included studies, we concluded that A419V, R1441C/G/H, R1628P, G2019S, and G2385R were associated with increased PD risk (p: 0.001, 0.0004, < 0.00001, < 0.00001, and < 0.00001, respectively), while R1398H was associated with decreased risk (p: < 0.00001). In East Asian populations, A419V, R1628P, and G2385R increased risk (p: 0.001, < 0.00001, < 0.00001), while R1398H had the opposite effect (p: 0.0005). G2019S increased PD risk in both European/West Asian and mixed populations (p: < 0.00001, < 0.00001), while R1441C/G/H increased risk in European/West Asian populations only (p: 0.0004). Conclusions: We demonstrated that LRRK2 variant distribution differs among populations, which should inform decisions regarding the development of future genetic screening strategies.


INTRODUCTION
Parkinson's disease (PD) is one of the most common neurodegenerative diseases, affecting ∼2% of people over the age of 60 years, and a major movement disorder characterized by bradykinesia, resting tremor, rigidity, and postural instability or gait difficulty. Non-motor symptoms, such as depression, olfactory dysfunction, and constipation, are also common in PD (Schapira, 2006). The pathological hallmark of PD is Lewy body aggregation in neurons and the loss of dopaminergic neurons in the substantia nigra pars compacta and corpus striatum.
The pathogenesis of PD is as yet unclear, but genetic and environmental factors, as well as aging, are thought to contribute to PD risk. Since the discovery of the PARK1 locus, further autosomal-dominant or -recessive disease genes have been identified (Paisan-Ruiz, 2009). The most common among the former is the leucine-rich repeat kinase 2 (LRRK2) gene, which encodes a protein containing armadillo (ARM), Ras of complex proteins (ROC), C-terminal of ROC (COR), mitogen-activated protein kinase kinase kinase (MAPKKK), and WD40 domains, in addition to others (Kruger, 2008).
To date, nearly a hundred LRRK2 variants have been identified. Of these, G2019S, R1628P, and G2385R have traditionally received much of the attention, and there have already been a number of meta-analyses on the role of these variants in PD risk in people of different ethnicities (Xie et al., 2014; Liu et al., 2016; Zhang et al., 2016; Zhao and Kong, 2016). These variants each possess distinct geographical distributions. G2019S, which is the most frequently occurring variant, accounts for 3-6% of familial PD cases among patients in European populations and nearly 14% in Ashkenazi Jews (AJ) (Luzon-Toro et al., 2007), whereas G2385R and R1628P were found to be more common among East Asian PD patients (Fu et al., 2013).
If strict selection criteria and quality control methods are employed, meta-analysis can be an immensely powerful tool, allowing researchers to pool data from original studies and effectively expand the sample size (Haines et al., 2008) to provide unbiased evidence with far-reaching clinical implications (Wolf, 2015). In the case of genetic analyses in particular, where original studies are inevitably limited by the diversity of their subject pool and by their sample sizes, a meta-analytical approach can be invaluable in providing convincing evidence for the effects of a specific gene on disease risk. In recent articles, researchers have begun to shift their focus to the other, less well-known LRRK2 variants. Given that others have not yet done so, we decided in this study to perform a complete analysis of all relevant original association studies relating to the LRRK2 variants that have been identified thus far.

Selection Criteria
The PICOS (participants, interventions, controls, outcomes, and study types) approach was used to define inclusion criteria. For this study, "participants" were patients diagnosed with PD according to accepted standards, such as the UK PD Society Brain Bank Clinical Diagnostic Criteria (Hughes et al., 1992) and other widely accepted criteria (Calne et al., 1992; Bower et al., 1999; Gelb et al., 1999); "interventions" consisted of gDNA analysis performed using accepted PCR-based methods; "controls" were people without PD or related diseases (movement disorders, neurodegenerative diseases, etc.); "outcomes" were complete data (the complete number of patients or controls carrying either homozygous or heterozygous polymorphisms of LRRK2), with at least four original articles reporting on the same variant; and "study types" consisted of original case-control studies and cohort studies (Supplementary Table 1). Where the original articles did not describe their diagnostic criteria, we accepted the authors' definition of PD patients and controls.

Data Extraction
Complete data, including the name of the first author, the publication year of the study, subject ethnicity, the country in which the study was performed, the gene and gene variants analyzed, the number of cases and controls, and subject genotypes (both homozygotes and heterozygotes), were extracted from all selected original studies and are detailed in Table 1 and Supplementary Table 2. Pooled analysis was performed in cases where sufficient data were provided to allow for the calculation of odds ratios (OR) and 95% confidence intervals (CI). If studies had enough data to calculate the frequency of variants, we included the articles in the frequency analysis. The Newcastle-Ottawa Scale (NOS) was used to perform quality control on all included studies. Data extraction was performed by Li S and Yuan Z, in consultation with Qiying S. Results relating to R1628P were provided by ZY, a co-author on this study, and were based on Zhang et al. (2017).
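The per-study odds ratio and 95% confidence interval referred to above can be illustrated with a short calculation. The following is a minimal sketch, assuming the Woolf (log) method and a 0.5 continuity correction for zero cells; neither choice is stated in the text, and the function name and carrier counts are hypothetical.

```python
import math

def odds_ratio_ci(case_carriers, case_noncarriers,
                  control_carriers, control_noncarriers, z=1.96):
    """Return (OR, lower, upper) for one 2x2 table using the Woolf log method."""
    a, b, c, d = (case_carriers, case_noncarriers,
                  control_carriers, control_noncarriers)
    if min(a, b, c, d) == 0:
        # Assumption: add 0.5 to every cell when any cell is empty.
        a, b, c, d = a + 0.5, b + 0.5, c + 0.5, d + 0.5
    odds_ratio = (a * d) / (b * c)
    # Standard error of ln(OR)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper

# Hypothetical study: 30 of 1,000 cases and 15 of 1,000 controls carry the variant.
or_, lo, hi = odds_ratio_ci(30, 970, 15, 985)
print(f"OR = {or_:.2f}, 95% CI {lo:.2f}-{hi:.2f}")
```

A study contributes to the pooled analysis only if all four cell counts can be recovered from the article, which is why complete genotype data were required.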

Statistical Analysis
Statistical analysis was performed using RevMan 5.3 software. Pooled analyses were conducted only for variants reported in at least four original studies. Meta-analyses were conducted on the total population, with subgroup analyses by ethnicity (Africans, European/West Asians, Hispanics, East Asians, and mixed: composed of at least two different groups) (Risch et al., 2002; Zhang et al., 2018). Where the Q statistic P > 0.1 and the I² statistic ≤ 50%, a fixed-effect model was used; otherwise, a random-effects model was applied for the pooled analysis. All pooled results were graphed using forest plots, and publication bias was assessed using funnel plots. Subgroup analysis using the Mantel-Haenszel statistical method was performed to determine how the common LRRK2 variants affect PD risk, and the level of significance was set at p < 0.05. Sensitivity analysis was performed by sequentially deleting each included article and observing how the pooled OR and 95% CI were affected by its removal. Genotype frequency (GF) and minor allele frequency (MAF) were calculated for each LRRK2 variant in our analyses.
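The pooling and heterogeneity steps described above can be sketched in a few lines. This is an illustrative reimplementation under stated assumptions, not the RevMan code: the function names and study counts are hypothetical, no cell is assumed to be zero, and Cochran's Q is computed around the inverse-variance pooled log OR, which may differ slightly from RevMan's own calculation. The P value of Q (from a chi-square distribution with k - 1 degrees of freedom) is omitted to keep the sketch free of external libraries.

```python
import math

def mantel_haenszel_or(tables):
    """Mantel-Haenszel pooled OR.

    tables: list of (a, b, c, d) = case carriers, case non-carriers,
    control carriers, control non-carriers, one tuple per study.
    """
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return numerator / denominator

def heterogeneity(tables):
    """Return Cochran's Q and I^2 (in percent) on the log-OR scale."""
    log_ors = [math.log(a * d / (b * c)) for a, b, c, d in tables]
    # Inverse-variance weights: 1 / Var(ln OR)
    weights = [1 / (1 / a + 1 / b + 1 / c + 1 / d) for a, b, c, d in tables]
    pooled = sum(w * y for w, y in zip(weights, log_ors)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, log_ors))
    df = len(tables) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, i2

# Hypothetical carrier counts from three studies of a single variant.
studies = [(30, 970, 15, 985), (12, 488, 5, 495), (45, 1455, 20, 1480)]
pooled_or = mantel_haenszel_or(studies)
q, i2 = heterogeneity(studies)
print(f"MH pooled OR = {pooled_or:.2f}, Q = {q:.2f}, I2 = {i2:.1f}%")
```

Under the rule stated above, an I² of at most 50% together with a Q-test P above 0.1 would keep the fixed-effect model; otherwise a random-effects model would be used.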

RESULTS
A total of 4,439 articles were identified in our initial database search (Figure 1). After duplicate articles had been eliminated, 2,875 articles remained, and a further 2,618 articles were excluded after we had manually reviewed their titles and abstracts. Among the 257 articles that received a full-text review, 163 were excluded because they lacked controls, were functional studies, were not original studies, had incomplete genotyping data, were pedigree analyses, or studied variants reported in fewer than four articles. Eventually, 94 relevant articles were included in the final analysis, covering 49,299 cases and 47,319 control subjects (Table 1). As shown in Table 1, all included articles were of high quality. Subgroup analysis was performed for each of the major ethnic groups (Africans, European/West Asians, Hispanics, East Asians, and mixed: composed of at least two different groups). Results of the pooled analysis were graphed using forest plots (Supplementary Figure 1), and variant frequencies in PD patients of different ethnicities were further calculated (Figure 2). All analyses were performed using a fixed-effect model because there was relatively little heterogeneity among the included studies.

Comprehensive Analysis of LRRK2 Variants in Different Ethnicities
Based on our comprehensive analysis of the common LRRK2 variants, we concluded that A419V, R1441C/G/H, R1628P, G2019S, and G2385R were associated with greater PD risk (p: 0.001, 0.0004, <0.00001, <0.00001, and <0.00001, respectively), while R1398H was associated with decreased PD risk (p: <0.00001; Figure 2). Of the high-risk variants, G2019S posed the greatest degree of risk, followed by R1441C/G/H, A419V, G2385R, and R1628P in descending order, as demonstrated by their OR values, which ranged from 13.16 to 1.83 (Table 2). By ethnicity, our analysis indicated that A419V, R1628P, and G2385R were associated with increased PD risk in East Asian populations (p: 0.001, <0.00001, and <0.00001), while R1398H had the opposite effect (p: 0.0005). The G2019S variant was found to increase PD risk in European/West Asian and mixed populations (p: <0.00001 and <0.00001), while R1441C/G/H increased risk for European/West Asians only (p: 0.0004; Table 2).

LRRK2 Variant Frequency in PD Patients and Control Individuals of Different Ethnicities
Among the LRRK2 variants that reached statistical significance in the pooled analyses, the most frequently occurring in PD patients were, in descending order, R1398H, G2385R, R1628P, and A419V, which had MAFs ranging from 0.094 to 0.012 in East Asian populations (Figure 2; Supplementary Table 3). In European/West Asians, the MAFs of the high-risk variants G2019S and R1441C/G/H were 0.013 and 0.020, respectively (Figure 2; Supplementary Table 3), and G2019S occurred within mixed populations at a total frequency of 0.013. Further, we found that A419V, R1628P, and G2385R appeared to be specific to Asian populations, while R1441C/G/H was specific to European/West Asians. Even though the total genotype frequencies of other variants, such as S1647T and P755L, were higher than those of G2385R or G2019S, no significant difference in their distribution between cases and control subjects was apparent.
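The frequencies reported above follow from simple genotype counts. The sketch below assumes that genotype frequency means the proportion of subjects carrying the variant (heterozygous or homozygous) and that MAF is the proportion of variant alleles among all alleles; the function name and the counts are hypothetical and are not taken from the included studies.

```python
def variant_frequencies(n_homozygous, n_heterozygous, n_subjects):
    """Return (genotype frequency, minor allele frequency) for one variant.

    Assumes a biallelic autosomal variant, so each subject has two alleles.
    """
    carriers = n_homozygous + n_heterozygous
    genotype_frequency = carriers / n_subjects
    # Homozygotes contribute two variant alleles, heterozygotes one.
    minor_allele_frequency = (2 * n_homozygous + n_heterozygous) / (2 * n_subjects)
    return genotype_frequency, minor_allele_frequency

# Hypothetical cohort: 2 homozygous and 96 heterozygous carriers among 1,000 patients.
gf, maf = variant_frequencies(2, 96, 1000)
print(f"GF = {gf:.3f}, MAF = {maf:.3f}")
```

Because most carriers of these variants are heterozygous, the MAF is typically close to half the genotype frequency.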

Sensitivity Analysis and Publication Bias
After sequentially deleting each included article, the pooled OR and 95% CI of each variant did not change significantly, and the pooled results of each overall and subgroup analysis remained stable. No obvious publication bias was apparent from the funnel plots of the relevant variants (Supplementary Figure 2).
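The leave-one-out procedure described above can be sketched as follows. This is a minimal illustration that recomputes a Mantel-Haenszel pooled OR with each study removed in turn; the function names and study counts are hypothetical, and the confidence interval step is omitted for brevity.

```python
def mantel_haenszel_or(tables):
    """Mantel-Haenszel pooled OR over (a, b, c, d) tables:
    case carriers, case non-carriers, control carriers, control non-carriers."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return numerator / denominator

def leave_one_out(tables):
    """Return the pooled OR obtained after omitting each study in turn."""
    return [mantel_haenszel_or(tables[:i] + tables[i + 1:])
            for i in range(len(tables))]

# Hypothetical carrier counts from four studies of one variant.
studies = [(30, 970, 15, 985), (12, 488, 5, 495),
           (45, 1455, 20, 1480), (8, 292, 4, 296)]
overall = mantel_haenszel_or(studies)
for index, pooled in enumerate(leave_one_out(studies), start=1):
    print(f"without study {index}: OR = {pooled:.2f} (all studies: {overall:.2f})")
```

A result is considered stable when none of the leave-one-out estimates departs materially from the overall pooled OR.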

DISCUSSION
The current meta-analysis and systematic review is, as far as we are aware, the most comprehensive analysis of common LRRK2 variants in PD to date, and revealed population heterogeneity to be a prominent factor in LRRK2 allelic distribution.
Previous studies of ethnicity-specific LRRK2 variation have focused primarily on the common variants G2019S, G2385R, and R1628P (Xie et al., 2014; Liu et al., 2016; Zhang et al., 2016; Zhao and Kong, 2016), showing that G2019S was more common in European and North American populations while G2385R and R1628P existed only in Asian populations. Our study replicated these results, and further demonstrated the importance of other variants, such as P755L, A419V, and R1398H. We demonstrated that G2019S, R1441C/G/H, A419V, G2385R, and R1628P, in descending order, carried the highest overall degrees of PD risk, as indicated by their ORs, which ranged from 13.16 to 1.83.
In East Asian populations, G2385R, R1628P, and A419V (arranged in descending order according to the frequency of their occurrence) were found to increase PD risk. Although previous studies have been quite successful at identifying high-frequency risk variants, our findings serve to highlight the fact that even those that occur at lower frequencies, such as A419V, should not be neglected, particularly in East Asian populations, even if clinically significant data may only be accessed in these cases through the use of larger sample sizes that are possible only with collaborative multi-center projects. We also determined that G2019S can increase PD risk in European/West Asian and mixed populations, while R1441C/G/H increases PD risk in European/West Asians only. The pooled analysis and frequency analysis of variants in LRRK2 supported the differences in geographic distributions of LRRK2 variants.
The 51-exon LRRK2 gene has always posed a challenge for researchers interested in screening for the gene due to its large size. For the sake of improving the efficiency and economy of genetic diagnosis, it is thus of great importance to prioritize the identification of specific variations instead of sequencing the entire gene (Foroud, 2005). Based on calculations of the ORs associated with each of the LRRK2 variants, we suggest that A419V, G2385R, and R1628P, and G2019S and R1441C/G/H should be screened for first in East Asians and European/West Asians, respectively. Variants with high OR and that occur at higher rates, such as G2385R in East Asian populations and G2019S in European/West Asian populations, should be, in particular, prioritized above all others.
Our meta-analysis also revealed additional details of how mutations affect LRRK2 function, which could have ramifications for our understanding of the mechanisms that underlie PD and for the development of future treatment strategies. We determined, for instance, that all of the high-risk variants carried mutations in the exon regions of the LRRK2 gene, and in the functional domains of the LRRK2 protein (Figure 3). While the specific pathological mechanisms of these mutations are as yet unclear, it is possible that they lead to an increase in the kinase activity of the protein, thereby increasing disease risk as well (West et al., 2005), as in the case of the G2019S "gain of function" feature (Luzon-Toro et al., 2007). If this is demonstrated to be true on a more general basis, it may be promising to target the variations in these vital regions for therapeutic purposes.
Certain LRRK2 variants appear to be linked with clinical phenotypes. For instance, among PD patients, motor fluctuations were more frequent in G2385R carriers than in non-carriers. G2019S carriers had better olfactory function and were less likely to have depression than G2385R carriers (West et al., 2005). In a large meta-analysis of LRRK2-related clinical features by our group, we found that LRRK2-G2019S-related PD patients were more likely to be female and had higher rates of early-onset PD and family history. Moreover, they tended to have high Schwab & England scores, low Geriatric Depression Scale (GDS) scores, and high University of Pennsylvania Smell Identification Test (UPSIT) scores, and responded well to levodopa. G2385R carriers tended to have a family history, lower Hoehn and Yahr (H-Y) ratings, and higher Mini Mental State Examination (MMSE) scores. However, both G2019S and G2385R carriers were more likely to develop motor complications than non-carriers. Therefore, for the purposes of clinical genetic counseling and testing, the symptoms exhibited by the patient could be useful in guiding the decision of which LRRK2 variant to screen for. Additionally, identifying the specific clinical features associated with carriers of particular LRRK2 variants could also be useful for neurologists in prescribing the correct symptomatic treatments.
There are, of course, limitations to this study. Firstly, although we have endeavored to include every published case-control study in the meta-analysis, certain pieces of unpublished data, or articles that were written in languages other than English or Chinese may have been omitted unintentionally. Secondly, due to the lack of sufficient data, variants in populations other than East Asians and European/West Asians cannot be analyzed, and it is possible that biases may exist in our pooled analysis of different ethnic groups and in their demographic information, such as age and gender. Thirdly, stratified analysis can inadvertently increase the possibility of there being false positives, especially as the sample size is limited.

CONCLUSION
In conclusion, we found that LRRK2 variants A419V, G2019S, R1441C/G/H, G2385R, and R1628P were associated with increased PD risk while R1398H was associated with decreased risk. In East Asian populations, A419V, G2385R, and R1628P increased risk, while R1398H had the opposite effect. G2019S increased the risk in European/West Asian and mixed populations while R1441C/G/H increased the risk of PD in European/West Asians. Combined with frequency analysis, we suggest that A419V, G2385R, and R1628P should receive top priority for screening in East Asian populations and that a greater focus be placed on G2019S and R1441C/G/H in European/West Asian populations.