Clinical and Genetic Analysis of Costa Rican Patients With Parkinson's Disease

Background: Most research in genomics of Parkinson's disease (PD) has been done in subjects of European ancestry, leading to sampling bias and leaving Latin American populations underrepresented. We sought to clinically characterize PD patients of Costa Rican origin and to sequence familial PD and atypical parkinsonism-associated genes in cases and controls. Methods: We enrolled 118 PD patients with 97 unrelated controls. Collected information included demographics, exposure to risk and protective factors, and motor and cognitive assessments. We sequenced coding and untranslated regions in familial PD and atypical parkinsonism-associated genes including GBA, SNCA, VPS35, LRRK2, GCH1, PRKN, PINK1, DJ-1, VPS13C, and ATP13A2. Results: Mean age of PD probands was 62.12 ± 13.51 years; 57.6% were male. The frequency of risk and protective factors averaged ~45%. Physical activity significantly correlated with better motor performance despite years of disease. Increased years of education were significantly associated with better cognitive function, whereas hallucinations, falls, mood disorders, and coffee consumption correlated with worse cognitive performance. We did not identify an association between tested genes and PD or any damaging homozygous or compound heterozygous variants. Rare variants in LRRK2 were nominally associated with PD; six were located between amino acids p.1620 and 1623 in the C-terminal-of-ROC (COR) domain of Lrrk2. Non-synonymous GBA variants (p.T369M, p.N370S, and p.L444P) were identified in three individuals: one patient and two healthy controls. One PD patient carried a pathogenic GCH1 variant, p.K224R. Discussion: This is the first study that describes sociodemographics, risk factors, clinical presentation, and genetics of Costa Rican patients with PD, adding information to genomics research in a Latino population.


INTRODUCTION
Parkinson's disease (PD) is a complex and heterogeneous movement disorder caused by a progressive degeneration of dopaminergic neurons. The main clinical motor symptoms associated with PD include tremor, rigidity, bradykinesia, and postural imbalance (1). Years before motor symptoms are manifested, there can be prodromal non-motor key features that include rapid eye movement (REM) sleep disorders, anosmia, and constipation (2). Cognitive impairment involving executive dysfunction with deficits in planning (3), shifting and sharing of attention (4), and problem solving (5), together with visuospatial dysfunction (6), can also be present from early stages of the disease. PD pathophysiology involves environmental factors as well as genetic variation, which provide insight into its molecular pathogenesis. Among environmental factors that contribute to PD risk are pesticide and herbicide exposure, welding, and well water consumption. There are also protective factors such as smoking, coffee consumption, and performing physical activity that may reduce the risk of developing PD (7).
Since the description of PD-associated mutations in SNCA (8), other genes have been linked to autosomal dominant (AD) forms of familial PD, including LRRK2 and VPS35. In addition, there are clinically and genetically diverse early-onset (EO) autosomal-recessive (AR) forms of PD with associated genes like PRKN, PINK1, and DJ-1 that exhibit phenotypes similar to idiopathic PD, while other associated genes such as VPS13C and ATP13A2 combine atypical features of parkinsonism like dystonia and early cognitive impairment, along with a poor response to levodopa (9). Large-scale genome-wide association studies (GWASs) have identified 90 variants for PD risk across 78 genomic regions, confirming SNCA and GBA as the most important ones (10). Different GBA variants are strong risk factors for PD in both the homozygous and heterozygous states, displaying a phenotype similar to idiopathic PD, yet with a faster rate of progression of cognitive and motor decline (11).
Clinical characterization of PD in Latin American and Hispanic populations has been scarce (12). Likewise, there is a lack of diversity in genomics with an overrepresentation of European-derived individuals, leading to sampling bias and leaving large populations underrepresented (13). Few genetic trials have been conducted in PD individuals from Latin American populations. Studies looking at LRRK2 mutations have shown that their frequency varies across geographic areas and ethnicity groups. For the G2019S mutation in the LRRK2 gene, frequencies range from 0.2 to 0.4% in Peruvian cohorts (14,15), up to 4% in Uruguayans (14) and 5.45% in an Argentinian series (16)(17)(18)(19). Likewise, the R1441G and R1441H mutations in this same gene seem to be uncommon in Latin American populations (0.3-0.8%) (14,18). The LARGE-PD, a research consortium established among several Latin American countries, has been collecting data for what is the largest PD cohort in the region, allowing for large-scale genotyping as well as performing GWAS in these cohorts (20)(21)(22). This initiative aimed to estimate the frequency of LRRK2 mutations in the region and reported varying frequencies of the G2019S and R1441G/C mutations, which strongly correlated with the European admixture of the samples analyzed (15,20).
GBA mutations have also been studied in a few Latin American cohorts, but these studies mainly focused on the mutations most frequently reported in other populations. The observed frequency of these mutations varies across regions, ranging from 0.2% (p.N370S) to 0.7% (p.E326K) in Ecuadorians (23) and up to 5.5% (p.L444P) in Mexican Mestizo and Brazilian cohorts (23)(24)(25)(26)(27). Few studies have sequenced the entire GBA gene in Latin America, showing a frequency similar to those reported in individuals of European descent (4-5%), but lower than frequencies reported in Ashkenazi patients (20%) (28). Moreover, the overall frequency of GBA mutations seems to be consistently higher than that of LRRK2 mutations across different geographic areas, suggesting that GBA could play a more important role in PD genetics for Latin American populations. Velez-Pardo et al. found a mutation that was specific to a Colombian cohort (p.K198E) and present at a much higher frequency (9.9%), highlighting the need to sequence the whole GBA gene rather than focusing only on assessing commonly reported mutations (27).
In this study, we sought to clinically characterize PD patients of Costa Rican origin and to sequence familial PD and atypical parkinsonism-associated genes in Costa Rican PD cases and controls.

Study Subjects
We enrolled 118 consecutive unrelated PD patients (68 males, 50 females) with 97 unrelated controls (28 males, 69 females), matched according to age and gender whenever possible. Thirty-five patients (16.28%) reported having a relative (≤2°) with any sort of movement disorder; of those, 21 (9.77%) had a formal PD diagnosis. All subjects resided in and originated from Costa Rica and were recruited at the Movement Disorders Unit of the Department of Neurology, Hospital San Juan de Dios, Caja Costarricense de Seguro Social. All patients fulfilled Gelb criteria for the clinical diagnosis of PD, while controls had no signs or personal history of any neurodegenerative disease and were mainly the spouses of the PD cases. We preferred the Gelb criteria over the United Kingdom Parkinson's Disease Society Brain Bank (UKPDSBB) criteria, as they provide different levels of clinical diagnostic certainty (possible and probable) and have been shown to have similar positive and negative predictive values, as well as sensitivity and global accuracy, when compared to UKPDSBB (29). Although both sets of diagnostic criteria have low specificity and are mainly focused on motor features, UKPDSBB criteria further err by challenging PD diagnosis in the presence of genetic risk factors (30). Our last patient was enrolled by 2011, which is 4 years before the Movement Disorder Society (MDS) task force proposed the new clinical diagnostic criteria for PD (MDS-PD criteria) (31); therefore, we were not able to use those for clinical diagnosis of patients enrolled in our study.
We gathered information concerning work and educational status as well as history of exposure to risk and protective factors of PD. We further obtained detailed information on PD history, comorbidities, and antiparkinsonian treatments. Additionally, motor disability of the patients was evaluated by means of the

Genetic Analysis
Molecular inversion probes were used to sequence coding and untranslated regions in familial PD and atypical parkinsonism-associated genes including GBA, SNCA, VPS35, LRRK2, GCH1, PRKN, PINK1, DJ-1, VPS13C, and ATP13A2 at McGill University with Illumina HiSeq 4000 as previously described (32). The full protocol can be found at https://github.com/gan-orlab/MIP_protocol. All sequences were aligned using Burrows-Wheeler Aligner (BWA) with the reference genome hg19 (33). Genome Analysis Toolkit (GATK v3.8) was used to call variants and perform quality control, and ANNOVAR was used to annotate each variant (34,35). Exons 10 and 11 of GBA were sequenced using Sanger sequencing as previously described (36), and GBA variants in other exons were also validated using Sanger sequencing. We decided to focus on genes that are involved in typical PD, as our selected cohort is of typical PD (10,37,38).

Quality Control
All samples and variants were filtered based on standard quality control process as previously reported (39). In brief, variants were separated into common and rare by minor allele frequency (MAF) in the cohort. Rare variants (MAF < 0.01) with a minimum depth of coverage of >30× were included in the analysis, along with common variants (MAF ≥ 0.01) with >15× coverage. We have established that for common variants, we get reliable reads at 15×; however, to get reliable reads for rare variants, we need >30×; otherwise, there are many false positives (40). Variant calls with a genotype frequency of <25% of the reads or genotype quality of <30 were excluded. Samples and variants with more than 10% missingness were also excluded.
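The thresholds above can be sketched as a small filter. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the `Call` record, its field names, and the helper functions are hypothetical, and "genotype frequency of <25% of the reads" is interpreted here as the fraction of reads supporting the alternate allele in a non-reference call.

```python
from dataclasses import dataclass

# Thresholds taken from the text above.
RARE_MAF = 0.01            # MAF below this value is treated as rare
MIN_DEPTH_RARE = 30        # rare variants need depth > 30x
MIN_DEPTH_COMMON = 15      # common variants need depth > 15x
MIN_ALT_FRACTION = 0.25    # assumed meaning of "genotype frequency of <25% of the reads"
MIN_GQ = 30                # minimum genotype quality
MAX_MISSINGNESS = 0.10     # samples/variants above this are excluded


@dataclass
class Call:
    """One genotype call (hypothetical record; field names are illustrative)."""
    depth: int          # total reads covering the site
    alt_reads: int      # reads supporting the alternate allele
    gq: int             # genotype quality
    carries_alt: bool   # True for heterozygous or homozygous-alternate calls


def call_passes(call: Call, maf: float) -> bool:
    """Apply the depth, genotype-quality, and allele-fraction filters."""
    min_depth = MIN_DEPTH_RARE if maf < RARE_MAF else MIN_DEPTH_COMMON
    if call.depth <= min_depth:
        return False
    if call.gq < MIN_GQ:
        return False
    # The allele-fraction check only makes sense for non-reference calls.
    if call.carries_alt and call.alt_reads / call.depth < MIN_ALT_FRACTION:
        return False
    return True


def missingness(passed_flags) -> float:
    """Fraction of calls that failed QC for one sample or one variant."""
    flags = list(passed_flags)
    if not flags:
        return 1.0
    return sum(1 for ok in flags if not ok) / len(flags)


def keep(passed_flags) -> bool:
    """Keep a sample or variant unless more than 10% of its calls are missing."""
    return missingness(passed_flags) <= MAX_MISSINGNESS
```

In this sketch the same `keep` helper is applied once per variant (across samples) and once per sample (across variants); the order in which the two passes are run is a design choice that the text does not specify.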

In silico Structural Analysis
The atomic coordinates of the human Lrrk2 C-terminal domain structure (a.a. 1327-2527) were downloaded from the Protein Data Bank (ID 6VP6). The figure was generated using PyMol v.2.4.0.

Statistics
We used Stata® (version 14) for the statistical analysis of sociodemographic and clinical variables. Normally distributed variables are reported as mean with its standard deviation (SD), whereas continuous but non-normally distributed variables are reported as median with the 25th and 75th percentile values (interquartile range, IQR). Normally distributed variables were compared with paired or unpaired t-tests, while non-normally distributed variables were compared with the Mann-Whitney U-test or the Wilcoxon matched-pairs signed-rank test. Frequencies were compared with χ² and Fisher's exact tests. Tests were two-tailed, and significance was set at p < 0.05. Using linear regression, we modeled the association of demographic and clinical variables with disease severity, with UPDRS and MoCA scores as the dependent variables.
For genetic analysis, common and rare variants were analyzed separately. Association of common variants was tested using logistic regression adjusted for age and sex in PLINK v1.9. For the rare-variant analysis, we examined the burden of rare variants in each gene using the optimized sequence kernel association test (SKAT-O) adjusted for age and sex (41). Rare variants were separated into different categories based on their potential pathogenicity to examine specific enrichment in different variant subgroups as described previously (40): (1) variants with a Combined Annotation Dependent Depletion (CADD) score of ≥12.37 (representing the top 2% of potentially deleterious variants) (42); (2) regulatory variants predicted by ENCODE (43); (3) potentially functional variants, including all non-synonymous variants, stop gain/loss variants, frameshift variants, and intronic splicing variants located within two base pairs of exon-intron junctions; (4) loss-of-function variants, which include stop gain/loss, frameshift, and splicing variants; and (5) only non-synonymous variants. Bonferroni correction for multiple comparisons was applied as necessary.
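For readers less familiar with the frequency comparison mentioned above, a two-sided Fisher's exact test on a 2×2 table can be sketched in a few lines. This is an illustrative standard-library implementation, not the Stata routine used in the study; the function name and the tolerance constant are assumptions of this sketch.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins whose probability does not exceed that of the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2

    def table_prob(x: int) -> float:
        # Probability of x events in row 1 given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    observed = table_prob(a)
    # Small relative tolerance guards against floating-point ties.
    p_value = sum(
        table_prob(x)
        for x in range(lowest, highest + 1)
        if table_prob(x) <= observed * (1 + 1e-9)
    )
    return min(1.0, p_value)
```

A call such as `fisher_exact_two_sided(carriers_cases, noncarriers_cases, carriers_controls, noncarriers_controls)` would compare a carrier frequency between two groups; the argument names here are placeholders, not values from the study.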
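The five rare-variant categories listed above overlap, so a single variant can count toward several burden tests. The sketch below shows one way to assign them; the `Variant` record and the simplified effect labels are hypothetical and do not reproduce ANNOVAR's actual output fields.

```python
from dataclasses import dataclass
from typing import Optional, Set

CADD_THRESHOLD = 12.37  # cutoff stated in the text

# Simplified, hypothetical effect labels (not ANNOVAR's exact vocabulary).
LOF_EFFECTS = {"stopgain", "stoploss", "frameshift", "splicing"}
FUNCTIONAL_EFFECTS = LOF_EFFECTS | {"nonsynonymous"}


@dataclass
class Variant:
    effect: str                      # one of the simplified labels above, or any other
    cadd: Optional[float] = None     # CADD score, if available
    encode_regulatory: bool = False  # flagged as regulatory by ENCODE


def categories(variant: Variant) -> Set[str]:
    """Return every rare-variant category the variant belongs to."""
    cats: Set[str] = set()
    if variant.cadd is not None and variant.cadd >= CADD_THRESHOLD:
        cats.add("cadd")            # category 1
    if variant.encode_regulatory:
        cats.add("regulatory")      # category 2
    if variant.effect in FUNCTIONAL_EFFECTS:
        cats.add("functional")      # category 3
    if variant.effect in LOF_EFFECTS:
        cats.add("lof")             # category 4
    if variant.effect == "nonsynonymous":
        cats.add("nonsynonymous")   # category 5
    return cats
```

Under this scheme a missense variant with a high CADD score falls into categories 1, 3, and 5 at once, which is why the number of tests to correct for grows with both the number of genes and the number of categories.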
This study was approved by the Ethics Committee of Hospital San Juan de Dios, Caja Costarricense de Seguro Social (CLOBI-HSJD #014-2015) and the University of Costa Rica (837-B5-304). Written informed consent was obtained from all participants.

Sociodemographic and Clinical Variables
At enrollment, PD probands had a mean age of 62.12 ± 13.51 years (range 25-86), and the mean age at onset was 54.62 ± 13.54 (range 16-83) years. Male PD patients comprised 57.63% of the sample. Although a significantly larger proportion of male PD patients reported current or previous jobs involving agricultural activities (19.40% male, 2.08% female; p = 0.01), the mean number of years of education was significantly higher in men than in women (10.74 ± 3.81 vs. 8.86 ± 4.01; p = 0.03). Table 1 details subjects' baseline characteristics along with the frequency of exposure to main risk and protective factors for PD. Most of the risk and protective factors were more prevalent in men. Tables 1 and 2 detail the frequency of clinical manifestations as well as the standardized scale scores reported for PD cases. The most frequent initial symptoms included resting tremor (71.30%), rigidity (24.07%), and pain (10.19%). Common non-motor manifestations such as hyposmia, sleep disorders, and depressive/anxious mood were seen in more than 50% of the cases. The overall median UPDRS "ON" score was 36; most of our patients were graded in the "2.5" and "3" categories of the H&Y scale, with a median S&E score of 80% (80-90%). The median value for the MoCA test was 22 (17-25). There were no statistically significant differences between sexes in these scores.
We were able to establish through multivariate linear regression modeling that an increased disease duration along with the presence of orthostasis, dysphagia, and mood disorders significantly correlated with increased scores in total ON UPDRS. Furthermore, we found an interaction between performing regular physical activity and duration of disease, where despite having increased years of evolution, patients that performed regular physical activity still scored less in the total ON UPDRS (see Supplementary Figure 1). Additionally, lower scores in MoCA testing significantly correlated with increased age, coffee consumption, and the presence of hallucinations, falls, and mood disorders (depression/anxiety), whereas increased years of education correlated with better MoCA scores (see Supplementary Figure 2).
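The interaction described above can be made concrete with a small ordinary least squares sketch. It is a simplified illustration, not the multivariate model fitted in the study: it includes only disease duration, a binary physical-activity indicator, and their product, and the function names are hypothetical. A negative coefficient on the product term means the score rises more slowly with disease duration in physically active patients.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        # Move the row with the largest pivot into place for stability.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * solution[j] for j in range(i + 1, n))
        solution[i] = (aug[i][n] - tail) / aug[i][i]
    return solution


def fit_interaction_model(duration, active, score):
    """Fit score = b0 + b1*duration + b2*active + b3*(duration*active).

    Returns [b0, b1, b2, b3] from the normal equations (X'X) b = X'y.
    """
    rows = [[1.0, d, a, d * a] for d, a in zip(duration, active)]
    k = 4
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, score)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Synthetic, noise-free data invented purely for illustration.
example_duration = [1, 3, 5, 8, 12, 2, 4, 6, 9, 11]
example_active = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
example_score = [
    20 + 2.5 * d - 4 * a - 1.0 * d * a
    for d, a in zip(example_duration, example_active)
]
example_coefficients = fit_interaction_model(
    example_duration, example_active, example_score
)
```

Because the synthetic scores are generated without noise, the fitted coefficients should match the generating values up to floating-point error; real clinical data would additionally need the other covariates named above and standard errors for inference.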

Quality of Coverage and Identified Variants
The average coverage of the 10 genes analyzed in this study was >588× for all genes.

Rare and Common Variants in PD and Parkinsonism-Related Genes
Burden and SKAT-O analyses did not identify an association of any of the tested genes and PD (Table 3) after correction for multiple comparisons, as expected given the small sample size. We also did not identify any PD patients with potentially damaging homozygous or compound heterozygous variants in any of these genes. Rare variants in LRRK2 were nominally associated with PD, and 11 (9.2%) patients carried a rare non-synonymous variant, compared to four (4.1%) among the controls. Interestingly, six of these rare non-synonymous variants, all located between amino acids p.1620 and 1623 in the C-terminal-of-ROC (COR) domain of Lrrk2, were found in six patients and none in controls (Table 4). Non-synonymous GBA variants were identified in three individuals: p.T369M was identified in a male patient with age at onset of 48 years, p.N370S was identified in a healthy female individual recruited at the age of 78 years, and p.L444P was identified in a healthy female individual recruited at the age of 64. While we cannot rule out that these healthy individuals will develop PD in the future, it is unlikely that GBA variants have a major role in PD among Costa Rican patients. One PD patient carried a pathogenic GCH1 variant, p.K224R, further emphasizing the role of this gene in PD.
In the analysis of common variants, none of the variants was associated with PD after correction for multiple comparisons (Supplementary Table 3), which set the corrected p-value for statistical significance at p < 0.00031. One non-synonymous variant in LRRK2, p.I723V, was found with allele frequency of

Clinical Features
PD prevalence has been increasing over time, with a global age-standardized prevalence rate increase of 21.7% from 1990 to 2016 (44). Furthermore, PD prevalence seems to be lower in Eastern compared to Western countries (45). Few studies have explored the prevalence of PD in Latin America, providing values that are similar either to other developing countries (46) or to European cohorts (47,48). PD also becomes more common with advancing age (44,45). The average age at PD onset and at diagnosis in our sample was lower than in other cohorts (49-51). Although this could suggest that PD presents earlier in Costa Rica, more epidemiological studies are needed, as it could also be related to recruitment bias. The majority of our patients fulfilled Group A Gelb criteria, while up to 60% also reported at least one Group B symptom, the most frequent being dystonia, falls, and dysphagia. The median disease duration for both men and women was 5 years; thus, we would expect to find Group B criteria in these patients as the disease evolves. Few studies have explored ethnic variations in motor symptoms of PD, suggesting increased atypical features in Black and South Asian PD patients (52,53); however, the available evidence is insufficient, and standardized methodology is lacking, to determine motor subtypes across studies and to further establish ethnic patterns of motor features (12). Common non-motor manifestations such as hyposmia, sleep disorders, and depressive/anxious mood were seen in more than half of our PD cases. Regardless of ethnicity, non-motor features are commonly present in PD, with subtle differences described. Gastrointestinal non-motor features along with depression seem to be frequent in East Asian cohorts (54,55). Likewise, Latino populations, such as Mexican (56), Peruvian (57), and ours, also reported a high frequency of mood disorders, including depression and anxiety, when compared to studies from the UK and USA (58,59). We also observed in our sample a frequency of sleep disorders and hyposmia that is higher than those reported in other cohorts (12).
Overall, our patients had a low level of education, which has been previously associated with a higher hazard of incident parkinsonism (60). Reduced education has also been suggested as a risk factor for cognitive impairment in PD (61). A history of non-potable water consumption along with exposure to pesticides and herbicides was reported in up to 40% of our patients. This type of exposure is consistent with a mostly rural origin and the fact that 12.2% of the subjects reported involvement in agricultural activities as a main income source. We did not assess the frequency of protective and risk factors in the control group; hence, we are not able to establish any comparison with PD cases. Previous exposure to pesticides and herbicides is associated with the development of PD (62); yet, the identification of a specific agent and the exact timing and dosing of exposure are almost impossible to establish through observational studies (63,64). Nonetheless, key work detailing specific mechanisms that render patients vulnerable to pesticide-induced injury has been elegantly shown in animal models, further establishing biologic and toxicological pathways for specific chemicals to potentially cause PD (65). A similar situation is present regarding the exposure to welding and heavy metals. Manganese (66), copper, iron (67), and mercury (68) have been proposed as possible agents associated with the development of PD. In this study, 22.1 and 11.6% of the patients reported frequent exposure to welding and other heavy metals, respectively; however, the exact timing and dosing of exposure could not be assessed.
Other literature has underscored the presence of protective factors for PD development, among which the most notable and with the strongest evidence include tobacco (69) and coffee consumption (7,70-73). For both protective factors, a dose effect has also been described, where the protective effect increases with increasing exposure (74,75). Paradoxically, over 90% of our PD cases had been exposed to a protective factor in the past, most of them having a regular coffee intake (two to three cups per day for over 15 years), and yet they all developed PD. Performing regular physical activity correlated with lower ON UPDRS scores in spite of increasing disease duration. Physical activity has been established as a possible protective factor for incident parkinsonism (76); our data would suggest that physical activity could determine reduced severity of disease, specifically concerning motor features. Although exercise has not been proven to slow the progression of akinesia, rigidity, and gait disturbances, it promotes a feeling of physical and mental well-being, and at the same time, it can alleviate rigidity-related pain and improve patients' motor (77) and non-motor symptoms (78).
Increasing age, coffee consumption, hallucinations, falls, and mood disorders along with reduced years of education significantly correlated with worse MoCA scores. Older age and duration of PD are determinant risk factors for incidence of dementia in PD (79). Furthermore, hallucinations have been established as risk factors for cognitive impairment (79,80) along with gait disturbances (manifested by falls) (81) and depression (82). Fewer years of education have also been proposed as a risk factor for cognitive impairment in patients with PD (61). Poor global cognition has been previously associated with a higher risk of incident parkinsonism (60). Coffee consumption has been suggested to reduce the risk of dementia (83) with a dose effect (84,85); however, there have been inconsistent findings regarding the effects of coffee consumption on specific cognitive domains. It has been associated with improved executive performance but smaller hippocampal volume and worse memory function (86); nonetheless, this association is not sustained when cognition is analyzed longitudinally. Other literature suggested that coffee might be slightly beneficial for memory without a dose-response relationship (87). Recent large-scale genetic analysis using Mendelian randomization did not find any evidence supporting any beneficial or adverse long-term effect of coffee consumption on global cognition or memory function (88) or Alzheimer's disease incidence (89). To our knowledge, there is no literature evaluating the effect of coffee consumption on cognition specifically in PD patients. Our findings suggest a possible deleterious effect that should be further explored in this population.

Genetic Assessment
After sequencing familial PD and atypical parkinsonism-associated genes including GBA, SNCA, VPS35, LRRK2, GCH1, PRKN, PINK1, DJ-1, VPS13C, and ATP13A2 and correcting for multiple comparisons, burden and SKAT-O analyses did not show an association of any of the tested genes and PD. We also did not identify any homozygous or compound heterozygous pathogenic variants in any of these genes.
Non-synonymous GBA variants were identified in three individuals, including one patient and two unaffected controls. While we cannot rule out that these healthy individuals will develop PD in the future, it is unlikely that GBA variants have a major role in PD among Costa Rican patients, especially when compared to European and Ashkenazi Jewish populations, in which 8-20% of the patients harbor GBA variants (28).
Finally, one PD patient carried a pathogenic variant, p.K224R, in the GCH1 gene. GCH1 encodes GTP cyclohydrolase 1, which is a key enzyme for dopamine production in nigrostriatal neurons. Loss-of-function mutations such as p.K224R have been shown to cause Dopa-responsive dystonia (DRD); however, variants in this gene have also been implicated in PD, perhaps through regulation of GCH1 expression (90,91). It has been suggested that late-onset DRD might present clinically with parkinsonism, or alternatively, pathogenic GCH1 mutations may predispose to both diseases, and carriers will develop either or both depending on other genetic or environmental factors (92). Our patient did not present clinical features suggestive of DRD and did not have any family history of PD.
Rare variants in LRRK2 were nominally associated with PD; six of these rare non-synonymous variants, observed only in affected individuals, were located between amino acids p.1620 and 1623 in the COR domain of Lrrk2. LRRK2 encodes a multiple domain protein that includes a Roc-COR tandem domain, a tyrosine kinase-like protein kinase domain, and at least four repeat domains located within the N-terminal and C-terminal regions. The Roc-COR domain classifies the Lrrk2 protein as part of the ROCO superfamily of Ras-like G proteins (93). Mutations in LRRK2 are the most common cause of late-onset hereditary PD. The most frequently reported disease-causing mutations are located in the kinase domain (e.g., G2019S), increasing kinase activity, and in the Roc-COR tandem domain (e.g., R1441C/G and Y1699C), impairing its GTPase function. Alterations of both kinase and GTPase activity may mediate neurodegeneration in these forms of PD (94). Of the six patients found to have non-synonymous variants in the COR domain, two had first-degree relatives with dementia, one had a second-degree relative with PD, and one had two sisters with PD diagnosed at a very young age (20 and 30 years old) (see Table 4).
Methodological issues, such as size and composition of the sample (i.e., number of familial and sporadic cases), might explain the variations seen in the frequency of LRRK2 mutations in case series from similar countries. However, there is a clear difference among geographical regions, where North African Arabs (95), Ashkenazi Jews (96), and certain European cohorts (97-99) might report a higher prevalence of these mutations than Latin American and Asian populations (15,100,101).

Structural Analysis of LRRK2 Pathogenic Mutations
The non-synonymous missense mutations described here are all found in the COR domain of Lrrk2. To gain insight into how these mutations may affect the function of Lrrk2, we investigated their locations in the structure of Lrrk2. The high-resolution cryoelectron microscopy (cryoEM) structure of the C-terminal domains of Lrrk2 in different states has recently been reported and sheds light on how allosteric interactions between different domains regulate microtubule interactions (102). The structure shows interactions between the ROC GTPase domain and the COR-B domain, notably involving the pathogenic mutation sites p.Arg1441 and p.Tyr1699 (Figure 1A). These interdomain interactions enable the kinase activity to be regulated by GTP binding to the ROC domain. The mutations described here, found in the segment a.a. 1619-1623, are all located in a loop of the COR-A domain. This loop, which spans a.a. 1613-1624, is disordered in the cryoEM structure, and thus, no atomic resolution model is available for that segment (Figure 1A). It is therefore not possible to gain detailed insights into the effect of each individual missense mutation.
However, integrative modeling, based on cryoelectron tomography (cryoET) data collected from in situ and in vitro-reconstituted Lrrk2 filaments bound to microtubules, shows how the different domains of Lrrk2 dimerize and associate with microtubules (102,103). Dimerization is mediated via two sites through reciprocal interactions: one involving WD40-WD40 interactions and another one involving COR-COR interactions. These interactions enable Lrrk2 C-terminal domains to form extended oligomeric filaments that form a helix around the microtubule. Of particular interest here, the COR-COR dimerization interface involves both the COR-A and COR-B domains, with the loop containing a.a. 1613-1624 at the center of this interface (Figure 1B). Mutations in this loop may thus affect dimerization. Given that the kinase activity and conformation affect the ability of Lrrk2 to dimerize through the COR domain via allosteric interactions, it is possible that mutations in the COR-A loop in turn affect the kinase activity. Further experiments would be required to determine how the mutations described here affect the kinase, dimerization, and microtubule-binding activity of Lrrk2.

LIMITATIONS
Genome analysis from Mestizo populations in Latin America has previously shown in Costa Rica a European, Native American, and African admixture of 66.7, 28.7, and 4.6%, respectively (104). Therefore, we would have expected to observe a higher frequency of mutations, similar to other European series reported. However, our sample size is small and is more representative of the metropolitan area where most of the patients were recruited, thus warranting in the future a more comprehensive study involving a wider and more representative population of the whole country, particularly including more patients from the non-metropolitan and coastal zones. Moreover, the purpose of our study was to serve as an exploratory analysis in this population, which had not been studied before; likewise, we opted to cover as many genes as possible. We are aware that the sample size is limited, yet underrepresented populations with limited funding and resources that struggle to achieve large sample sizes should be studied and reported as well.
We did not gather information concerning protective and risk factors for subjects in the control group; therefore, we were not able to compare and discuss the frequency of these factors between cases and controls.

CONCLUSIONS
This is the first study that reports on sociodemographics, risk factors, clinical presentation, and genetics of Costa Rican patients with PD. We observed a high frequency of exposure to both risk factors (pesticides, herbicides, non-potable water, and low education) and protective factors (tobacco and coffee intake). Regular physical activity significantly correlated with better UPDRS scores despite years of evolution of the disease. Increased years of education were significantly associated with better MoCA test scores, whereas the presence of hallucinations, falls, and mood disorders correlated with a worse performance in the MoCA test. Interestingly, coffee consumption also correlated significantly with worse MoCA test scoring.
We did not find an association between any of the tested familial PD and atypical parkinsonism-associated genes, including GBA, SNCA, VPS35, LRRK2, GCH1, PRKN, PINK1, DJ-1, VPS13C, and ATP13A2, and PD. We also did not identify any homozygous or compound heterozygous pathogenic variants in any of these genes. Rare variants in LRRK2 were nominally associated with PD, with six of these rare non-synonymous variants all located in the COR domain of LRRK2. One PD patient carried a pathogenic GCH1 variant, p.K224R, further emphasizing the role of this gene in PD.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the Ethics Committee of the University of Costa Rica (VI-3668-2014 and final report 837-B5-302). The patients/participants provided their written informed consent to participate in this study.

AUTHOR CONTRIBUTIONS
GT-A and EY conceptualized the report and made substantial contributions to the design, drafting, and revision of the work. J-FT performed in silico structural analysis and contributed to the discussion of these results. TL-P, JR-M, AG-P, ZG-O, KC-C, IF-M, and JF-T significantly contributed to drafting and critically reviewing the paper. All authors have contributed to the work and agree with the presented findings and that the work has not been published before nor is being considered for publication in another journal. All authors approved the final version of the manuscript and assume accountability for all aspects of the work.