Parkinson's Disease Subtypes Identified from Cluster Analysis of Motor and Non-motor Symptoms

Parkinson's disease is now considered a complex, multi-peptide, central, and peripheral nervous system disorder with considerable clinical heterogeneity. Non-motor symptoms play a key role in the trajectory of Parkinson's disease, from prodromal premotor to end stages. To understand the clinical heterogeneity of Parkinson's disease, this study used cluster analysis to search for subtypes from a large, multi-center, international, and well-characterized cohort of Parkinson's disease patients across all motor stages, using a combination of cardinal motor features (bradykinesia, rigidity, tremor, axial signs) and, for the first time, specific validated rater-based non-motor symptom scales. Two independent international cohort studies were used: (a) the validation study of the Non-Motor Symptoms Scale (n = 411) and (b) baseline data from the global Non-Motor International Longitudinal Study (n = 540). k-means cluster analyses were performed on the non-motor and motor domains (domains clustering) and the 30 individual non-motor symptoms alone (symptoms clustering), and hierarchical agglomerative clustering was performed to group symptoms together. Four clusters are identified from the domains clustering supporting previous studies: mild, non-motor dominant, motor-dominant, and severe. In addition, six new smaller clusters are identified from the symptoms clustering, each characterized by clinically-relevant non-motor symptoms. The clusters identified in this study present statistical confirmation of the increasingly important role of non-motor symptoms (NMS) in Parkinson's disease heterogeneity and take steps toward subtype-specific treatment packages.


INTRODUCTION
Parkinson's disease (PD) is classically considered a motor disorder, with resting tremor, rigidity, bradykinesia, and postural instability and gait disorder as its core features. However, the concept of PD has changed considerably in the last few years, now prompting a revision of its diagnostic criteria to include non-motor symptoms (NMS) in the core parameters (Postuma et al., 2015;Marras and Chaudhuri, 2016). There has been growing recognition that NMS in PD are caused by neurotransmitter pathway dysfunctions which involve both the central and peripheral nervous systems (Jellinger, 2012;Gjerløff et al., 2015). The significant clinical heterogeneity of NMS in PD suggests the existence of specific non-motor subtypes (Marras and Chaudhuri, 2016;Sauerbier et al., 2016).
Previous cluster analyses have already identified motor-and non-motor-based clusters in PD patients (e.g., van Rooden et al., 2011;Erro et al., 2013;Ma et al., 2015;Pont-Sunyer et al., 2015). Recently, it has been argued that the recent concept of non-motor endophenotypes of PD provides a stronger basis for subtyping, since these relate to the central pathophysiology of specific neurotransmitter systems and are therefore likely to remain stable over time (Marras and Chaudhuri, 2016). As such, several studies have explored PD subtypes while considering motor subtypes and their association with non-motor aspects of the disease such as, psychopathology and cognition (Graham and Sagar, 1999;Reijnders et al., 2009;Selikhova et al., 2009;Burn et al., 2012;Flensborg Damholdt et al., 2012), REM sleep behavior disorder (Romenets et al., 2012), and daily visual activities (Seichepine et al., 2011). To our knowledge, however, no studies have used cluster analysis techniques to examine subtypes present in NMS only.
In this study, we used cluster analysis techniques to search for PD subtypes from a large, multi-center, international, and well-characterized cohort of patients across all stages, using a combination of motor cardinal features (bradykinesia, rigidity, tremor, axial signs) and comprehensive NMS assessed using specific validated rater-based scales. We believe this is the largest study of its size with these characteristics, and the first to focus on exclusively NMS-based phenotyping.

Design
Data from two independent international studies were used in the analysis: the validation study of the Non-Motor Symptoms Scale (NMSS) (n = 411) (Martinez-Martin et al., 2009a) and baseline data from the global Non-Motor International Longitudinal Study (NILS) (n = 540) (Ray Chaudhuri et al., 2013). NILS has been adopted as a national study by the National Institute of Health Research in the UK (UKCRN No: 10084) and is a 5-year follow-up study addressing the range, nature, and natural history of NMS in PD across all motor stages. All data in NMSS and NILS have been anonymized and entered into a secure database at the National Center of Epidemiology, Carlos III Institute of Health, Madrid, Spain.

Patients
PD patients diagnosed according to internationally recognized criteria (Gibb and Lees, 1988;Lees et al., 2009) were included, and represented a mixed cohort of drug-naïve and treated PD across all disease stages. For the NMSS study, patients were older than 30 years, but for inclusion of NILS patients there was no age limit. Exclusion criteria were: inability to read, understand, or answer written questionnaires; comorbidity, sequelae, or any disorder interfering with the assessment of PD; and inability to give informed consent. Patient recruitment was carried out across 15 countries in America, Asia, and Europe from 2007 to 2011.

Standard Protocol Approvals, Registrations, and Patient Consent
The NMSS validation study received ethical approval from the Carlos III Institute of Health, Madrid, Spain, and local research ethics committees (Martinez-Martin et al., 2009a). The NILS is included in the UK Department of Health portfolio of approved studies (UK CRN portfolio Nr. 10084) and has been approved at all relevant institutions and corresponding ethics committees/institutional review boards. All patients gave written informed consent before inclusion in accordance with the Declaration of Helsinki.

Statistical Analysis
SCOPA-Motor examination items were aggregated to obtain four "cardinal motor signs": tremor (items 1 and 2), bradykinesia (item 3), rigidity (item 4), and axial signs (items 5 to 10). Additionally, an aggregate "motor complications" variable was obtained from the sum of items 18 to 21 (dyskinesias and motor fluctuations). All variables were standardized before clustering, and unstandardized afterwards for interpretation. Analyses were conducted in R version 3.2.4 (www.r-project.org) and Stata version 14 (http://www.stata.com/).

Cluster Analysis
k-means was used for cluster analysis. We performed two analyses on the data: the first clustering on the nine aggregate non-motor symptom domains, the four cardinal motor signs (tremor, bradykinesia, rigidity, axial), and motor complications, henceforth the "domains clustering, " and the second on the 30 individual NMS of the NMSS only, henceforth the "symptoms clustering." Average-linkage hierarchical agglomerative clustering on the 30 NMS, 4 motor signs, and motor complications was also performed to observe the grouping of the variables. Various formal measures were used to determine the optimal number of clusters for the dataset. For the domains clustering, the optimal k according to the Gap Statistic and the 1-standarderror method (Tibshirani et al., 2001) was k = 4 (Supplementary Figure 1A). Other cluster determination methods suggested k = 2, 3, 4, where k = 2, 3 simply divided the data uninformatively into groups with varying levels of overall disease severity. Thus, k = 4 was selected to offer a good combination of model fit and parsimony. The same method was applied for the symptoms clustering, where the number of clusters was k = 6 (Supplementary Figure 1B).

Comparative Subgroup Analysis
For each variable in both clusterings, we used one-way ANOVA and χ 2 tests to, respectively, check the equality of variable means and proportions across the clusters found, using Bonferroni correction for multiple testing with corrected p < 0.05 considered significant. Differences among pairwise clusters were tested post-hoc using Tukey's range test for continuous means, or pairwise χ 2 tests for proportions, with Bonferroni correction both for the within-variable pairwise tests and the multiple variable comparisons.
To compare the domains and symptoms clusterings, we depicted cluster alignment with a contingency table, and computed the adjusted rand index (ARI) (Hubert and Arabie, 1985) to evaluate similarity between the two clusterings.
Lastly, to explore the relationship between symptom severity and disease duration, we computed the correlation of each variable with disease duration and fitted smoothed loess curves to the data both globally and for each cluster in the domains clustering.

Study Sample
Out of the 951 patients in the study, we used list wise deletion to exclude 47 patients due to missing measurements, resulting in 904 remaining patients. There were no significant differences between the included and excluded groups with respect to age, sex, disease duration, and HY (χ 2 ≥ 0.19). The characteristics of the sample included for analysis (n = 904) are displayed in Table 1. Patients were predominantly male (62.17%). 13.38% were in HY stage 1; 43.36% in stage 2; 29.65% in stage 3; 11.50% in stage 4; and 2.10% in stage 5.

Domains Clustering
Results from the k-means clustering on the nine non-motor domains, the four cardinal motor signs, and motor complications are reported in Table 2 along with additional variables not used in the analysis (heatmap in Figure 1; boxplots in Supplementary Figure 2). Cluster means for all variables were found to be statistically significantly different except for age at disease onset and sex (adjusted p < 0.05). Specific pairwise differences are noted in the table.
Cluster D1 (n = 428) patients were mildly affected in all domains. This cluster was characterized by relatively lower disease durations and ages.
Cluster D2 (n = 180) patients were severely affected in non-motor domains but mildly affected in motor domains. This cluster had a severity of motor variables relatively similar to the cluster D1 (mild) subtype especially in tremor, but expressed significantly higher scores for non-motor domains than clusters D1 and D3, especially in the sleep/fatigue, mood/apathy, urinary, and miscellaneous domains. Except for motor complications, scores for every variable were statistically significantly different from those in cluster D3.

Symptoms Clustering
k-means performed on the 30 individual NMS found six clusters ordered according to increasing CISI-PD score ( Table 3, heatmap in Figure 2). Means of all symptoms were found to differ across clusters except for disease onset, sex, and tremor, with pairwise differences again noted in the table.
Cluster S1 (n = 456), the largest cluster representing 50% of the group, was similar to domains cluster D1, and was composed of patients relatively mildly affected in all NMS. Cluster S2 (n = 201) had higher mean symptom scores than cluster S1's in several cases, including restless legs syndrome (RLS), swallowing, and the miscellaneous domain, but could nonetheless be classified as a mild/moderate cluster.
Although clusters S3-S6 increased in motor and overall disease severity, they varied significantly in their non-motor expression and each expressed a unique subset of NMS. These groups of NMS aligned well with the established nonmotor domains. Cluster S3 (n = 100) mainly expressed domain 7 (urinary), while cluster S4 (n = 73) was affected severely in domain 3 (mood/apathy). Cluster S5 (n = 54) showed severe impact in most NMS but especially in domain 5 (attention/memory). Similarly, cluster S6 (n = 20) had severe scores across all NMS and motor features, but was most severely affected in the cardiovascular, perception/hallucination, and gastrointestinal NMSS domains. Overall, the symptoms clustering fragmented the domains clusters into smaller groups, as explored in the next section. Table 4. While S1 grouped patients from D1 (mild) and D3 (motor-dominant), and D4 (severe) showed a dominant contribution from S5 (severe non-motor) and S6 (severe motor and non-motor), the remaining clusters were fragmentarily distributed, as indicated by the low similarity between the clusterings (ARI = 0.32). For the domains clustering, patients in clusters mildly affected in non-motor domains (D1, D3) were distributed among the milder symptoms clusters (S1-S4, skewed left). Conversely, patients in FIGURE 1 | Heatmap of variables for each cluster in the domains clustering, separated by white lines according to nine non-motor domains, four cardinal motor features, motor complications, and four general variables not included in the analyses. Since symptoms have different scales, cluster means for each symptom are displayed as standardized scores relative to each overall symptom mean. clusters with severe NMS (D2, D4) were split among the various specific non-motor-dominant clusters (S2-S6), suggesting that the symptoms clustering is clinically more specific than the domains clustering.

Hierarchical Clustering on Variables
Hierarchical clustering on the 30 NMS and the four cardinal motor signs is depicted in Figure 3. Symptoms belonging to the same domain of the NMSS tended to cluster together, with some exceptions. Diplopia (domain 4) was grouped closer to domain 8 (sexual) symptoms than to symptoms in its own domain. Similarly, RLS (domain 2) was closer to domain 9 (miscellaneous) symptoms, and drowsiness (domain 2) with domain 5 (attention/memory) symptoms. Notably, tremor was the most isolated symptom, occupying a single branch at the top of the tree.

Correlation Analysis
Due to high variance, most variables had little to no correlation with disease duration (Supplementary Figure 3). In Figure 4, we plotted 4 variables especially relevant to the domains clustering against disease progression: CISI-PD, Tremor, Anxiety, and Depression. Notable differences in disease progression for each cluster can be seen in the scatterplots: for example, patients in NMS dominant cluster D2 actually tended to have higher scores for anxiety and depression at disease onset, decreasing with increasing disease duration.

DISCUSSION
We believe that this is the largest cluster analysis-based study of PD-related motor and NMS from a large, international, multicenter cohort. Previous cluster-analysis based studies have either focused on early/untreated Parkinson's disease (Erro et al., 2013;Pont-Sunyer et al., 2015) or lack detailed assessments based on the severity and frequency of non-motor domains and symptoms (van Rooden et al., 2011). Additionally, we believe this is the first study to perform cluster analysis exclusively on NMS to reveal NMS-specific subtypes.
The domains clustering's four clusters closely correspond with several previous studies (van Rooden et al., 2011;Erro et al., 2013;Ma et al., 2015;Pont-Sunyer et al., 2015), especially those reported by van Rooden et al. (2011). Both clusters D1 (mild) and D4 (severe) are groups which are present in most analyses, but unlike van Rooden et al., our data show that mean differences in disease duration do exist between mild and severe subtypes. Clusters D2 and D3 represent a divergence in symptomatic expression: D2 representing a non-motor dominant phenotype also described in many clinical phenotype-driven studies (Sauerbier et al., 2016), and D3 corresponding to the traditional motor-dominant view of PD. Due to these clusters' similar overall PD severity (CISI-PD) and duration, differences in disease progression do not explain the differences between D2 and D3. Finally, the high incidence of tremor in D3, even higher than D4, is interesting and reflects not only the motor-dominant subtype of van Rooden et al. but also the tremor-dominant/slowprogression cluster described by Ma et al. (2015).
Our correlation analysis demonstrates notable differences in disease progression among these clusters. The high initial depression and anxiety scores for cluster D2 suggests that patients susceptible to NMS-dominant PD can be identified by high NMS scores early after disease onset. Furthermore, the general improvement in depression and anxiety scores for this cluster contrasts with the relatively stable scores in clusters D1, D3, and D4.
FIGURE 2 | Heatmap of variables for each cluster in the symptoms clustering, separated by white lines according to 30 individual non-motor symptoms and variables not included in the analysis (cardinal motor features, motor complications, and other general variables). Since symptoms have different scales, cluster means for each symptom are displayed as standardized scores relative to each overall symptom mean.
specific marker in non-motor dominant clusters and disease progression. Cluster S4, characterized by high mood/apathy symptoms, is consistent with the sleep and apathy clinical phenotypes described by other studies (Sauerbier et al., 2016). Clusters S5 and S6 are of clinical interest, as in these clusters NMS dominate, overshadowing motor symptoms with an emphasis on cognitive impairment in S5 and autonomic (cardiovascular and gastrointestinal) symptoms in S6. Overall, many of these subtypes are newly reported and their characteristics support clinical endophenotyping of non-motor subtypes not reported in previous studies. The comparison between the domains and symptoms clustering shown in the contingency table (Table 4) suggests that the broader subset of cluster S1, a mild non-motor dominant cluster, essentially expresses two NMS subtypes, one of them with motor symptoms. The low numbers observed in some cells do not allow consistent clinical interpretation. The hierarchical clustering (Figure 3) indicates that the symptoms grouping in the NMSS dimensions works as expected, as most items in each domain group together, with the exception of tremor.
What are the clinical implications of these clusterings? First, our analysis represents statistical conformation of NMSdominant presentation of PD. The specific expression of several NMS domains such as, mood/anxiety, sleep/fatigue, cognition, and urinary function suggests that these subgroups may have different patterns of neurodegeneration involving the brain's various non-dopaminergic pathways, possibly in excess of dopaminergic degeneration, as suggested by several authors (Jellinger, 2012). Second, clinical recognition of subtypes using ad hoc criteria would allow for the development of truly subtype-specific treatment packages for PD (Marras and Chaudhuri, 2016). Third, clinical characterization of these groups will allow studies of natural history of specific subtypes.
The clinical non-specificity of D1, S1, D4, and S6, with extremely diverse disease durations and severities, contrasts with the precisely characterized clusters S3, S4, and S5, with dominant expression of urinary, mood/apathy and attention/memory symptoms, respectively, at intermediate stages of the disease. This pattern is in line with the notion of "phenotypic convergence" proposed by Warren et al. (2013)  Like any cohort-based, cluster-analysis driven study, there are several limitations of this analysis. Due to the data collection methods of the two studies used, selection due to prevalence bias, i.e., sample overrepresentation of patients with higher survival, is unlikely to explain this clustering; however, clustering at early PD stages may have been undermined by poorly recorded symptoms prior to diagnosis. Furthermore, we did not report a control group, although our intention was not to describe the symptoms as discriminant from normal subjects. Lastly, in the treated patients in our sample, NMS symptoms could be influenced  by dopaminergic therapy, including depression via pramipexole (Barone et al., 2010), sleep disorders via rotigotine (Trenkwalder et al., 2011), and others. The numbers of patients undergoing specific treatments are too small to conduct meaningful analyses of the effects of such therapies; we expect, however, that such effects on single non-motor components do not significantly alter the trends observed in total NMSS scores across all treated and drug-naïve patients in our sample.
Conversely, our study has several notable strengths: (1) the sample size, which to our knowledge is the largest international sample in this kind of study; (2) the inclusion of patients in all disease stages; and (3) the use of detailed assessments both for motor and NMS.
In conclusion, we present statistical confirmation of the growing recognition of NMS-dominant presentation of PD and its heterogeneity. The clinical recognition of these subtypes could allow for subtype-specific treatment packages for PD, and in the future, clinical characterization of these groups will allow for studies of natural history of the various non-motor dominant clusters identified in this paper. Translating results to clinical management or experimental designs would require the identification of inclusion and exclusion criteria of patients into specific subgroups.

AUTHOR CONTRIBUTIONS
JM conducted the statistical analysis and interpretation of the data, drafted the methods and results section of the paper, and revised the manuscript for content. KC obtained the data, designed the study, and drafted and revised the introduction and conclusion sections of the paper. CB, JdP, and PL revised the manuscript for content and contributed to the analysis and interpretation of data. PM obtained the data, designed the study, and drafted and revised the introduction and conclusion sections of the paper.

FUNDING
This study was funded by the Spanish Ministry of Economy and Competitiveness through the Cajal Blue Brain (C080020-09; the Spanish partner of the Blue Brain initiative from École Polytechnique Fédérale de Lausanne) and TIN2016-79684-P projects, the Regional Government of Madrid through the S2013/ICE-2845-CASI-CAM-CM project, the European Union's Horizon 2020 research and innovation programme under grant agreement No. 720270, and the National Institute of Health Research in the UK (UKCRN No: 10084).