Skip to main content

METHODS article

Front. Aging Neurosci., 08 October 2020
Sec. Neurocognitive Aging and Behavior
Volume 12 - 2020 |

Phenotype-Agnostic Molecular Subtyping of Neurodegenerative Disorders: The Cincinnati Cohort Biomarker Program (CCBP)

Andrea Sturchio1 Luca Marsili1 Joaquin A. Vizcarra1 Alok K. Dwivedi2 Marcelo A. Kauffman3,4 Andrew P. Duker1 Peixin Lu5,6 Michael W. Pauciulo7 Benjamin D. Wissel1,5 Emily J. Hill1 Benjamin Stecher1 Elizabeth G. Keeling1 Achala S. Vagal8 Lily Wang8 David B. Haslam9 Matthew J. Robson10 Caroline M. Tanner11 Daniel W. Hagey12 Samir El Andaloussi12 Kariem Ezzat12 Ronan M. T. Fleming13 Long J. Lu4 Max A. Little14,15 Alberto J. Espay1*
  • 1James J. and Joan A. Gardner Family Center for Parkinson’s disease and Movement Disorders, Department of Neurology, University of Cincinnati, Cincinnati, OH, United States
  • 2Division of Biostatistics and Epidemiology, Department of Biomedical Sciences, Paul L. Foster School of Medicine, Texas Tech University Health Sciences Center, El Paso, TX, United States
  • 3Consultorio y Laboratorio de Neurogenética, Centro Universitario de Neurología “José María Ramos Mejía” y División Neurología, Hospital JM Ramos Mejía, Facultad de Medicina, Universidad de Buenos Aires, Buenos Aires, Argentina
  • 4Programa de Medicina de Precision y Genomica Clinica, Instituto de Investigaciones en Medicina Traslacional, Facultad de Ciencias Biomédicas, Universidad Austral– Consejo Nacional de Investigaciones Científicas y Técnicas de Argentina, Pilar, Argentina
  • 5Division of Biomedical Informatics, Cincinnati Children’s Hospital Medical Center, Department of Pediatrics, University of Cincinnati, Cincinnati, OH, United States
  • 6School of Information Management, Wuhan University, Wuhan, China
  • 7Division of Human Genetics, Cincinnati Children’s Hospital Medical Center, Department of Pediatrics, University of Cincinnati, Cincinnati, OH, United States
  • 8Department of Radiology, University of Cincinnati Medical Center, Cincinnati, OH, United States
  • 9Division of Infectious Diseases, Center for Inflammation and Tolerance, Cincinnati Children’s Hospital Medical Center, Cincinnati, OH, United States
  • 10Division of Pharmaceutical Sciences, James L. Winkle College of Pharmacy, University of Cincinnati, Cincinnati, Cincinnati, OH, United States
  • 11Department of Neurology, Weill Institute for Neurosciences, Parkinson’s Disease Research Education and Clinical Center, San Francisco Veteran’s Affairs Medical Center, University of California, San Francisco, San Francisco, CA, United States
  • 12Department of Laboratory Medicine, Clinical Research Center, Karolinska Institutet, Stockholm, Sweden
  • 13Analytical Biosciences, Division of Systems Biomedicine and Pharmacology, Leiden Academic Centre for Drug Research, Leiden University, Leiden, Netherlands
  • 14School of Computer Science, University of Birmingham, Birmingham, United Kingdom
  • 15Media Lab, Massachusetts Institute of Technology, Cambridge, MA, United States

Ongoing biomarker development programs have been designed to identify serologic or imaging signatures of clinico-pathologic entities, assuming distinct biological boundaries between them. Identified putative biomarkers have exhibited large variability and inconsistency between cohorts, and remain inadequate for selecting suitable recipients for potential disease-modifying interventions. We launched the Cincinnati Cohort Biomarker Program (CCBP) as a population-based, phenotype-agnostic longitudinal study. While patients affected by a wide range of neurodegenerative disorders will be deeply phenotyped using clinical, imaging, and mobile health technologies, analyses will not be anchored on phenotypic clusters but on bioassays of to-be-repurposed medications as well as on genomics, transcriptomics, proteomics, metabolomics, epigenomics, microbiomics, and pharmacogenomics analyses blinded to phenotypic data. Unique features of this cohort study include (1) a reverse biology-to-phenotype direction of biomarker development in which clinical, imaging, and mobile health technologies are subordinate to biological signals of interest; (2) hypothesis free, causally- and data driven-based analyses; (3) inclusive recruitment of patients with neurodegenerative disorders beyond clinical criteria-meeting patients with Parkinson’s and Alzheimer’s diseases, and (4) a large number of longitudinally followed participants. The parallel development of serum bioassays will be aimed at linking biologically suitable subjects to already available drugs with repurposing potential in future proof-of-concept adaptive clinical trials. Although many challenges are anticipated, including the unclear pathogenic relevance of identifiable biological signals and the possibility that some signals of importance may not yet be measurable with current technologies, this cohort study abandons the anchoring role of clinico-pathologic criteria in favor of biomarker-driven disease subtyping to facilitate future biosubtype-specific disease-modifying therapeutic efforts.


We have long assumed that the neuropathological findings of aggregated α-synuclein (α-syn) into Lewy bodies and Lewy neurites define and cause Parkinson’s disease (PD) and that aggregations of amyloid (Aβ) into plaques and tau into neurofibrillary tangles define and cause Alzheimer’s disease (AD), and that the distribution of these proteins explains their clinical heterogeneity (Espay et al., 2020). These pathological findings are, however, ubiquitous and do not correlate with agnostic post-mortem analysis: α-syn, Aβ, and tau aggregation are frequent “co-pathologies” in AD and PD (Irwin et al., 2017; Boyle et al., 2018; Karanth et al., 2020) and can be found even in super-survivors without dementia or parkinsonism (Head et al., 2009; Wallace et al., 2019). The overlapping pathological features may instead reflect clinical characteristics shared by PD and AD (Scarmeas et al., 2004, 2005; Kehagia et al., 2010). Indirect evidence from human studies suggest protein aggregation in sporadic cases may in fact be protective and not capable of discriminating clinical disease subtypes (Espay et al., 2019). As a result, it has become imperative to transition from the century-old, clinico-pathological convergent model on which diseases are classified to a systems biology framework, in which genotype and biomolecular abnormalities, rather than clinical phenotypes alone, define nosology and drive therapeutics (Espay et al., 2017).

Given these premises, we have recently launched at the University of Cincinnati’s James J. and Joan A. Gardner Center for Parkinson’s Disease and Movement Disorders, a phenotype-agnostic biomarker-discovery program aimed at characterizing biological subtypes of neurodegenerative disorders, particularly those best suited for targeting with therapies available for repurposing. This cohort study has unique features compared to ongoing [e.g., Parkinson’s Progression Marker Initiative (PPMI)] or newly assembled cohorts [e.g., Luxembourg study, Personalized Parkinson Project (PPP)] (Table 1) (Kenneth et al., 2011; Hipp et al., 2018; Bloem et al., 2019). The main novelty for our cohort study is a design based on the assumption we do not know which biomarkers have clinical relevance at the individual level. Accordingly, the recruitment will be deliberately inclusive of different neurodegenerative phenotypes with the expectation that biological subtypes may not align with clinico-pathological subtypes.


Table 1. Comparison between established biomarker-development cohorts.

Here we summarize the methodological aspects of this cohort study, including phenotypic measures and analytic approach, and discuss anticipated challenges.

The Cincinnati Cohort Biomarker Program

This is an omics-based, longitudinal, structural causal model, non-phenotype-driven population-based study. We will enroll a total of 4,000 patients with neurodegenerative diseases and 1,000 healthy age-matched controls with yearly follow-up for at least 5 years, extended to 10 and beyond contingent on additional funding. At each visit, patients will undergo a similar clinical, paraclinical, and biospecimen collection. Pragmatic approaches such as streamlining data gathering (prioritizing biospecimen collection) will be allowed if important to retain subjects and minimize dropouts. The exploratory nature of this study rendered it unsuitable for funding considerations by agencies giving continued preference for hypothesis-based studies based on the prevailing clinico-pathologic model of neurodegenerative diseases, which remains the gold standard for nosology, biomarker validation, and disease modification. As a result, this study was funded through philanthropy, with major support by the James J. and Joan A. Gardner Family Foundation. The main aim is to identify biological outliers defining molecular disease subtypes, with a focus on those suitable for targeting with already available therapies (repurposing) in future built-in adaptive clinical trials.

Inclusion and Exclusion Criteria

Given the inclusive nature of the study, we are recruiting subjects older than 18 years of age exhibiting a range of parkinsonisms representing PD and PD-like disorders, such as progressive supranuclear palsy, multiple system atrophy, and corticobasal syndrome, as well as AD and AD-like disorders, such as frontotemporal dementias, normal pressure hydrocephalus, and vascular dementia. The enrollment of young subjects could help in the identification of early biomarkers in specific conditions (e.g., genetic). However, our enrollment will be initially focused on the elderly population seeking care at the University of Cincinnati Gardner Center, which receives referrals from a wide range of Cincinnati-area neurologists. The Center evaluates a representative population of neurodegenerative disorders seeking care in the Cincinnati area. We will also recruit age- and sex-matched healthy controls. Controls that during the study assessment manifest signs of neurological disease will be shifted as cases.

Although “Cases” and “Controls” are determined by virtue of the presence or absence of neurological symptoms, respectively, our inclusion criteria for neurodegenerative disorders are otherwise deliberately inclusive, based on the premise that we do not a priori know in which clinical phenotypes will the first targetable molecular subtypes be identified. As noted in Section “Data Analysis and Management,” none of proposed analysis will use the classification of participants into cases or controls, nor any phenotypic subtype created therein, as independent variables. Nevertheless, all participants will be referred by a neurologist to make sure they present specific signs of parkinsonism or dementia. In case of doubt, the Principal Investigator will decide if the subjects fit inclusion criteria. Only subjects with recognized causes or contributors for their motor or cognitive manifestations (e.g., vitamin B12 deficiency) and those requiring aggressive medical management will be excluded.

Ethics, Collection, and Storage of Biological Samples

The study protocol was approved by the Institutional Review Board of the University of Cincinnati (protocol number 2020-0039). Informed consent is obtained from all subjects with the conduct of the study fully adhering to the principles of the Declaration of Helsinki. Biospecimens will be collected from subjects and healthy controls, including peripheral blood, urine, and stool.

Plasma is being isolated from blood collected in EDTA vacutainers and aliquoted for future use, including isolation of plasma proteins and extracellular vesicles (EVs). As all cells secrete EVs, they are abundant in all bodily fluids and have been shown to carry diverse species of nucleic acids, proteins, and lipids (van Niel et al., 2018). Plasma will be subjected to size exclusion chromatography (70 nm qEV original, Izon Science) to separate EVs from soluble proteins. The EVs present in each sample will be quantified using nanoparticle tracking analysis (NanoSight NS300, Malvern Panalytical), and their surface proteins characterized by a flow cytometry method optimized for vesicle analysis (Wiklander et al., 2018). Following isolation, we will extract RNA and sequence the mRNA present within these vesicles using methods developed for single-cell RNA-sequencing. In order to amplify the most informative signals in total EVs mRNA, we will utilize known neuron, astrocyte and oligodendrocyte cell surface markers using immunoprecipitation (Miltenyi Biotec).

A urine sample is being collected in a sterile kit during in-clinic visits. Stool samples are aliquoted into preservative containers (OMNIgene.Gut, DNA Genotek, Corp.) immediately after passage. Samples are transferred to −80° storage within 72 h. DNA is subsequently extracted from 0.25 gm stool using the PowerFecal Pro extraction kit (Qiagen, Inc.). DNA sequencing libraries will be constructed (Nextera XT, Illumina, Corp.) and pooled for sequencing on an Illumina sequencing machine (NextSeq500, Illumina, Corp.). Sequencing reads will be aligned to a microbial genome database using Kraken (Wood and Salzberg, 2014) to determine the assemblage of microorganisms present in each fecal sample (Quigley, 2017). Biospecimens are processed and aliquoted for downstream use consistent with the strategy of future use/sharing of the samples. All sample meta-data are tracked via the DT Biobank’s LIMS system to catalog the chain of custody and processing details. Stool samples are stored at −80°C in the Microbial Genomics and Metagenomics Laboratory at Cincinnati Children’s Hospital. Participants are also asked to participate in an optional brain donation program.

Genomics, transcriptomics, proteomics, metabolomics, epigenomics, and microbiomics will be processed from our biological samples. We will use validated methods for the analysis of the samples to ensure feasibility and reproducibility of the study in future independent cohorts. The specific methods will be selected at a later time; this will give us greater flexibility in the choice of assays as the analytic technologies become less expensive. Also, we may add other ‘–omics’ (e.g., lipidomics, etc.) in the future.

Clinical, Paraclinical, and Neuroimaging Assessments

Clinical Scales and Questionnaires

Motor and non-motor symptoms are assessed through the Movement Disorders Society Unified Parkinson’s Disease Rating Scale (MDS-UPDRS part II and III) (Goetz et al., 2008), the Tinetti Gait and Balance scale (Tinetti et al., 1986), the Non-motor Symptoms Scale (NMSS) (Chaudhuri et al., 2006), the Parkinson’s Disease Quality of Life Questionnaire (PDQ-8) (Jenkinson et al., 1997), the Epworth Sleepiness Scale (ESS) (Johns and Hocking, 1997), the Activities of Daily Living (ADL), the Instrumental ADL (iADL) (Katz, 1983), the Beck Depression Inventory scale (BDI) (Beck and Beamesderfer, 1974), Beck Anxiety Inventory scale (BAI) (Beck et al., 1988), the REM Sleep Behavior Disorder Screening Questionnaire (RBDSQ) (Stiasny-Kolster et al., 2007), and the Montreal Cognitive Assessment (MoCA) (Nasreddine et al., 2005). An extensive epidemiological, demographical, pharmacological, and lifestyle questionnaire, as well as a Food Frequency Questionnaire (FFQ),1 are also collected.

Gait and Postural Stability Outcome Measures Obtained Using Mobile Health Technologies

Gait and postural stability are measured in the following conditions (Axivity, Ltd., Newcastle upon Tyne, United Kingdom): (1) Two-minute Walk: Subjects are asked to walk a straight path for 2 min. Parameters include: stride length, gait speed, stride width, and stride asymmetry; (2) Instrumented Time Up and Go (iTUG): Subjects are instructed to sit comfortably in an armless chair. At the “go” signal, they rise from the chair without using support, walk 3 m, turn 180° and walk back; (3) Postural Sway: Subjects are asked to stand with their hands at their sides and feet together spaced by a wooden wedge on a firm surface; (4) 360° Turn: Subjects are instructed to turn in a complete circle (360°), first to the left, and then to the right. Other measures include: (1) Tapping test: Subjects are asked to tap on the smartphone screen for 30 s; (2) Rest and postural tremor tests: Subjects hold their arm out straight for 30 s, and subsequently rest their arms in the lap while counting down from 100; and (3) Voice and speech tests: Subjects are asked to say “aaaah” at a comfortable pitch and loudness, and subsequently recite a short, phonetically-balanced passage, into an Android-based smartphone microphone.

A 3-Tesla brain MRI will be obtained within 6 months from the baseline. A comprehensive protocol including 3D T1 fast spoiled gradient echo (FSPGR), 3D T2-weighted, 3D T2-FLAIR, susceptibility weighted imaging (SWI), resting state functional MRI (fMRI), diffusion tensor imaging (DTI), and 3D arterial spin labelling (ASL) will be performed. 3D T1 FSPGR sequence provides volumetric analysis of regional atrophy. T2 and FLAIR sequences will be analyzed for chronic small vessel disease including white matter disease, lacunar infarcts, dilated perivascular spaces. SWI will provide information on iron deposition in the deep nuclei and microbleeds. Resting state fMRI will be analyzed for changes in functional connectivity. DTI tractography analysis will provide information on white matter integrity.

At Home Sensor-Based Assessment

Participants are provided with smartwatches (Sony Corporation, Tokyo, Japan) for at-home 24-h continuous collection of sensor data such as accelerometry and wrist-based photoplethysmography, from which estimates of multiple behavioral parameters, including sleep behavior quality, heart rate variability and step count will be obtained.

Data Storage and Process

All biological samples are stored for future analyses in a dedicated Biobank at Cincinnati Children’s Hospital Medical Center (CCHMC) and Discover Together Biobank using established protocols, for processing, storage, and future analysis. The database was designed to account for the longitudinal study design, linkage to multi-omics measurements and formats, and capacity to store big data. The stored data are labeled according to processed or unprocessed data, methods, and type of omics data. All the samples are coded using an identifier reflecting sites and subject number. All the samples are preprocessed for background correction, quality control and standard deviation of the intensity ratios. Prior to conducting analyses, normalization using LOWESS or quantiles, scaling with baseline correction, outlier removal, and missing imputation for less than 20% missing data using K-nearest neighbor imputation will be performed. BioMart for database and Bioconductor for data processing and analyses will be used along with specific software required for sequence, network, reads, mining, and pathways will be utilized according to their specific purposes. We plan to create an online platform where de-identified and analyzed data can be shared. To protect confidentiality and prevent bias, all imaging data will be deidentified and transmitted with unique study identification numbers to the imaging core lab, utilizing a HIPAA complaint secure platform. Imaging readings will be recorded on electronic case report forms and integrated seamlessly with the clinical data.

Data Quality Management

A pre-analytical standard operating procedure (SOP) has been developed. The multiple steps included are aimed at minimizing biases at forming and analyzing substudy cohorts. The following SOP are highlighted: (a) subjects are selected only by neurologists; (b) controls are selected from the same population and time period than cases; (c) a substantially large sample size will permit estimating rare molecular subtypes; (d) pragmatic assessments to minimize dropouts and maximize adherence to protocol over a long observational period. Finally, our interdisciplinary team is meeting regularly to review the quality controls of data collection, SOP protocol adherence, data-gathering issues, and concerns related to ethics, data storage, data process, and management.

Data Analysis and Management


The main aim is to identify biologically unique biological subgroups with emphasis on those suitable for repurposing of already available therapies using proof-of-concept adaptive clinical trials.

Sample Size and Statistical Power

The sample size of this study was computed using several simulations under various conditions. We utilized the Qiu and Joe (2009) formula (10 × d × k) (Qiu and Joe, 2009) where d is the number of variables included for clustering while k is the number of clusters and formula (70 × d) (Dolnicar et al., 2014). Using this formula to estimate the moderate, adjusted Rand index values produces a sample size of 3500 with 50 biological markers. This sample size is powered for detecting at least 10 subtypes with 40 biological markers using the Qiu and Joe formula. Furthermore, a total of 800 healthy controls are required to form a comparative group based on 1:2 case-control design for detecting small Cohen effect sizes (D = 0.2) between groups with more than 90% power and 5% level of significance. The sample size suggested in this study is more than sufficient to detect small (odds ratio 1.2 or standardized mean difference 0.20) to moderate (OR = 1.5 or SMD = 0.50) expected associations between individual subtypes and clinical outcomes depending on the types of outcomes and subtypes with more than 80% power and 5% level of significance and covariates accounting for 10 to 50% of variance in a given outcome using logistic regression analysis. This sample size also ensures adequate power for detecting small to moderate Cohen’s effect sizes (SMD 0.20–0.50) using two-sided unpaired t-tests. The sample size estimation was also found to be sufficient using data-driven sample size driven algorithm (DSD) (Billoir et al., 2016) and sample size in high-dimensionality data settings using the MV power algorithm (Guo et al., 2010). We note here that these formulas depend upon assumptions (such as Gaussianity) which may not hold for these data and for the kinds of clustering analysis we plan to use in this study and can only be considered reasonable to justify the sample size. Although a sample size of 3500 patients and 800 healthy controls was estimated as sufficient, we plan to enroll 4000 cases and 1000 controls in order to account for potential dropouts. The sample size will most likely need to increase to identify heretofore unanticipated molecular subtypes.

Exploratory Data Analysis

All potential biomarkers will be compared between cases and controls using a bootstrap test to screen for significant biomarkers from each omics platform and thereafter we will apply Bayesian exponential family principal components analysis (BE-PCA) (Shakir et al., 2008), a generalization of principal component analysis (PCA), which is a widely used method of statistical analysis and simplification of data sets, to reduce the dimensionality of the multi-omic data (Wold et al., 1987). We avoid the use of simple PCA because some of the variables we measure in this project are likely to be non-Gaussian.

Data-Driven Causal Inference

The analytical approach of this study will be based on the latest techniques from statistics and data science (Little, 2019), revolving around causal modeling and inference of the interaction between all the variables captured in the study across genomics, transcriptomics, proteomics, metabolomics, epigenomics, microbiomics, and pharmacogenomics data (Figure 1). Starting with a simple causal model built using existing datasets, the model can be used for various purposes, including simulating randomized trials using causal inference, and acting as a guide to designing pragmatic trials to collect appropriate data to “fill in” missing information in the causal model. Results of these simulated trials will then further inform the modeling and statistical analysis choices, with the end goal of deriving a simple, mechanistic model that is both explanatory and predictive, which can be used to extract “subtypes” most likely to respond to therapies (Pearl, 2010).


Figure 1. Basic causal model of proposed relationships between measured variables in the CCBP cohort study. Arrows between variables (in circles) indicate the dominant direction of causal influence between them. In this study, machine learning is used to model predictive relationships, but these should also be causal, not merely associational, relationships. For example, in predicting phenotype (effect) from omics data (causes), confounders such as subject age influence both cause and effect variables, which makes it critical to take these into consideration when using predictive machine learning algorithms.

The justification for the use of these techniques is that they aim to minimize the misleading effects of reliance on speculative and unproven theories of disease, behavior and symptom mechanisms while avoiding the problems of purely data-driven modeling, which can be easily confounded by unmeasured variables, poor-quality data or mischaracterized measurement processes.

These advances in causal inferential methods rely on a synthesis of two analytical techniques (Little and Badawy, 2019):

(1) Data-driven approaches. These approaches often have high predictive accuracy, and can capture high-dimensional, non-Gaussian, non-linear relationships. Machine learning is one example. The primary drawback is their limited explanatory power and high sensitivity to irrelevant confounding effects, which inevitably creep into measurements.

(2) Causal modeling approaches. A set of probabilistic relationships is drawn up to describe the mechanistic processes explaining the data. Because these models traditionally require fully-specified probabilistic relationships between variables, they often do not make quantitatively accurate predictions, but they do allow realistic, causal interactions among biological, behavioral and symptom expression processes to be built in to the analysis. This causal structure is essential in this study given the sheer number of variables and the resulting complexity of interaction between them.

We propose to use a synthesis of these two approaches, which can be described as data-driven causal inference. This aims to exploit the advantages of the high predictive accuracy of data-driven approaches and the realism of causal modeling. It respects the causal structure of the real world captured by the measurements, and is verified against the high-dimensional, non-Gaussian measured data with non-linear interactions, promising to circumvent both the problems of erroneous clinico-pathological reasoning and prevent data analysis which is heavily biased by spurious correlations because its structure can disentangle confounding factors in the measured data, for example.

Technically, data-driven causal inference involves finding variables and their covariates (Figure 1), isolating the mechanism predicting these variables using causal bootstrapping (Little and Badawy, 2019) or other causal adjustment methods (Pearl, 2010), then using the data to fit a predictive model of that isolated mechanism. The isolated mechanisms can then be assembled into a full, predictive causal network. After examining the associations of identified biological subtypes with clinical characteristics and outcomes, the severity of subtypes, their motor and non-motor functionalities, and progression pattern will be determined by integrating data from biological interpretation of subtypes as well. Visual interpretations obtained using Bayesian exponential family PCA and other dimensionality reduction techniques and relationship with clinical neurodegenerative disease subtypes, will be summarized to generate a global view of each subtype. The main benefit of this causal-inference data driven model is not the validation in separate populations but the identification of suitable candidates, within the cohort for future repurposing therapy approaches.

Machine Learning-Based Subtyping and Integration

Subtyping Based on Individual Markers From Integrative Analysis

The analysis of the biological data should lead to clustering subjects with shared biomolecular alterations regardless of phenotype (Espay et al., 2017). In data-driven biological subtyping, the “truth” is unknown and the analysis hypothesis free. Clustering is a major method for disease subtyping based on high-dimensional omics data (Wang and Gu, 2016). We will apply clustering methods to identify subtypes in genomics, transcriptomics, proteomics, metabolomics, epigenomics, microbiomics, and pharmacogenomics. There are currently two main methods for the fusion clustering of multi-omics data [i.e., iCluster, similarity network fusion (SNF)] based on the sample similarity network. Studies have shown that SNF has better performance in disease subtyping than iCluster (e.g., cancer) (Wang et al., 2014; Wang and Gu, 2016). We will perform unsupervised clustering on the processed data by SNF and validate similarities and dissimilarities in identified subtypes using moCluster and pattern fusion analysis by adaptive alignment of multiple heterogeneous omics data. Because clustering analysis is an unsupervised learning method, the results cannot be tested by ground truth which usually indicates the accuracy of training set’s classification of supervised leaning techniques. We can also perform bioinformatics analysis, such as differential expression analysis and functional enrichment analysis, for different subtypes and compare the difference among them. Data-driven subtypes will be determined using various parameters described above. Deep phenotyping from clinical (e.g., development of clinical milestones such as falls, progression of motor and non-motor symptoms, etc.), paraclinical (e.g., mobile health technologies), and neuroimaging data (e.g., brain atrophy) will be used as outcome measures or dependent variables. The longitudinal design, with multiple follow-ups, will give us information about the casual role of potentially druggable biomarkers. The relationship between biomarkers and disease will require similar assessments in the control group.

Subtyping Based on Composite Markers From Integrative Analysis

The clustering of markers (joint expressions of important features) arising from different omics measurements may be useful in identifying unique subtypes of patients as opposed to using patterns of individual markers to form patient subtyping. This procedure typically involves a two-stage framework of clustering. The first stage of clustering groups the subset of variables into disjointed segments whereas the second stage creates subtyping of patients by exploring the patterns in the identified clusters of markers from the first stage. We will utilize unsupervised feature selection methods such as sparse partial least square (sPLS), sparse canonical correlation analysis (sCCA) (Witten and Tibshirani, 2009), and variable cluster analysis (VCLUS) in the first step followed by moCluster (Meng et al., 2016) and SNF in the second stage to determine subtypes.

Subtyping Based on Outlier and Non-Gaussian Markers

Heterogeneity may exit in the identified subtypes of patients. Generally, clustering approaches are conducted to determine subtypes and variable selection after removing outliers and non-Gaussian data. As opposed to removing outliers and non-Gaussian data, several unique subtypes and biological heterogeneity can be obtained by determining subtypes based on outlier markers. In this regard, two novel approaches can be adopted to identify outlier markers as well as non-normal markers. We will employ outlier profile and pathway analysis (OPPAR) using the modified cancer outlier profile analysis (mCOPA) (Wang et al., 2012). The mCOPA is used to identify markers that are outliers either up-regulated or down-regulated. We will also apply the maximum ordered subset t-statistics (MOST) (Karrila et al., 2011) method for identifying bimodal distributed markers. After selecting the appropriate set of non-normal and outlier markers, moCluster and SNF methods will be used to cluster patients into homogenous patterns of non-normal and outlier markers. These steps of identifying subtypes will be replicated for gene set enrichment analysis using OPPAR.

Subtyping Based on Dynamic Network Biomarkers

Individual sets of omics may have limitations, such as poor sample quality or data sparsity, network-based stratification can be used to overcome these limitations and identify unique patient subtypes. We will employ a network-based stratification approach for baseline omics data that determines patients with genes in similar network regions (Hofree et al., 2013). The dynamic network biomarkers (DNBs) method examines time-dependent alterations in biomarkers. We will select the cases-markers which are not statistically different at the baseline from controls and determine the longitudinal changes in the markers according to disease progression or treatment response. MoCluster and SNF will then be applied to determining subtypes based on the changes in non-significant markers.

Bioassay Development for Currently Available Therapies

First, genomics, transcriptomics, proteomics, metabolomics, epigenomics, and microbiomics data will serve to identify potentially altered molecular pathways for each global neurodegenerative subtype (Figure 2). Bioassay candidates will be selected depending on candidates identified by relevant pathway analyses. For example, from the genomics data, we will perform genome-wide association study (GWAS) analysis to obtain SNPs of each subtype and then identify the potential pathogenic genotype and pathways in which they are associated. Viable bioassay candidates will be selected, determined by the generation of high-throughput clinically relevant assays for the quantification of expression and/or biologic state of candidates.


Figure 2. Overview of the pipeline to ascertaining biologically suitable subpopulations for drug repurposing. GWAS, genome-wide association study; WGCNA, weighted correlation network analysis; DE, differential expression analysis; DMP, differential methylation probe; DMR, differential methylation region; DMB, differential methylation block; PCA, principal component analysis; PLS-DA, partial least squares discrimination analysis; OUTs, operational taxonomic unit.

Second, online databases, including OMIM and PubMed, will be searched for related mechanistic information. Specifically, we will collect information of the effects of gain of function (GOF) and loss of function (LOF) in human and/or mammalian models and remove targets that can significantly aggravate the corresponding phenotype. Targets will be obtained through candidate analysis above and candidate drugs with repurposing potential will be recognized for future proof-of-concept clinical trials from identified pathways/protein combinations and drug-related protein information (Table 2).


Table 2. Available putative disease-modifying drugs for bioassay development (and targeted repurposing) examined in phase III clinical trials in Parkinson’s disease.

We plan to work with industry partners to develop/utilize bioassays for the presumed mechanisms of actions for each of the candidate drugs. Given the phenotype-agnostic nature of this study, after the identification of bioassay-based abnormality suggesting vulnerability to a specific drug, a proof-of-concept clinical trial will be designed to match the drug with the bioassay-defined clinical cohort in order to evaluate for preliminary safety and efficacy of such to-be-repurposed intervention.

Reliability and Validation of Patient Subtypes

Various approaches will be used to assess replicability, naturalness, and validation of cluster subtypes. The reliability will be assessed by the adjusted Rand statistic and percentage agreement on cross-validated hold-out testing. The validation will be assessed by comparing the clusters across different clustering methods (SNF, moCluster) (sPLA and sCCA) and (OPPAR followed by SNF, moCluster) and the concordance index (c-statistic) by evaluating the predictive performance of each cluster on primary outcomes across different methods.

Anticipated Challenges

Accessibility of Target Tissue and Use of Systemic Surrogates

Given the impossibility of serial brain biopsies, we can only rely on extra-cerebral surrogates (e.g., peripheral blood, urine, stool) to assess phenomena associated with brain neurodegeneration (Lehmann-Werman et al., 2016). Nevertheless, selected biological alterations associated with central nervous system neurodegeneration can also be detected in other tissues (Kaushik and Cuervo, 2015; Lehmann-Werman et al., 2016); for instance, EVs will be used as a platform for “liquid biopsies.” EVs have been shown to transport this molecular cargo directly between neighboring cells, as well as to distant cells via blood and other fluids (van Niel et al., 2018). EVs bear both surface proteins and intracellular contents from their parent cells into peripheral fluids, which are then accessible without the invasiveness of tissue biopsy (El Andaloussi et al., 2013). Moreover, in the future, EVs may also serve as a delivery system for therapies given that, as native nanoparticles, they benefit from immune tolerance and the ability to cross biological barriers (van Niel et al., 2018; Wiklander et al., 2019).

Relevance of Biomarkers

Neurodegeneration starts years prior to symptom onset (Cacabelos, 2017). This creates difficulties in distinguishing between early biomarkers, related to causal disease mechanisms, and late biomarkers, possibly end results of other processes, themselves pathogenic, or resulting from response to various treatments (Espay et al., 2017). Moreover, early or late biomarkers may be transient or constant across neurodegenerative disorders, potentially underestimating or overestimating the importance of an early or late biomarker depending on the time of data acquisition. A population-based study design with control subjects, multiple visits, longitudinal assessments and next-generation statistical analysis may help mitigate these issues.

Development of Bioassays

Some of the known mechanisms of therapies with repurposing potential (Table 2) may not be relevant to disease pathogenesis in any subtype, even if bioassays can be developed to measure their range in a laboratory. Some bioassay candidates can be difficult to deploy or measure with existing technology in a manner that would make them clinically viable. Connecting specific biomarkers to disease stage/progression will be difficult given our study design. This concern will be ameliorated by using promising bioassays to select patients for future proof-of-concept drug studies. Such studies will contribute toward separating primary from secondary biologic mechanisms of each neurodegenerative subtype.

Uncertainty About Extent of Unknowns

While the data-driven design of this study favors the collection of data without a priori hypotheses for later analysis using discovery algorithms (Kim et al., 2016), a major challenge is to define which biologically promising targets may be more relevant than any of the currently known biomarkers. Also, some technologies may be insufficiently sensitive for potentially relevant biomarkers or result in false negative assays. As for the known variability of prior omics data, we expect that to be attenuated by the unbiased analysis, not anchored on diagnostic or phenotypic data. The creation of a robust biobank is designed to mitigate these difficulties by providing the opportunity to re-analyze samples and data in the future.

The “All of Us” Program

The “All of US” program is an important effort funded by the NIH starting in 2015, aiming to collect clinical, paraclinical, and biological data in a very large population, not preselected for the presence of neurodegenerative disorders (All of Us Research Program Investigators, 2019). The goal of the program is to enroll at least 1 million persons nationwide from 340 recruitment sites (All of Us Research Program Investigators, 2019). This effort represents a significant step forward in the understanding of human health and disease. However, the lack of focus on neurodegenerative disorders (or any other disorder) represents an important limitation from the standpoint of our research objectives.

Compared to the “All of US,” our study aims to merge an “inclusive” approach to all neurodegenerative disorders and utilizes standardized clinical questionnaires and scales, in-clinic and at-home wearable technologies, and more extensive biological sampling. Nevertheless, a future collaboration between these two approaches stands to accelerate the understanding of neurodegenerative disorders.


This phenotype-agnostic, population-based, bio-subtyping and bioassay development program will provide longitudinally-collected clinical and biological data to characterize patients affected by neurodegenerative diseases –not to understand diseases, but to understand how individuals are affected by them. The inclusivity and large number of deeply-phenotyped individuals (currently classified under a range of neurodegenerative disorders) and the causal model-driven nature of analyses, blinded to the clinical disease classification, are unique elements in the design of this study, expected to identify small but molecularly suitable subsets of subjects for embedded proof-of-concept adaptive clinical trials. Our goal is to identity the first molecular subset of individuals for whom an available therapy can be repurposed before the end of the 2020s. Despite many anticipated challenges, the ascertainment of biological subtypes will help to materialize the promise of precision medicine for patients affected by neurodegenerative disorders.

Author Contributions

AS organized, executed the research project, conceived and wrote the first draft of the manuscript. LM and MK organized, executed the research project and critically revised the manuscript. AKD and LL conceived the statistical methods and critically revised the manuscript. JV, APD, PL, MP, BW, EH, BS, EK, AV, LW, DBH, MR, CT, DWH, SE, KE, and RF critically revised the manuscript. ML organized the research project, conceived the statistical methods and critically revised the manuscript. AE organized, executed and supervised the research project, conceived and wrote the first draft of the manuscript. All authors contributed to the article and approved the submitted version.


The CCBP has received major funding through a grant from the Gardner Family Foundation.

Conflict of Interest

AKD is currently supported as a co-investigator by the NIH (1 R21 HL143030-01) and (R21 AI133207) grants. He is also currently serving as a statistician in CPRIT funded studies (PP200006, PP190058, PP180003, and PP170068). The author is also an Adjunct Associate Professor in the department of neurology and rehabilitation medicine, University of Cincinnati.

MK is an employee of the CONICET. He has received grant support from Ministry of Science and Technology of Argentina and Ministry of Health of Buenos Aires. Genetics and metabolism (Elsevier, Inc., New York, NY, United States), and PLOSone (Public Library of Science, San Francisco, CA, United States).

BS has been paid by Aligning Science Across Parkinson’s Workshop, Chan Zuckerberg Initiative’s Neurodegeneration Challenge Network Meeting, StemCellTalks Toronto, Partnerships in Clinical Trials Europe 2018, 2019 World Parkinson’s Congress, 2019 ADPD Congress – Roche sponsored talk, Alkahest, Biolegend, Lysosomal Therapeutics, Abbvie’s Sharing for Better Caring Symposium, University of Toronto Neuroscience Rounds, McGill (multiple classes), EPFL Open Science Initiative, University of Cincinnati Neuroscience Rounds, Tanenbaum Open Science Initiative, 2018 and 2019 Rallying to the Challenge at Grand Challenges in Parkinson’s Symposium, Parkinson’s Canada Research Symposium, European Parkinson’s Disease Association’s YOPD Symposium at the European Union Headquarters, 2018 Synuclein Meeting, 2018 Ontario Brain Institute Research Summit, The Buck Institute, 23andMe, System1 Biosciences, Duke-NUS, NIH Neuroscience Rounds, Aspen Biosciences, Biogen, Zambon, Lundbeck, Cerevel, Idorsia. He is a member of the patient advisory board for the Toronto Western Hospital’s Movement Disorder Clinic and a contributing editor to the Journal of Parkinson’s.

AV received funding from R01 NIH/NINDS NS103824-01, R01 NINDS NS100417, NIH/NINDS 1U01NS100699, R01 NIH/NINDS NS30678, Imaging Core Lab, ENDOLOW Trial, Cerenovus, Johnson and Johnson, Human centered design grant, and ACR Innovation Fund.

CT is an employee of the University of California – San Francisco and the San Francisco Veterans Affairs Health Care System. She receives grants from the Michael J. Fox Foundation, the Parkinson’s Foundation, the Department of Defense, BioElectron, Roche/Genentech, Biogen Idec and the National Institutes of Health, compensation for serving on Data Monitoring Committees from Voyager Therapeutics, Intec Pharma and Cadent Therapeutics and personal fees for consulting from Neurocrine Biosciences, Adamas Therapeutics, Gray Matter, Acorda, Acadia, Amneal and CNS Ratings.

SE is co-founder of and shareholder in evox Therapeutics that develops engineered exosome therapies to treat genetic diseases.

ML received grant support from the Michael J. Fox Foundation.

AE has received grant support from the NIH and the Michael J. Fox Foundation; personal compensation as a consultant/scientific advisory board member for Abbvie, Neuroderm, Neurocrine, Amneal, Adamas, Acadia, Acorda, InTrance, Sunovion, Lundbeck, and USWorldMeds; publishing royalties from Lippincott Williams and Wilkins, Cambridge University Press, and Springer; and honoraria from USWorldMeds, Acadia, and Sunovion. He serves as Associate Editor of the Journal of Clinical Movement Disorders and on the editorial boards of JAMA Neurology, the Journal of Parkinson’s Disease and Parkinsonism and Related Disorders.

The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The reviewer SS declared a past co-authorship with one of the authors, AE, to the handling Editor.


We would like to thank all participants involved in the CCBP biomarkers program. In particular, the Project Manager (Dawn Skirpan); the Clinical Research Manager (Erin Neefus); the Business Administrator (Douglas Wead); the Research Operations Manager (Kristy Espay); the Research Associates from the Department of Radiology at the University of Cincinnati (Vivek Khandwala and Brady Williamson); the Clinical Research Coordinators (Cynthia Spikes, Deepa Agrawal Bajaj, and Nathan Gregor); the Research Scholars (Kevin Duque Yupanqui, Maria Barbara Grimberg, and Hussein Elhusseini Abdelghany); the Clinical Fellows (Abhimanyu Mahajan, Miguel Situ, Jennifer Sharma, Nathaniel Wachter); the Data Manager team (Cyndie Wells and Rushi Goswami) from the Division of Biostatistics and Epidemiology at the Cincinnati Children’s Hospital Medical Center. We also thank the Biospecimen processing and storage services from the Discover Together Biobank at Cincinnati Children’s Research Foundation. Finally, we are grateful for all the participants that have contributed and will continue to participate in this research project.


  1. ^


All of Us Research Program Investigators (2019). The “All of Us” research program. N. Engl. J. Med. 381, 668–676. doi: 10.1056/NEJMsr1809937

PubMed Abstract | CrossRef Full Text | Google Scholar

Athauda, D., and Foltynie, T. (2016). The 1glucagon-like peptide 1 (GLP) receptor as a therapeutic target in Parkinson’s disease: mechanisms of action. Drug Discov. Today 21, 802–818. doi: 10.1016/j.drudis.2016.01.013

PubMed Abstract | CrossRef Full Text | Google Scholar

Baggio, L. L., and Drucker, D. J. (2007). Biology of Incretins: GLP-1 and GIP. Gastroenterology 132, 2131–2157. doi: 10.1053/j.gastro.2007.03.054

PubMed Abstract | CrossRef Full Text | Google Scholar

Beck, A. T., and Beamesderfer, A. (1974). Assessment of depression: the depression inventory. Psychol. Meas. Psychopharmacol. 7, 151–169. doi: 10.1159/000395074

PubMed Abstract | CrossRef Full Text | Google Scholar

Beck, A. T., Epstein, N., Brown, G., and Steer, R. A. (1988). An inventory for measuring clinical anxiety: psychometric properties. J. Consult. Clin. Psychol. 56, 893–897. doi: 10.1037/0022-006X.56.6.893

PubMed Abstract | CrossRef Full Text | Google Scholar

Billoir, E., Navratil, V., and Blaise, B. J. (2016). Sample size calculation in metabolic phenotyping studies. Brief. Bioinform. 16, 813–819. doi: 10.1093/bib/bbu052

PubMed Abstract | CrossRef Full Text | Google Scholar

Bloem, B. R., Marks, W. J. Jr., Silva de Lima, A. L., Kuijf, M. L., van Laar, T., Jacobs, B. P. F., et al. (2019). The Personalized Parkinson Project: examining disease progression through broad biomarkers in early Parkinson’s disease. BMC Neurol. 19:160. doi: 10.1186/s12883-019-1394-3

PubMed Abstract | CrossRef Full Text | Google Scholar

Boyle, P. A., Yu, L., Wilson, R. S., Leurgans, S. E., Schneider, J. A., and Bennett, D. A. (2018). Person-specific contribution of neuropathologies to cognitive loss in old age. Ann. Neurol. 83, 74–83. doi: 10.1002/ana.25123

PubMed Abstract | CrossRef Full Text | Google Scholar

Cacabelos, R. (2017). Parkinson’s disease: from pathogenesis to pharmacogenomics. Int. J. Mol. Sci. 18:551. doi: 10.3390/ijms18030551

PubMed Abstract | CrossRef Full Text | Google Scholar

Chaudhuri, K. R., Martinez-Martin, P., Schapira, A. H., Stocchi, F., Sethi, K., and Odin, P. (2006). International multicenter pilot study of the first comprehensive self-completed nonmotor symptoms questionnaire for Parkinson’s disease: the NMSQuest study. Mov. Disord. 21, 916–923. doi: 10.1002/mds.20844

PubMed Abstract | CrossRef Full Text | Google Scholar

Comitato, R., Ambra, R., and Virgili, F. (2017). Tocotrienols: a family of molecules with specific biological activities. Antioxidants 6:93. doi: 10.3390/antiox6040093

PubMed Abstract | CrossRef Full Text | Google Scholar

Dennys, C. N., Armstrong, J., Levy, M., Byun, Y. J., Ramdial, K. R., Bott, M., et al. (2015). Chronic inhibitory effect of riluzole on trophic factor production. Exp. Neurol. 271, 301–307. doi: 10.1016/j.expneurol.2015.05.016

PubMed Abstract | CrossRef Full Text | Google Scholar

Dolnicar, S., Grün, B., Leisch, F., and Schmidt, K. (2014). Required Sample Sizes for Data-Driven Market Segmentation Analyses in Tourism. J. Travel Res. 53, 296–306. doi: 10.1177/0047287513496475

CrossRef Full Text | Google Scholar

El Andaloussi, S., Mäger, I., Breakefield, X. O., and Wood, M. J. (2013). Extracellular vesicles: biology and emerging therapeutic opportunities. Nature reviews. Drug Discov. 12, 347–357. doi: 10.1038/nrd3978

PubMed Abstract | CrossRef Full Text | Google Scholar

Espay, A. J., Kalia, L. V., Gan-Or, Z., Williams-Gray, C. H., Bedard, P. L., and Rowe, S. M. (2020). Disease modification and biomarker development in Parkinson disease: revision or reconstruction? Neurology 94, 481–494. doi: 10.1212/WNL.0000000000009107

PubMed Abstract | CrossRef Full Text | Google Scholar

Espay, A. J., Schwarzschild, M. A., Tanner, C. M., Fernandez, H. H., Simon, D. K., Leverenz, J. B., et al. (2017). Biomarker-driven phenotyping in Parkinson’s disease: a translational missing link in disease-modifying clinical trials. Mov. Disord. 32, 319–324. doi: 10.1002/mds.26913

PubMed Abstract | CrossRef Full Text | Google Scholar

Espay, A. J., Vizcarra, J. A., Marsili, L., Lang, A. E., Simon, D. K., Merola, A., et al. (2019). Revisiting protein aggregation as pathogenic in sporadic Parkinson and Alzheimer diseases. Neurology 92, 329–337. doi: 10.1212/WNL.0000000000006926

PubMed Abstract | CrossRef Full Text | Google Scholar

Fan, L., Feng, Y., Chen, G. C., Qin, L. Q., Fu, C. L., and Chen, L. H. (2017). Effects of coenzyme Q10 supplementation on inflammatory markers: a systematic review and meta-analysis of randomized controlled trials. Pharmacol. Res. 119, 128–136. doi: 10.1016/j.phrs.2017.01.032

PubMed Abstract | CrossRef Full Text | Google Scholar

Ferreira, G. C., and McKenna, M. C. (2017). l-Carnitine and Acetyl-l-carnitine Roles and Neuroprotection in Developing Brain. Neurochem. Res. 42, 1661–1675. doi: 10.1007/s11064-017-2288-7

PubMed Abstract | CrossRef Full Text | Google Scholar

Fetoni, A. R., Piacentini, R., Fiorita, A., Paludetti, G., and Troiani, D. (2009). Water-soluble Coenzyme Q10formulation (Q-ter) promotes outer hair cell survival in a guinea pig model of noise induced hearing loss (NIHL). Brain Res. 1257, 108–116. doi: 10.1016/j.brainres.2008.12.027

PubMed Abstract | CrossRef Full Text | Google Scholar

Galimberti, D., and Scarpini, E. (2017). Pioglitazone for the treatment of Alzheimer’s disease. Expert Opin. Investig. Drugs 26, 97–101. doi: 10.1080/13543784.2017.1265504

PubMed Abstract | CrossRef Full Text | Google Scholar

Ghosh, A., Chandran, K., Kalivendi, S. V., Joseph, J., Antholine, W. E., and Hillard, C. J. (2010). Neuroprotection by a mitochondria-targeted drug in a Parkinson’s disease model. Free Radic Biol Med. 49, 1674–1684. doi: 10.1016/j.freeradbiomed.2010.08.028

PubMed Abstract | CrossRef Full Text | Google Scholar

Goetz, C. G., Tilley, B. C., Shaftman, S. R., Stebbins, G. T., Fahn, S., Martinez-Martin, P., et al. (2008). Movement Disorder Society-Sponsored Revision of the Unified Parkinson’s Disease Rating Scale (MDS-UPDRS): scale presentation and clinimetric testing results. Mov. Disord. 23, 2129–2170. doi: 10.1002/mds.22340

PubMed Abstract | CrossRef Full Text | Google Scholar

Guo, Y., Graber, A., McBurney, R. N., and Balasubramanian, R. (2010). Sample size and statistical power considerations in high-dimensionality data settings: a comparative study of classification algorithms. BMC Bioinformatics 11:447. doi: 10.1186/1471-2105-11-447

PubMed Abstract | CrossRef Full Text | Google Scholar

Harikishore, A., and Sup Yoon, H. (2015). Immunophilins: structures, mechanisms and ligands. Curr. Mol. Pharmacol. 9, 37–47. doi: 10.2174/1874467208666150519113427

PubMed Abstract | CrossRef Full Text | Google Scholar

Head, E., Corrada, M. M., Kahle-Wrobleski, K., Kim, R. C., Sarsoza, F., Goodus, M., et al. (2009). Synaptic proteins, neuropathology and cognitive status in the oldest-old. Neurobiol. Aging 30, 1125–1134. doi: 10.1016/j.neurobiolaging.2007.10.001

PubMed Abstract | CrossRef Full Text | Google Scholar

Hider, R. C., and Hoffbrand, A. V. (2018). The role of deferiprone in iron chelation. N. Engl. J. Med. 379, 2140–2150. doi: 10.1056/NEJMra1800219

PubMed Abstract | CrossRef Full Text | Google Scholar

Hipp, G., Vaillant, M., Diederich, N. J., Roomp, K., Satagopam, V. P., Banda, P., et al. (2018). The luxembourg Parkinson’ s study: a comprehensive approach for stratification and early diagnosis. Front. Aging Neurosci. 10:326. doi: 10.3389/fnagi.2018.00326

PubMed Abstract | CrossRef Full Text | Google Scholar

Hofree, M., Shen, J. P., Carter, H., Gross, A., and Ideker, T. (2013). Network-based stratification of tumor mutations. Nat. Methods 10, 1108–1115. doi: 10.1038/nmeth.2651

PubMed Abstract | CrossRef Full Text | Google Scholar

Inaba-Hasegawa, K., Shamoto-Nagai, M., Maruyama, W., and Naoi, M. (2017). Type B and A monoamine oxidase and their inhibitors regulate the gene expression of Bcl-2 and neurotrophic factors in human glioblastoma U118MG cells: different signal pathways for neuroprotection by selegiline and rasagiline. J. Neural. Transm. 124, 1055–1066. doi: 10.1007/s00702-017-1740-9

PubMed Abstract | CrossRef Full Text | Google Scholar

Irwin, D. J., Grossman, M., Weintraub, D., Hurtig, H. I., Duda, J. E., Xie, S. X., et al. (2017). Neuropathological and genetic correlates of survival and dementia onset in synucleinopathies: a retrospective analysis. Lancet Neurol. 16, 55–65. doi: 10.1016/S1474-4422(16)30291-5

CrossRef Full Text | Google Scholar

Jenkinson, C., Fitzpatrick, R., Peto, V., Greenhall, R., and Hyman, N. (1997). The PDQ-8: development and validation of a short-form Parkinson’s disease questionnaire. Psychol. Heal. 19, 308–312. doi: 10.1080/08870449708406741

CrossRef Full Text | Google Scholar

Johns, M. W., and Hocking, B. (1997). Daytime sleepiness and sleep habits of Australian workers. Sleep 20, 844–849. doi: 10.1177/1352458512451947

PubMed Abstract | CrossRef Full Text | Google Scholar

Karanth, S., Nelson, P. T., Katsumata, Y., Kryscio, R. J., Schmitt, F. A., Fardo, D. W., et al. (2020). Prevalence and clinical phenotype of quadruple misfolded proteins in older adults. JAMA Neurol. e201741. doi: 10.1001/jamaneurol.2020.1741 [Epub ahead of print].

CrossRef Full Text | PubMed Abstract | Google Scholar

Karrila, S., Lee, J. H., and Tucker-Kellogg, G. (2011). A comparison of methods for data-driven cancer outlier discovery, and an application scheme to semisupervised predictive biomarker discovery. Cancer Inform. 10, 109–120. doi: 10.4137/CIN.S6868

PubMed Abstract | CrossRef Full Text | Google Scholar

Katz, S. (1983). Assessing self-maintenance: activities of daily living, mobility, and instrumental activities of daily living. J. Am. Geriatr. Soc. 31, 721–727. doi: 10.1111/j.1532-5415.1983.tb03391.x

PubMed Abstract | CrossRef Full Text | Google Scholar

Kaushik, S., and Cuervo, A. M. (2015). Proteostasis and aging. Nat. Med. 21, 1406–1415. doi: 10.1038/nm.4001

PubMed Abstract | CrossRef Full Text | Google Scholar

Kehagia, A. A., Barker, R. A., and Robbins, T. W. (2010). Neuropsychological and clinical heterogeneity of cognitive impairment and dementia in patients with Parkinson’s disease. Lancet Neurol. 9, 1200–1213. doi: 10.1016/S1474-4422(10)70212-X

CrossRef Full Text | Google Scholar

Kenneth, M., Danna, J., Shirley, L., Andrew, S., Caroline, T., Tanya, S., Chris, C., et al. (2011). The Parkinson Progression Marker Initiative (PPMI). Prog. Neurobiol. 95, 629–635. doi: 10.1016/j.pneurobio.2011.09.005

PubMed Abstract | CrossRef Full Text | Google Scholar

Kim, R. S., Goossens, N., and Hoshida, Y. (2016). Use of big data in drug development for precision medicine. Expert. Rev. Precis. Med. Drug Dev. 1, 245–253. doi: 10.1080/23808993.2016.1174062

PubMed Abstract | CrossRef Full Text | Google Scholar

Kumar, A., Singh, U. K., Kini, S. G., Garg, V., Agrawal, S., Tomar, P. K., et al. (2015). JNK pathway signaling: a novel and smarter therapeutic targets for various biological diseases. Future Med. Chem. 7, 2065–2086. doi: 10.4155/fmc.15.132

PubMed Abstract | CrossRef Full Text | Google Scholar

Labie, C., Canolle, B., Chatelin, S., Lafon, C., and Fournier, J. (2006). Effects of paliroden (SR57667B) and xaliproden on adult brain neurogenesis. Curr. Alzheimer Res. 3, 35–36. doi: 10.2174/156720506775697070

PubMed Abstract | CrossRef Full Text | Google Scholar

Lang, A. E., Gill, S., Patel, N. K., Lozano, A., Nutt, J. G., Penn, R., et al. (2006). Randomized controlled trial of intraputamenal glial cell line-derived neurotrophic factor infusion in Parkinson disease. Ann. Neurol. 59, 459–466. doi: 10.1002/ana.20737

PubMed Abstract | CrossRef Full Text | Google Scholar

Lehmann-Werman, R., Neiman, D., Zemmour, H., Moss, J., Magenheim, J., Vaknin-Dembinsky, A., et al. (2016). Identification of tissue-specific cell death using methylation patterns of circulating DNA. Proc. Natl. Acad. Sci. U.S.A. 113, E1826L–E1834. doi: 10.1073/pnas.1519286113

PubMed Abstract | CrossRef Full Text | Google Scholar

Li, Y., Chigurupati, S., Holloway, H. W., Mughal, M., Tweedie, D., Bruestle, D. A., et al. (2012). Exendin-4 ameliorates motor neuron degeneration in cellular and animal models of amyotrophic lateral sclerosis. PLoS One 7:e32008. doi: 10.1371/journal.pone.0032008

PubMed Abstract | CrossRef Full Text | Google Scholar

Little, M. A. (2019). Machine Learning for Signal Processing. Oxford: Oxford University Press.

Google Scholar

Little, M. A., and Badawy, R. K. (2019). Causal bootstrapping. ArXiv Available online at: (accessed March 14, 2020).

Google Scholar

Meng, C., Helm, D., Frejno, M., and Kuster, B. (2016). moCluster: identifying joint patterns across multiple omics datasets. J. Proteome Res. 15, 755–765. doi: 10.1021/acs.jproteome.5b00824

PubMed Abstract | CrossRef Full Text | Google Scholar

Morales, P. E., Torres, G., Sotomayor-Flores, C., Peña-Oyarzún, D., Rivera-Mejías, P., Paredes, F., et al. (2014). GLP-1 promotes mitochondrial metabolism in vascular smooth muscle cells by enhancing endoplasmic reticulum-mitochondria coupling. Biochem. Biophys. Res. Commun. 446, 410–416. doi: 10.1016/j.bbrc.2014.03.004

PubMed Abstract | CrossRef Full Text | Google Scholar

Nasreddine, Z. S., Phillips, N. A., Bédirian, V., Charbonneau, S., Whitehead, V., Collin, I., et al. (2005). The Montreal Cognitive Assessment. MoCA: a brief screening tool for mild cognitive impairment. J. Am. Geriatr. Soc. 53, 695–699. doi: 10.1111/j.1532-5415.2005.53221.x

PubMed Abstract | CrossRef Full Text | Google Scholar

Nassar, N. N., Al-Shorbagy, M. Y., Arab, H. H., and Abdallah, D. M. (2015). Saxagliptin: a novel antiparkinsonian approach. Neuropharmacology 89, 308–317. doi: 10.1016/j.neuropharm.2014.10.007

PubMed Abstract | CrossRef Full Text | Google Scholar

Olanow, C. W., Bartus, R. T., Baumann, T. L., Factor, S., Boulis, N., Stacy, M., et al. (2015). Gene delivery of neurturin to putamen and substantia nigra in Parkinson disease: a double-blind, randomized, controlled trial. Ann. Neurol. 78, 248–257. doi: 10.1002/ana.24436

PubMed Abstract | CrossRef Full Text | Google Scholar

Olanow, C. W., Schapira, A. H., LeWitt, P. A., Kieburtz, K., Sauer, D., Olivieri, G., et al. (2006). TCH346 as a neuroprotective drug in Parkinson’s disease: a double-blind, randomised, controlled trial. Lancet Neurol. 5, 1013–1020. doi: 10.1016/S1474-4422(06)70602-0

CrossRef Full Text | Google Scholar

Orsucci, D., Mancuso, M., Ienco, E. C., LoGerfo, A., and Siciliano, G. (2011). Targeting Mitochondrial Dysfunction and Neurodegeneration by Means of Coenzyme Q10 and its Analogues. Curr. Med. Chem. 18, 4053–4064. doi: 10.2174/092986711796957257

PubMed Abstract | CrossRef Full Text | Google Scholar

Ortner, N. J., Bock, G., Dougalis, A., Kharitonova, M., Duda, J., Hess, S., et al. (2017). Lower affinity of isradipine for L-type Ca2++ channels during Substantia Nigra dopamine neuron-like activity: implications for neuroprotection in Parkinson’s disease. J. Neurosci. 37, 6761–6777. doi: 10.1523/JNEUROSCI.2946-16.2017

PubMed Abstract | CrossRef Full Text | Google Scholar

Parkinson Study Group PRECEPT Investigators (2007). Mixed lineage kinase inhibitor CEP-1347 fails to delay disability in early Parkinson disease. Neurology 69, 1480–1490. doi: 10.1212/01.wnl.0000277648.63931.c0

PubMed Abstract | CrossRef Full Text | Google Scholar

Pearl, J. (2010). Causality: Models, Reasoning, and Inference. Cambridge. New York, NY: Cambridge University Press.

Google Scholar

Qiu, W., and Joe, H. (2009). clusterGeneration: Random Cluster Generation (with Specified Degree of Separation), R package version 1.2.7.

Google Scholar

Quigley, E. M. M. (2017). Microbiota-brain-gut axis and neurodegenerative diseases. Curr. Neurol. Neurosci. Rep. 17:94. doi: 10.1007/s11910-017-0802-6

PubMed Abstract | CrossRef Full Text | Google Scholar

Rizvi, S., Raza, S. T., Ahmed, F., Ahmad, A., Abbas, S., and Mahdi, F. (2014). The role of Vitamin E in human health and some diseases. Sultan Qaboos Univ. Med. J. 14, e157–e165.

Google Scholar

Scarmeas, N., Albert, M., Brandt, J., Blacker, D., Hadjigeorgiou, G., Papadimitriou, A., et al. (2005). Motor signs predict poor outcomes in Alzheimer disease. Neurology 64, 1696–1703. doi: 10.1212/01.WNL.0000162054.15428.E9

PubMed Abstract | CrossRef Full Text | Google Scholar

Scarmeas, N., Hadjigeorgiou, G. M., Papadimitriou, A., Dubois, B., Sarazin, M., Brandt, J., et al. (2004). Motor signs during the course of Alzheimer disease. Neurology 63, 975–982. doi: 10.1136/practneurol-2014-000849

PubMed Abstract | CrossRef Full Text | Google Scholar

Shakir, M., Zoubin, G., and Katherine, A. (2008). Heller, Bayesian Exponential Family PCA. Adv. Neural Inform. Process. Syst. 21, 8–22.

Google Scholar

Stiasny-Kolster, K., Mayer, G., Schäfer, S., Möller, J. C., Heinzel-Gutenbrunner, M., and Oertel, W. H. (2007). The REM sleep behavior disorder screening questionnaire - A new diagnostic instrument. Mov. Disord. 22, 2386–2393. doi: 10.1002/mds.21740

PubMed Abstract | CrossRef Full Text | Google Scholar

Swanson, C. R., Joers, V., Bondarenko, V., Brunner, K., Simmons, H. A., Ziegler, T. E., et al. (2011). The PPAR-γ agonist pioglitazone modulates inflammation and induces neuroprotection in parkinsonian monkeys. J. Neuroinflam. 8:91. doi: 10.1186/1742-2094-8-91

PubMed Abstract | CrossRef Full Text | Google Scholar

Szökö, É, Tábi, T., Riederer, P., Vécsei, L., and Magyar, K. (2018). Pharmacological aspects of the neuroprotective effects of irreversible MAO-B inhibitors, selegiline and rasagiline, in Parkinson’s disease. J. Neural. Transm. 125, 1735–1749. doi: 10.1007/s00702-018-1853-9

PubMed Abstract | CrossRef Full Text | Google Scholar

Thomas, M. L. W. (2004). Minocycline: neuroprotective mechanisms in Parkinson’s disease. Curr. Pharm. Des. 10, 679–686. doi: 10.2174/1381612043453162

PubMed Abstract | CrossRef Full Text | Google Scholar

Tinetti, M. E., Williams, T. F., and Mayewski, R. (1986). Fall risk index for elderly patients based on number of chronic disabilities. Am. J. Med. 80, 429–434. doi: 10.1016/0002-9343(86)90717-5

CrossRef Full Text | Google Scholar

Uchoa, M. F., de Souza, L. F., Dos Santos, D. B., Peres, T. V., Mello, D. F., Leal, R. B., et al. (2016). Modulation of Brain Glutathione Reductase and Peroxiredoxin 2 by α-Tocopheryl Phosphate. Cell Mol. Neurobiol. 36, 1015–1022. doi: 10.1007/s10571-015-0298-z

PubMed Abstract | CrossRef Full Text | Google Scholar

van Niel, G., D’Angelo, G., and Raposo, G. (2018). Shedding light on the cell biology of extracellular vesicles. Nat. Rev. Mol. Cell Biol. 4, 213–228. doi: 10.1038/nrm.2017.125

PubMed Abstract | CrossRef Full Text | Google Scholar

Villapol, S. (2018). Roles of Peroxisome Proliferator-Activated Receptor Gamma on Brain and Peripheral Inflammation. Cell Mol. Neurobiol. 38, 121–132. doi: 10.1007/s10571-017-0554-5

PubMed Abstract | CrossRef Full Text | Google Scholar

Visanji, N. P., Orsi, A., Johnston, T. H., Howson, P. A., Dixon, K., Callizot, N., et al. (2008). PYM50028, a novel, orally active, nonpeptide neurotrophic factor inducer, prevents and reverses neuronal damage induced by MPP+ in mesencephalic neurons and by MPTP in a mouse model of Parkinson’s disease. FASEB J. 22, 2488–2497. doi: 10.1096/fj.07-095398

PubMed Abstract | CrossRef Full Text | Google Scholar

Wallace, L., Theou, O., Godin, J., Andrew, M. K., Bennett, D. A., and Rockwood, K. (2019). Investigation of frailty as a moderator of the relationship between neuropathology and dementia in Alzheimer’s disease: a cross-sectional analysis of data from the Rush Memory and Aging Project. Lancet Neurol. 18, 177–184. doi: 10.1016/S1474-4422(18)30371

CrossRef Full Text | Google Scholar

Wang, B., Mezlini, A. M., Demir, F., Fiume, M., Tu, Z., Brudno, M., et al. (2014). Similarity network fusion for aggregating data types on a genomic scale. Nat. Methods 11, 333–337. doi: 10.1038/nmeth.2810

PubMed Abstract | CrossRef Full Text | Google Scholar

Wang, C., Taciroglu, A., Maetschke, S. R., Nelson, C. C., Ragan, M. A., and Davis, M. J. (2012). mCOPA: analysis of heterogeneous features in cancer expression data. J. Clin. Bioinform. 2:22. doi: 10.1186/2043-9113-2-22

PubMed Abstract | CrossRef Full Text | Google Scholar

Wang, D., and Gu, J. (2016). Integrative clustering methods of multi-omics data for molecule-based cancer classifications. Quant. Biol. 4, 58–67. doi: 10.1007/s40484-016-0063-4

CrossRef Full Text | Google Scholar

Wiklander, O., Bostancioglu, R. B., Welsh, J. A., Zickler, A. M., Murke, F., Corso, G., et al. (2018). Systematic methodological evaluation of a multiplex bead-based flow cytometry assay for detection of extracellular vesicle surface signatures. Front. Immunol. 2018:1326. doi: 10.3389/fimmu.2018.01326

PubMed Abstract | CrossRef Full Text | Google Scholar

Wiklander, O., Brennan, M. Á, Lötvall, J., Breakefield, X. O., and El Andaloussi, S. (2019). Advances in therapeutic applications of extracellular vesicles. Sci. Transl. Med. 11:eaav8521. doi: 10.1126/scitranslmed.aav8521

PubMed Abstract | CrossRef Full Text | Google Scholar

Witten, D. M., and Tibshirani, R. J. (2009). Extensions of sparse canonical correlation analysis with applications to genomic data. Stat. Appl. Genet. Mol. Biol. 8, 1–27. doi: 10.2202/1544-6115.1470

PubMed Abstract | CrossRef Full Text | Google Scholar

Wold, S., Esbensen, K., and Geladi, P. (1987). Principal component analysis. Chemom. Intell. Lab. Syst. 2, 37–52.

Google Scholar

Wood, D. E., and Salzberg, S. L. (2014). Kraken: ultrafast metagenomic sequence classification using exact alignments. Genome Biol. 15:R46. doi: 10.1186/gb-2014-15-3-r46

PubMed Abstract | CrossRef Full Text | Google Scholar

Wu, Y., Kazumura, K., Maruyama, W., Osawa, T., and Naoi, M. (2015). Rasagiline and selegiline suppress calcium efflux from mitochondria by PK11195-induced opening of mitochondrial permeability transition pore: a novel anti-apoptotic function for neuroprotection. J. Neural. Transm. 122, 1399–1407. doi: 10.1007/s00702-015-1398-0

PubMed Abstract | CrossRef Full Text | Google Scholar

Xia, P., Pan, Y., Zhang, F., Wang, N., Wang, E., Guo, Q., et al. (2018). Pioglitazone Confers Neuroprotection Against Ischemia-Induced Pyroptosis due to its Inhibitory Effects on HMGB-1/RAGE and Rac1/ROS Pathway by Activating PPAR. Cell Physiol. Biochem. 45, 2351–2368. doi: 10.1159/000488183

PubMed Abstract | CrossRef Full Text | Google Scholar

Zanelli, S. A., Solenski, N. J., Rosenthal, R. E., and Fiskum, G. (2005). Mechanisms of ischemic neuroprotection by acetyl-L-carnitine. Ann. N. Y. Acad. Sci. 1053, 153–161. doi: 10.1196/annals.1344.013

PubMed Abstract | CrossRef Full Text | Google Scholar

Zemke, D., and Majid, A. (2004). The potential of minocycline for neuroprotection in human neurologic disease. Clin. Neuropharmacol. 27, 293–298. doi: 10.1097/01.wnf.0000150867.98887.3e

CrossRef Full Text | Google Scholar

Zhou, J., Wang, H., Shen, R., Fang, J., Yang, Y., Dai, W., et al. (2018). Mitochondrial-targeted antioxidant mitoq provides neuroprotection and reduces neuronal apoptosis in experimental traumatic brain injury possibly via the Nrf2-ARE pathway. Am. J. Transl. Res. 10, 1887–1899.

Google Scholar

Keywords: biomarkers, Parkinson’s disease, Alzheimer’s disease, neurodegeneration, cohort, drug repurposing, bioassay

Citation: Sturchio A, Marsili L, Vizcarra JA, Dwivedi AK, Kauffman MA, Duker AP, Lu P, Pauciulo MW, Wissel BD, Hill EJ, Stecher B, Keeling EG, Vagal AS, Wang L, Haslam DB, Robson MJ, Tanner CM, Hagey DW, El Andaloussi S, Ezzat K, Fleming RMT, Lu LJ, Little MA and Espay AJ (2020) Phenotype-Agnostic Molecular Subtyping of Neurodegenerative Disorders: The Cincinnati Cohort Biomarker Program (CCBP). Front. Aging Neurosci. 12:553635. doi: 10.3389/fnagi.2020.553635

Received: 19 April 2020; Accepted: 10 September 2020;
Published: 08 October 2020.

Edited by:

Fernanda Laezza, University of Texas Medical Branch at Galveston, United States

Reviewed by:

Jessica Mandrioli, University Hospital of Modena, Italy
Stefano L. Sensi, Università degli Studi G. d’Annunzio Chieti-Pescara, Italy

Copyright © 2020 Sturchio, Marsili, Vizcarra, Dwivedi, Kauffman, Duker, Lu, Pauciulo, Wissel, Hill, Stecher, Keeling, Vagal, Wang, Haslam, Robson, Tanner, Hagey, El Andaloussi, Ezzat, Fleming, Lu, Little and Espay. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Alberto J. Espay,; espayaj@UCMAIL.UC.EDU