Tracking a Fast-Moving Disease: Longitudinal Markers, Monitoring, and Clinical Trial Endpoints in ALS

Amyotrophic lateral sclerosis (ALS) encompasses a heterogeneous group of phenotypes with different progression rates, varying degrees of extra-motor involvement and divergent progression patterns. The natural history of ALS is increasingly evaluated by large, multi-timepoint longitudinal studies, many of which now incorporate presymptomatic and post-mortem assessments. These studies have the potential not only to characterize patterns of anatomical propagation and molecular mechanisms of disease spread, but also to identify pragmatic monitoring markers. Sensitive markers of progressive neurodegenerative change are indispensable for clinical trials and individualized patient care. Biofluid markers, neuroimaging indices, electrophysiological markers, rating scales, questionnaires, and other disease-specific instruments have divergent sensitivity profiles. The discussion of candidate monitoring markers in ALS has dual academic and clinical relevance, and is particularly timely given the increasing number of pharmacological trials. The objective of this paper is to provide a comprehensive and critical review of longitudinal studies in ALS, focusing on the sensitivity profile of established and emerging monitoring markers.


INTRODUCTION
Amyotrophic lateral sclerosis (ALS) is a clinically, genetically, and pathologically heterogeneous neurodegenerative condition (1–3). Clinical heterogeneity in ALS is multidimensional owing to variations in upper motor neuron (UMN) and lower motor neuron (LMN) involvement, extra-motor symptoms, age of onset, survival, and progression rates. Disease heterogeneity hinders biomarker development (3,4), which in turn impedes the reliable assessment of candidate drugs in clinical trials (1). Current clinical trials recruit relatively heterogeneous cohorts of symptomatic patients, despite the notion that considerable pathological changes can already be detected at the time of diagnosis (5,6). The considerable variability in progression rates in ALS is another confounding factor in clinical trial designs (1, 7–10). Imaging and electrophysiological markers have been repeatedly proposed as candidate monitoring markers (11,12), but it is increasingly clear that a panel of several "wet" and "dry" biomarkers may be required to capture subtle changes over short periods of time (13,14). The objective of this paper is the comprehensive and critical review of longitudinal studies in ALS, focusing on study designs, statistical power, clinical correlations, the sensitivity profile of proposed monitoring markers and their applicability to clinical trials.

separately: "staging," "monitoring," "outcomes," "clinical," "clinical trials," "electrophysiology," "neurophysiology," "electromyography," "transcranial magnetic stimulation," "motor unit number estimation," "motor unit number index," "positron emission tomography," "single photon emission computed tomography," "magnetic resonance imaging," "neuroimaging," "imaging," "blood," "urine," "cerebrospinal fluid," "saliva," and "muscle." A supplementary search combined the core search terms with the following keywords: "presymptomatic," "asymptomatic," and "post-mortem."
Longitudinal studies investigating imaging, neurophysiological, clinical, or biofluid biomarkers in ALS were eligible for inclusion. Animal studies, review papers, opinion pieces, editorials, case reports, and case series were excluded. Only articles written in English and published between January 1980 and August 2018 were reviewed. Based on the above criteria, a total of 118 original research papers were selected and reviewed in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) recommendations.

Neuroimaging
The sample size characteristics, study design features, and follow-up intervals of longitudinal neuroimaging, neurophysiology, and clinical studies are summarized in Table 1. Whilst most longitudinal imaging studies in ALS evaluate cerebral alterations (10), a number of promising spinal studies have now also been published. Spinal imaging has gradually overcome the technical challenges of physiological motion, small cross-sectional dimensions and susceptibility gradients (19, 110–118). The majority of longitudinal studies in ALS are single-center studies, eliminating the need for cross-platform MR sequence harmonization and inter-rater reliability tests. Given the low incidence of certain phenotypes such as primary lateral sclerosis (PLS), progressive muscular atrophy (PMA), and spinal and bulbar muscular atrophy (SBMA), however, multisite collaboration is often necessary (119). The infrastructure, funding and governance of such multicenter collaborations are now established via international consortia like the Neuroimaging Society in Amyotrophic Lateral Sclerosis (NISALS) or the Northeast ALS Consortium (NEALS) (16,23,120,121).
Neuron; ROA2, Heterogeneous nuclear ribonucleoproteins A2/B1; RSA, relative surface area; rsfMRI, resting state functional magnetic resonance imaging; SCA, spinocerebellar ataxia; SBMA, spinal and bulbar muscular atrophy; SEIQOL-DW, Schedule for the Evaluation of the Individual Quality of Life-Direct Weighting; SF-36, 36-Item short form health survey; SMA, spinal muscular atrophy; SMUAP, single motor unit action potential; SNIP, sniff nasal inspiratory pressure; SOD1, superoxide dismutase 1; SOP, Standard operating procedure; SPECT, single photon emission computed tomography; SPO2, peripheral capillary oxygen saturation; SVC, slow vital capacity; TA, tibialis anterior; TDP-43, TAR DNA-binding protein 43; TiM, Telehealth in Motor Neuron disease; TMS, transcranial magnetic stimulation; TNF, tumor necrosis factor; TUG, timed up and go test; Tw Pdi, twitch trans-diaphragmatic pressure; TWBC, total white blood cell count; UMN, upper motor neuron; VC, vital capacity; WALS, Western ALS Consortium; WVFI, Written Verbal Fluency Index.
Frontiers in Neurology | www.frontiersin.org
The need to include disease controls in addition to healthy controls to describe ALS-specific changes is increasingly recognized (30, 43, 44). With few exceptions (122–124), most ALS imaging studies use 3 Tesla platforms, and 7 Tesla systems are more commonly used in post-mortem studies (125,126). Disease progression has been detected across a range of MR imaging metrics including structural (22, 26), diffusion (16,18), functional (28, 40), and spectroscopy (41, 42) measures. As the majority of studies have a two-timepoint design, it is often unclear if specific imaging metrics show linear or exponential changes. The few existing multi-timepoint studies suggest that pathological change is not linear (10).
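The ambiguity of two-timepoint designs can be illustrated with a minimal sketch. The values below are entirely hypothetical and are not taken from any of the cited studies; the function names are illustrative only. With two observations, a straight line and an exponential curve both pass exactly through the data, so the residual error cannot discriminate between the two models. With additional timepoints the fits diverge and can be compared.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def sse_linear(ts, ys):
    """Sum of squared errors of a linear model fitted to (ts, ys)."""
    a, b = fit_line(ts, ys)
    return sum((y - (a + b * t)) ** 2 for t, y in zip(ts, ys))

def sse_exponential(ts, ys):
    """Sum of squared errors of y = A*exp(k*t), fitted by regressing
    log(y) on t (requires y > 0); errors are measured on the original scale."""
    log_a, k = fit_line(ts, [math.log(y) for y in ys])
    return sum((y - math.exp(log_a + k * t)) ** 2 for t, y in zip(ts, ys))

# Hypothetical imaging metric declining exponentially over one year.
months = [0, 3, 6, 9, 12]
metric = [100.0 * math.exp(-0.08 * t) for t in months]

# Two-timepoint design: baseline and final visit only.
two_point = ([months[0], months[-1]], [metric[0], metric[-1]])

print("two timepoints :", sse_linear(*two_point), sse_exponential(*two_point))
print("five timepoints:", sse_linear(months, metric), sse_exponential(months, metric))
```

In this toy example both models fit the two-timepoint data with essentially zero error, whereas only the five-timepoint series exposes the misfit of the linear model.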
The revised ALS functional rating scale (ALSFRS-r) is the most commonly reported clinical measure (16, 18–20), with only a few imaging studies reporting associations with staging (15) or neuropsychological performance (15,24).

Neurophysiology
Most longitudinal neurophysiology studies are single-center studies, reducing the risk of inter-rater and inter-center variability (127). As presented in Table 1, follow-up intervals range between 7 days (65) and 3 years (66), and up to 7 follow-up timepoints have been included in some studies (57,60). Surprisingly few studies include disease controls such as peripheral neuropathy (60)

Clinical Biomarkers and Instruments
Robust clinical longitudinal studies in ALS have up to 6 follow-up timepoints (88,89,91), the interval between the assessments can be as short as 3 months (95), and the sample size can be as large as several thousand (70, 93) (Table 1). Few multi-timepoint studies include disease controls such as motor neuropathies (91), alternative neuromuscular diseases (78), or neurodegenerative conditions (83). Large, multi-timepoint longitudinal studies invariably suffer from considerable attrition rates, but these are rarely explicitly reported in the manuscript abstracts (10). Detailed genotyping is only available in a minority of longitudinal studies (15,77,79,94). The most widely utilized rating scale in longitudinal studies is the ALSFRS-r (70,71,128), which provides a composite score of bulbar, limb and respiratory dysfunction, and is invariably evaluated in clinical trials (72,105). Quality of life (QoL) in ALS is increasingly evaluated by disease-specific instruments such as the 40-item ALS assessment questionnaire (ALSAQ-40) or the revised ALS-specific Quality of Life questionnaire (ALSSQoL-R) (129–131). A number of symptom-specific instruments are also commonly used, such as the Center for Neurologic Study-Bulbar Function Scale (CNS-BFS), a 21-item self-report scale of bulbar function, and the Center for Neurologic Study-Lability Scale (CNS-LS), a 7-item self-report scale of pseudobulbar affect (PBA) (132). Tapping rates, composite reflex scores, the Penn UMN Score (133), and the Modified Ashworth scale (MAS) are often used as proxies of UMN degeneration (132).
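The composite nature of the ALSFRS-r can be sketched as follows. The sketch assumes the conventional 12-item questionnaire with each item scored 0 to 4 (total 0 to 48), and it assumes a grouping of items 1 to 3 as bulbar, 4 to 9 as limb and 10 to 12 as respiratory; this grouping, the function names and the example scores are illustrative assumptions and are not specified in the text above.

```python
from typing import Dict, Sequence

# Assumed item grouping (0-based indices into the 12-item questionnaire).
DOMAINS = {
    "bulbar": range(0, 3),        # items 1-3
    "limb": range(3, 9),          # items 4-9
    "respiratory": range(9, 12),  # items 10-12
}

def alsfrs_r_summary(items: Sequence[int]) -> Dict[str, int]:
    """Return domain sub-scores and the composite total for one visit."""
    if len(items) != 12:
        raise ValueError("expected 12 item scores")
    if any(not 0 <= score <= 4 for score in items):
        raise ValueError("each item must be scored 0-4")
    summary = {name: sum(items[i] for i in idx) for name, idx in DOMAINS.items()}
    summary["total"] = sum(items)
    return summary

def monthly_decline(total_baseline: int, total_followup: int, months: float) -> float:
    """Average loss of ALSFRS-r points per month between two visits."""
    return (total_baseline - total_followup) / months

# Hypothetical patient at a single visit.
example = alsfrs_r_summary([4, 4, 3, 3, 2, 3, 3, 2, 1, 4, 4, 4])
print(example)
# Hypothetical follow-up six months later with a total of 31.
print(monthly_decline(example["total"], 31, 6))
```

Reporting the sub-scores alongside the total makes it visible when change in one domain drives the composite score.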
In clinical trials, muscle strength is often estimated by handheld dynamometry (HHD) (134), manual muscle testing (MMT) (105), and scoring systems such as the Medical Research Council (MRC) Scale for muscle strength (135); some studies also report limb circumference (136). Respiratory function in ALS is typically monitored by sniff nasal inspiratory pressure (SNIP), slow vital capacity (SVC), or forced vital capacity (FVC), in addition to measures such as early morning arterial blood gas (ABG) and overnight pulse-oximetry (137,138). Measures of typing ability (91) (143). In contrast to the relentlessly progressive motor deficits of ALS, the trajectory of cognitive and behavioral deficits is less clear due to considerable individual variations, genotype-associated profiles (144,145), differences in assessment strategies and practice effects (146). Several longitudinal neuropsychology studies do not detect progression (77,147,148), progressive behavioral impairment has been noted in the absence of cognitive change (149), and some studies report improved performance as a result of practice effects (77).

Wet Biomarkers
The findings, study design characteristics, and follow-up intervals of longitudinal biofluid studies are summarized in Table 2. Phosphorylated neurofilament heavy chain (pNFH), neurofilament light chain (NF-L), progranulin (PGRN), cytokines, TAR DNA-binding protein 43 (TDP-43), cystatin C, creatinine, micro-RNAs (miRNAs), chitotriosidase-1 (CHIT1), chitinase-3-like protein 1 (CHI3L1), and chitinase-3-like protein 2 (CHI3L2) have been evaluated in both research studies (152,153,157,158,162,164,168,171) and clinical trials (150,156,157,160,161). Markers of iron metabolism and ferroptosis are relatively recent domains of ALS biomarker research (172,173). Most biofluid studies are either serum (150,157) or CSF studies (152,167), but urine (155) and skeletal muscle-based (153) studies have now also been published. Quantitative enzyme-linked immunosorbent assay (ELISA) is the most commonly used antibody-based technique (13,174), which can be performed with one antibody (indirect ELISA) or with two antibodies (sandwich ELISA). Increased CSF (13) and serum (175) pNFH detected by ELISA is thought to be a sensitive marker of axonal degeneration in ALS (152,171,176,177). The specificity of this marker, however, may be inadequate to reliably differentiate ALS from other neurodegenerative conditions (13,176). Other antibody-based techniques such as Western blot (171) and electrochemiluminescence (ECL) (153,168) may improve detection sensitivity and reliability (13). Panels of multiple proteins can be evaluated by multiplex immunoassays such as planar or microbead assays (13). Mass spectrometry-based methods using chromatin-immunoprecipitation-based surfaces, two-dimensional gel electrophoresis or high-resolution mass spectrometry have identified cystatin-C and transthyretin as candidate biomarkers (178–180). The longest wet biomarker study followed patients for 4 years (164).
The majority of studies have at least 2 follow-up timepoints (155,162,170) and one study included 13 follow-up timepoints (156,159). Large multi-center trials include as many as 1,000 participants (156). One of the most striking shortcomings of existing longitudinal studies is that very few included disease controls such as Parkinson's disease cohorts, patients with multifocal motor neuropathy with conduction block, Kennedy's disease, chronic inflammatory demyelinating polyneuropathy (CIDP), cervical or lumbar radiculopathy, Charcot-Marie-Tooth disease (CMT), benign fasciculation, and cramp syndrome (152,159,162). Another limitation of many longitudinal studies is the lack of comprehensive genotyping (12), as very few studies report comprehensive screening for ALS-associated mutations (153,159,169,171).
Exhaustive clinical profiling, including medications (152,164), neuropsychological assessments (171), and quality of life indices, is rarely reported in longitudinal studies. The majority of studies limit their clinical descriptions to ALSFRS-r, FVC, MRC, and Ashworth scores (153,161,162). Serum and plasma biomarkers such as creatinine (150,156), pNFH (158,159), and micro-RNAs (157), CSF biomarkers such as CHI3L1 (152), tau (160,161), and cystatin-C (162), and urinary (155) and skeletal muscle (153) biomarkers are some of the promising tools for detecting disease progression. While no progressive changes have been detected in NFL levels, NFL is likely to be useful as a diagnostic biomarker (168,171).

Studies of Asymptomatic Mutation Carriers
Current clinical trials only recruit symptomatic cases despite accruing evidence that ALS has a long presymptomatic phase (5). Imaging studies of asymptomatic mutation carriers have consistently confirmed disease-specific cerebral and spinal cord changes prior to symptom onset (181–184), indicating that this disease phase may represent a crucial window for therapeutic or neuroprotective intervention. The majority of presymptomatic studies assess a single timepoint, as opposed to the longitudinal tracking of asymptomatic carriers of ALS-causing mutations (15). While the overwhelming majority of presymptomatic studies focus on C9orf72 hexanucleotide repeat expansion carriers (183, 185–187), no prognostic markers have been validated to predict whether individual carriers will develop ALS or FTD. Compared to imaging studies, strikingly few presymptomatic neurophysiology studies have been undertaken (66). Studies of asymptomatic ALS-causing mutation carriers have enormous potential for academic research and may pave the way for pharmaceutical trials in the asymptomatic phase (5,181).

DISCUSSION
Clinical trials currently evaluate the efficacy of candidate drugs using the revised ALS functional rating scale (ALSFRS-r), muscle strength assessment tools such as manual muscle testing (MMT), respiratory function indices such as forced vital capacity (FVC), slow vital capacity (SVC) and sniff nasal inspiratory pressure (SNIP), neurophysiological measures, and survival (102,116,120,188,189). These measures, however, primarily reflect late-stage functional impairment and are not indicative of early-stage pathology. Brain and spinal cord imaging measures have been evaluated as early-stage biomarkers with both diagnostic and monitoring potential (116,120,190). The core neuroimaging signature of ALS, irrespective of the disease stage, includes corticospinal tract (191,192), corpus callosum (193) and motor cortex degeneration (194). Atrophy in frontotemporal regions has been primarily associated with neuropsychological deficits (195–197) and linked to hexanucleotide repeats in C9orf72 (145,198). Longitudinal imaging studies are superior to cross-sectional studies as they readily detect dynamic structural and functional changes and may elucidate compensatory processes (10,14,23,28,40,120,199). The emergence of multi-timepoint study designs (14,20) enables the characterization of anatomical propagation patterns (200) and provides invaluable temporal insights into the disease trajectory of late-stage ALS. Interscan intervals as short as 3 months can detect longitudinal changes (14,18,120). Many longitudinal studies make use of multiple magnetic resonance (MR) metrics, which is particularly useful in establishing an optimal panel of monitoring markers (120). Several longitudinal studies have indicated that white matter degeneration can be detected relatively early in the course of ALS with restricted further progression over time, whereas gray matter pathology shows relentless progression in the symptomatic phase of the disease (4,14,120).
In addition to structural imaging studies, connectivity-based, metabolic, peripheral nerve, and whole-body muscle imaging have contributed to our understanding of longitudinal changes (20, 201–203). Needle electromyography and nerve conduction studies play an important clinical role in ruling out alternative conditions and confirming a suspected diagnosis of ALS. Despite variations in local protocols, neurophysiological tests are recognized as objective, reliable and cost-effective tests of neuromuscular dysfunction, and have also been repeatedly proposed as longitudinal markers (55,204). The compound muscle action potential (CMAP) is generated by depolarization of muscle fibers through the stimulation of a single nerve, and amplitude reductions are interpreted as loss of motor axons (205,206). While CMAP measurements capture longitudinal decline, they are confounded by variations in temperature, limb positioning and electrode placement (56,207). CMAP-derived measures such as motor unit number estimation (MUNE) and the motor unit number index (MUNIX) are now extensively utilized to characterize progressive changes in ALS. MUNE estimates motor neuron numbers and may detect the rate of motor neuron loss, making it a more reliable method of appraising disease progression than CMAP (208,209). However, its early-phase sensitivity has been questioned, as its use is limited to distal muscles, and the technique requires considerable training, especially for inter-rater and multi-site comparisons (205,210). Transcranial magnetic stimulation (TMS) allows the characterization of upper motor neuron dysfunction, and may be particularly useful in detecting progressive changes (57,205).
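The relationship between CMAP and MUNE can be made concrete with a short sketch. It assumes the simplest incremental-style formulation, in which the estimate is the maximal CMAP amplitude divided by the mean single motor unit action potential (SMUAP) amplitude. This formula is not given in the text above, several MUNE variants exist, and the amplitudes used here are hypothetical.

```python
from typing import Sequence

def mune(cmap_amplitude_uv: float, smuap_amplitudes_uv: Sequence[float]) -> float:
    """Estimate the number of motor units as maximal CMAP / mean SMUAP.

    Both inputs are amplitudes in microvolts (assumed units for this sketch).
    """
    if not smuap_amplitudes_uv:
        raise ValueError("at least one SMUAP amplitude is required")
    mean_smuap = sum(smuap_amplitudes_uv) / len(smuap_amplitudes_uv)
    if mean_smuap <= 0:
        raise ValueError("mean SMUAP amplitude must be positive")
    return cmap_amplitude_uv / mean_smuap

# Hypothetical baseline: 8 mV CMAP and five sampled SMUAPs averaging 80 uV.
baseline = mune(8000.0, [60, 80, 100, 70, 90])

# Hypothetical follow-up: the CMAP is unchanged, but the sampled SMUAPs
# are twice as large, so the estimated number of motor units halves.
followup = mune(8000.0, [120, 160, 200, 140, 180])

print(baseline, followup)
```

Under these assumptions, the estimate falls at follow-up even though the CMAP amplitude itself has not changed, which is one way a motor unit estimate can carry information that CMAP alone does not.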
Functional rating scales are often the monitoring instruments of choice in clinical trials (55), as they are easy to administer, cost-effective to utilize and have acceptable inter- and intra-rater reliability profiles (7). The most widely used rating scale in clinical longitudinal studies is the ALSFRS-r. Despite its ease of administration, it has considerable limitations, as it may be disproportionately influenced by LMN dysfunction, does not account for laterality or asymmetry of symptoms, omits cognitive impairment, and may be affected by medications (14,128,188,211).

Prediction Analyses
Age at symptom onset (212), BMI (139), bulbar involvement (213), cognitive impairment (214), C9orf72 genotype status (144), respiratory insufficiency (215), "definite ALS" by the El Escorial criteria (216), and functional disability (217) are the most commonly cited determinants of poor prognosis in ALS. SNIP (218) and less commonly used measures such as twitch trans-diaphragmatic pressure (Tw Pdi) (219) and maximal static expiratory mouth pressure (MEP) were shown to be good predictors of ventilator-free survival (219). A combined panel of several clinical, wet, and dry biomarkers is likely to offer the most accurate prognostic information (115,120,216,217,220). While cerebral (217,221,222) and spinal (115) imaging measures have been repeatedly linked to survival outcomes, these have not been utilized in a clinical setting. Neurophysiological variables, such as phrenic nerve stimulation outcomes (223), and biofluid markers, such as pNFH and NFL (165, 168, 224–226), are also thought to be accurate predictors.

Patient Stratification
Attempts to enroll patients in the early stages of the disease are hampered by the universally long diagnostic delay in ALS (227). Patient stratification in trials is typically based on site of onset (228), instead of other variables which have an established prognostic impact (138,229). Admixed patient cohorts within a trial may hamper the ability to detect how different phenotypes and genotypes respond to a candidate drug (230–232). The stratification of heterogeneous cohorts is now aided by the development of validated staging systems, such as the King's (233), Milano-Torino (MITOS) (234) or the Fine 'til 9 (FT9) (235) staging systems. The King's Staging system is based on the number of body regions affected, and the presence of nutritional or respiratory failure (233). The MITOS staging system is based on the ALSFRS-r, and is particularly sensitive to changes in later stages of the disease (236,237). However, none of these staging systems account for cognitive or behavioral changes (236). Pathological staging systems suggest a four-stage model of ALS based on anatomical patterns of pTDP-43 load (238,239). This system has now been validated by in vivo neuroimaging studies (240) and signals that accurate pathological staging and patient stratification may be possible based on neuroimaging (199,240).
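The logic of the King's staging rule described above can be sketched in a few lines. This is a simplified illustration and not a validated implementation: the stage labels "4A" and "4B", the limit of three body regions, and the precedence of respiratory over nutritional failure when both are present are assumptions based on common descriptions of the system and are not stated in the text above. The function name is hypothetical.

```python
def kings_stage(regions_involved: int,
                nutritional_failure: bool = False,
                respiratory_failure: bool = False) -> str:
    """Return a simplified King's stage label.

    regions_involved: number of affected body regions (assumed 1-3).
    nutritional_failure / respiratory_failure: presence of either defines
    stage 4 regardless of the number of regions (assumed labels 4A / 4B).
    """
    if respiratory_failure:
        return "4B"  # assumed to take precedence when both are present
    if nutritional_failure:
        return "4A"
    if not 1 <= regions_involved <= 3:
        raise ValueError("regions_involved must be between 1 and 3")
    return str(regions_involved)

# Hypothetical patients
print(kings_stage(2))                            # two regions, no failure
print(kings_stage(3, nutritional_failure=True))  # nutritional failure
print(kings_stage(1, True, True))                # both failures present
```

Because the rule relies only on motor regions and on nutritional or respiratory failure, it has no input for cognitive or behavioral status, which mirrors the limitation noted above.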

International Consortia
Only a few ALS centers maintain dedicated biobanking facilities to store and process molecular markers in human biofluid locally. Similarly, relatively few centers are in a position to generate a sufficient number of MRI and neurophysiology data sets of rare phenotypes to make meaningful inferences in a single-center setting. Brain and tissue banks are also challenging to establish, maintain and fund, despite their invaluable contribution to ALS research (241–243).
Biospecimen samples are also often collected during clinical trials and discarded after negative outcomes, despite their enormous potential for biomarker discovery (172). One of the most important achievements of biomarker development efforts is the establishment of national and international research consortia such as Association pour la recherche sur la SLA (ARSLA), Neuroimaging Society in ALS (NISALS), Research Motor Neuron (RMN), Canadian ALS Neuroimaging Consortium (CALSNIC), EU Joint Programme for Neurodegenerative Disease Research (JPND), and European multidisciplinary ALS network identification to cure motor neurone degeneration (EUROMOTOR), which maintain vital biobanking facilities, registries, and data repositories for multicenter data interpretation (121,244). Clinical trial networks are also increasingly recognized as valuable platforms for multisite data collection and interpretation as they operate with carefully standardized protocols. Consortia such as the European Registry of ALS (EURALS) Consortium, the Western ALS (WALS) Consortium and the Northeast ALS (NEALS) Consortium are other examples (245). NEALS is one of the largest consortia with over 100 member sites from the US, Canada, Mexico, Italy, Lebanon and Australia (246). EURALS coordinates research studies and clinical trials relying on population-based European registries and includes centers from Scotland, England, Netherlands, Spain, Ireland, Serbia, Italy, France, and Germany (241,247,248). ALS research consortia promote patient-oriented research, maintain biofluid, imaging and DNA banks, and have the potential to translate scientific advances into pragmatic clinical interventions.

Telehealth
Novel trends in longitudinal data collection include telemedicine-based technologies, wearable sensors and mobile phone applications (230). The continuous collection of data via telephone or telemedicine applications such as the Telehealth in Motor Neuron disease (TiM) system circumvents the inconvenience of patients and caregivers traveling long distances for research appointments (249). Once local data-protection and governance guidelines are complied with, information uploaded from these systems can be made available to healthcare professionals of multidisciplinary teams in real time (249). The feasibility of telehealth for ALS patients via live videoconferencing has also been evaluated (250) and is considered a particularly promising clinical and research platform (249,250). A number of cognitive-behavioral screening tools have also been adapted for phone administration (251), including modified versions of the ALS Cognitive Behavior Screen (ALS-CBS), the Controlled Oral Word Association Test (COWAT), and the Center for Neurologic Study-Lability Scale (CNS-LS), which were found to be statistically equivalent to face-to-face assessments (251). Performance on other tests, however, such as the telephone versions of the ALS-Frontal Behavioral Inventory (ALS-FBI) caregiver interview and the Written Verbal Fluency Index (WVFI), was not equivalent to clinic-based assessments (251). The continued development of telephone and internet-enabled devices is likely to provide further insights into longitudinal physical, cognitive and behavioral changes (251).

CONCLUSIONS
While clinical indicators of disease progression remain indispensable, neuroimaging, neurophysiology, and biofluid measures are particularly promising, objective, quantitative biomarker candidates. The validation of combined "wet" and "dry" biomarker panels will not only enable the detection of subtle progressive changes in ALS, but also allow precision stratification of heterogeneous patient cohorts in clinical trials and improve existing prediction algorithms.

AUTHOR CONTRIBUTIONS
The manuscript was drafted by RC. EF, SL, OH and PB contributed to the conceptualization, editing, and revision of this paper.