The ARCA Registry: A Collaborative Global Platform for Advancing Trial Readiness in Autosomal Recessive Cerebellar Ataxias

Autosomal recessive cerebellar ataxias (ARCAs) form an ultrarare yet expanding group of neurodegenerative multisystemic diseases affecting the cerebellum and other neurological or non-neurological systems. With the advent of targeted therapies for ARCAs, disease registries have become a precious source of real-world quantitative and qualitative data complementing knowledge from preclinical studies and clinical trials. Here, we review the ARCA Registry, a global collaborative multicenter platform (>15 countries, >30 sites) with the overarching goal to advance trial readiness in ARCAs. It presents a good clinical practice (GCP)- and general data protection regulation (GDPR)-compliant professional-reported registry for multicenter web-based capture of cross-center standardized longitudinal data. Modular electronic case report forms (eCRFs) with core, extended, and optional datasets allow data capture tailored to the participating site's variable interests and resources. The eCRFs cover all key data elements required by regulatory authorities [European Medicines Agency (EMA)] and the European Rare Disease (ERD) platform. They capture genotype, phenotype, and progression and include demographic data, biomarkers, comorbidity, medication, magnetic resonance imaging (MRI), and longitudinal clinician- or patient-reported ratings of ataxia severity, non-ataxia features, disease stage, activities of daily living, and (mental) health status. Moreover, they are aligned to major autosomal-dominant spinocerebellar ataxia (SCA) and sporadic ataxia (SPORTAX) registries in the field, thus allowing for joint and comparative analyses not only across ARCAs but also with SCAs and sporadic ataxias. The registry is at the core of a systematic multi-component ARCA database cluster with a linked biobank and an evolving study database for digital outcome measures. 
Currently, the registry contains more than 800 patients with almost 1,500 visits representing all ages and disease stages; 65% of patients with established genetic diagnoses capture all the main ARCA genes, and 35% with unsolved diagnoses are targets for advanced next-generation sequencing. The ARCA Registry serves as the backbone of many major European and transatlantic consortia, such as PREPARE, PROSPAX, and the Ataxia Global Initiative, with additional data input from SPORTAX. It has thus become the largest global trial-readiness registry in the ARCA field.

Autosomal recessive cerebellar ataxias (ARCAs) are a heterogeneous group of ultrarare multisystemic neurodegenerative diseases affecting the cerebellum and/or its afferent tracts, often accompanied by damage to other neurological (e.g., corticospinal tract, basal ganglia, vestibular system, and peripheral nerves) or non-neurological systems (e.g., muscle, heart, and pancreas) (1,2). The number of ARCA genes is continuously expanding, now extending to far more than 100 genes, and the first ARCAs are coming within reach of targeted treatment options (2).
Disease registries have been important for identification, characterization, and aggregation of rare neurological diseases. However, the real-world quantitative and qualitative evidence from registries and from registry-based natural history and outcome measure studies has now also become a precious source for planning treatment trials and modeling trial designs and endpoints, thereby complementing the knowledge available from preclinical studies and clinical trials (3). The ARCA Registry was launched in 2013 in order to apply this concept to the field of ARCAs, and it remains the only multicenter registry fully dedicated to ARCAs and early-onset ataxias (EOAs), which are known to be enriched for, but not exclusive to, ARCAs (1,2). The overarching goal of the ARCA Registry is to become a key facilitator enabling trial readiness by
• providing an easily accessible, web-based, good clinical practice (GCP)-conforming, and general data protection regulation (GDPR)-compliant multicenter multi-trial registry infrastructure platform as a backbone for global trial-readiness efforts in ARCAs;
• building cohorts of sufficient size for trial-readiness studies and upcoming treatment trials through aggregating ARCA patients in an accessible, standardized, multicenter fashion around the world;
• characterizing the phenotypic spectra for ARCAs, which will inform treatment trial design and especially outcome selection for future treatment trials;
• collecting real-world natural history data for ARCAs acquired during daily clinical life across a large range of centers across the world, thereby informing design, planning, and modeling of treatment trials; and
• providing a continuous database backbone for trial-readiness ataxia consortia around the world, e.g., the German DZNE ARCA-EOA network (4), the PREPARE consortium (5), PROSPAX (6), and ARCA GLOBAL (7).
In this overview, we will describe the main methodological features and assets of the ARCA Registry, with examples of how it is already being utilized to improve trial readiness in the field of ARCAs, including its current use by multiple research networks. It will also illustrate the registry's potential for expansion to other partners worldwide to promote trial readiness for ARCAs.

The ARCA Registry is built on a web-based registry platform that is used in a variety of national and international medical research consortia (Figure 1). The web-based implementation allows direct access by registered clinicians and study teams from any computer worldwide, as required for easy access in a global multicenter setting. The fact that it uses the same technical registry platform (WebSpirit) as one of the largest autosomal dominant ataxia registries, namely, the spinocerebellar ataxia (SCA)/ESMI registry (8,9) and SCA Global (10), as well as the large sporadic ataxia registry (SPORTAX) (11,12) and the Hereditary Spastic Paraplegia (HSP) Registry, allows for cross talk and joint analysis not only across the manifold ARCAs captured in the ARCA Registry itself but also with SCAs, sporadic ataxias, and even HSPs. This is further facilitated by aligning all key electronic case report forms (eCRFs) between these major ataxia registries. The registry platform is GCP-compliant: an audit trail is maintained to track changes to recorded data, a detailed rights and role management system limits access to entered data for each individual system user, and quality assurance is supported by an integrated online-monitoring system.
Moreover, it is fully compliant with the European Union GDPR based on the following features: use of unique pseudonyms generated with a secure one-way hash function to restrict the use of personally identifiable data to local sites; separation of processing activities through assignment of user roles (e.g., data entry, monitoring, and data management) and restriction of access to data; recording and transfer of pseudonymized data only; all access to data through encrypted connections; and servers located within the EU. Participating sites maintain access to their data entered in the ARCA Registry, with the possibility to easily export and systematically analyze locally aggregated datasets. Access to full multisite datasets can be requested for specific projects via a standardized project template and is granted to the project submitter after evaluation of the request.

The physician-reported multidomain datasets in the ARCA Registry are the core of a larger systematic multi-component ARCA database cluster (Figure 1). For longitudinal collection of biomaterials, the ARCA Registry is linked to an ARCA biomaterial database built on REDCap. To facilitate whole-exome and whole-genome sequencing in all patients with unsolved ARCA, the ARCA Registry is moreover linked to next-generation sequencing (NGS) data on the genomics research platform GENESIS (13,14). GENESIS is a user-friendly collaborative cloud-based analysis and matchmaking platform that encompasses the largest ataxia NGS dataset collection worldwide (>2,000 ataxia NGS datasets), aggregated via the PREPARE consortium (PREPARE-GENESIS) (see below). While the ARCA Registry and the GENESIS platform are two distinct databases, subjects from the registry are linked to the GENESIS platform via an ID generated by the ARCA biomaterial database.
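The pseudonymization principle described above can be illustrated with a short sketch. The source states only that pseudonyms are generated with a secure one-way hash function; the use of HMAC-SHA256, the identifying fields, the normalization, the site secret, and the function name `make_pseudonym` are all assumptions made for illustration and are not a description of the registry's actual procedure.

```python
import hashlib
import hmac
from typing import Sequence


def make_pseudonym(identifying_fields: Sequence[str],
                   site_secret: bytes,
                   length: int = 16) -> str:
    """Derive a stable pseudonym from identifying data (hypothetical scheme).

    A keyed one-way hash (HMAC-SHA256) is assumed here; the identifying
    fields never leave the local site, only the pseudonym is transferred.
    """
    # Normalize so that spacing or capitalization does not change the result.
    normalized = "|".join(field.strip().lower() for field in identifying_fields)
    digest = hmac.new(site_secret, normalized.encode("utf-8"),
                      hashlib.sha256).hexdigest()
    # Truncate the hex digest to a manageable pseudonym length.
    return digest[:length].upper()


# Hypothetical secret and patient data, for illustration only.
secret = b"hypothetical-site-secret"
p1 = make_pseudonym(("Doe", "Jane", "1980-01-31"), secret)
p2 = make_pseudonym((" doe ", "JANE", "1980-01-31"), secret)
p3 = make_pseudonym(("Doe", "John", "1980-01-31"), secret)
```

In this sketch, the same person always maps to the same pseudonym (`p1 == p2`), while a different person maps to a different one (`p3`). A keyed hash is chosen because an unkeyed hash of low-entropy identifiers could be reversed by enumeration; whether the registry uses a key is not stated in the source.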
Ongoing developments of this multi-component ARCA database cluster will include an ARCA multi-study database as a repository for features of digital outcomes such as magnetic resonance imaging (MRI), digital-motor sensors (APDM, Q-Motor), and optical coherence tomography and for patient-driven entry of patient-reported outcome measures (PROMs).

Figure 1. Graphical user interface of the web-based ARCA Registry, with display of a representative electronic case report form (eCRF) and embedding in the larger database infrastructure. The front-end of the software and the main core of eCRFs are shared with other major ataxia and rare disease registries (e.g., the HSP registry used by the TreatHSP network), making the ARCA Registry user friendly and convenient for joint analysis of data across genetic ataxias, sporadic ataxias, and hereditary spastic paraplegia (HSP) registries.

CAPTURING PHENOTYPIC SPECTRA, PHENOTYPIC EVOLUTION, AND DISEASE PROGRESSION OF AUTOSOMAL RECESSIVE CEREBELLAR ATAXIAS: THE ELECTRONIC CASE REPORT FORMS
The eCRFs of the ARCA Registry are designed to characterize the clinical heterogeneity of phenotypic spectra and the natural history of phenotypic evolution in ARCAs, thus helping in the selection of outcomes and planning of upcoming treatment trials (sample size calculation, trial duration, etc.) as well as the modeling of trial endpoints and treatment effects. Different levels of eCRF detail, characterized as "core," "extended," and "optional" datasets, allow data capture tailored to the participating site's variable interests and resources (Table 1). In brief, the eCRFs include clinical scales and composite measures, clinician-reported outcome measures and PROMs, biomarker outcomes, and quantitative performance measures:
• The core dataset in the ARCA Registry comprises demographic data (with ethnic background), genetic diagnosis (with types of sequencing performed), different scores to measure disease severity like the Friedreich Ataxia Rating Scale (FARS) Functional Stage (15), the Scale for the Assessment and Rating of Ataxia (SARA) (16), systematic phenotyping using the Inventory of Non-Ataxia Signs (INAS) (17) with customized amendments (e.g., bradykinesia, ptosis, or the head impulse test), the presence and onset of typical clinical ARCA features (e.g., ataxia, epilepsy, cognitive impairment, and diabetes), ARCA biomarkers (serum and neurophysiology), and relevant comorbidities (including alcohol intake).
• The extended dataset adds questionnaires on health status and depression (EQ-5D and PHQ-9) (18,19), disease-relevant medication and treatment effects, and a summary of MRI findings.
• Optional datasets include the possibility to report pediatric features (e.g., pregnancy and birth or developmental milestones), and the ARSACS Disease Severity Index as a disease-specific outcome measure (20).
The Patient's Global Impression of Change (PGI-C; core dataset) (21,22) and the FARS Activity of Daily Living (ADL; extended datasets) (15) have recently been implemented as "anchor

CROSS-CONTINENTAL MULTICENTER CAPTURE AROUND THE WORLD: THE CONTRIBUTING CENTERS
The ARCA Registry captures ARCAs from centers around the world (Figure 2). While it initially included mainly centers from countries across Europe, the ARCA Registry has grown continuously over the last 5 years and now comprises more than 30 sites from 15 countries. The registry has an active strategy to recruit centers from underrepresented countries to strengthen its global representation of ARCAs, regarding both disease prevalence and variable genetic/ethnic backgrounds. Participation is possible upon request. Minimum requirements are the commitment to contribute CRFs of at least the basic phenotype (see above) and to aim for longitudinal follow-ups.

CURRENT DATA IN THE ARCA REGISTRY: A DESCRIPTIVE OVERVIEW OF 800 PATIENTS AND 1,500 VISITS
From its foundation in 2013 until now, more than 800 patients with almost 1,500 visits have been recruited to the ARCA Registry.
In the past 5 years, there has been a considerable increase in longitudinal data, currently reaching up to eight annual follow-up visits in the first patients (Figure 3A). Follow-up core datasets including SARA or INAS from at least two visits and 13 sites are available in >300 patients. Follow-up extended datasets such as MRI summary data or the self-rated assessment of health status by EQ-5D from at least two visits are available from >200 patients. In addition to its longitudinal coverage, the ARCA Registry also captures patients with a broad range of ages and disease stages (Figure 3B). While, in keeping with the early onset of ARCAs (1,2), 60% of patients have symptom onset before 40 years of age, 40% of patients have later onset, up to 80 years of age. Ataxia severity at baseline visits has been recorded as mild (SARA: ≤8), moderate (8–16), and severe (>16) in 16, 41, and 43% of patients, respectively. Sixty-five percent of patients have an established genetic diagnosis. The most frequent diagnoses in the ARCA Registry are ARSACS (∼120 patients, 14%), Friedreich ataxia (∼90 patients, 11%), and SPG7 (∼40 patients, 4%; see Figure 3C for the 10 most frequent ARCAs). Except for the enrichment of ARSACS, which is an overrepresentation due to the major contribution of participating sites in Quebec, the ARCA Registry provides prevalence data that are generally consistent with expectations from the literature (1,23). Patients in the ARCA Registry who do not have a genetic diagnosis yet (currently 35%) are included in a coordinated NGS effort on a continuous basis to make a diagnosis or to identify novel genes, via the PREPARE-GENESIS platform (see above). The ARCA Registry with its phenotypic and longitudinal data has enabled large clinico-genetic cohort studies to delineate the phenotypic spectrum and longitudinal disease progression of major and novel ARCAs.
It thus fulfills the requirements of primary datasets that can be used to describe natural history progression models and plan treatment trials in almost all of the most frequent ARCAs, including pharmacometric modeling of outcome measures and treatment effects. For example, for RFC1-ataxia, it has helped to reveal multisystemic phenotypes mimicking cerebellar-type multiple system atrophy and progressive supranuclear palsy, and hyperkinetic movement disorders such as chorea and dystonia, and provided first sample size calculations based on longitudinal SARA assessments (24). For COQ8A/ADCK3-ataxia, the ARCA Registry facilitated the delineation of clinico-genetic associations, and the longitudinal analysis of SARA scores has provided the first systematic, group-based evidence for a possible treatment effect of coenzyme Q10 (25). Similarly, the ARCA Registry has enabled the natural history of POLG-related ataxia to be documented through longitudinal SARA and INAS assessments (26). The systematic assessment of patients with as-yet-unknown genetic molecular diagnoses means that, when the underlying gene is discovered, there are already established longitudinal progression data, as exemplified for patients found to carry pathogenic variants in the novel ARCA gene PRDX3 (27).
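The baseline severity bands reported above (mild: SARA ≤8; moderate: 8 to 16; severe: >16) can be expressed as a small stratification routine. This is a minimal sketch, not the registry's own analysis code: the function names and example scores are hypothetical, and a score of exactly 8 is assumed to count as mild because the text gives "≤8" for that band. The 0 to 40 range check reflects the total score range of the SARA.

```python
from collections import Counter
from typing import Dict, Sequence

BANDS = ("mild", "moderate", "severe")


def sara_severity(sara_total: float) -> str:
    """Map a SARA total score (0 to 40) to a severity band."""
    if not 0 <= sara_total <= 40:
        raise ValueError(f"SARA total out of range: {sara_total}")
    if sara_total <= 8:       # assumption: 8 itself is counted as mild
        return "mild"
    if sara_total <= 16:
        return "moderate"
    return "severe"


def severity_distribution(scores: Sequence[float]) -> Dict[str, float]:
    """Return the percentage of baseline scores falling into each band."""
    if not scores:
        raise ValueError("no scores given")
    counts = Counter(sara_severity(score) for score in scores)
    return {band: 100 * counts[band] / len(scores) for band in BANDS}


# Hypothetical baseline scores, for illustration only.
example_scores = [4, 8, 12.5, 16, 20, 31]
distribution = severity_distribution(example_scores)
```

Applied to registry baseline visits, a routine of this kind would yield the type of percentages quoted in the text (16, 41, and 43%); the example scores here are invented and simply split evenly across the three bands.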

MEETING CRITERIA OF THE EUROPEAN RARE DISEASE REGISTRY INFRASTRUCTURE AND OF THE EUROPEAN MEDICINES AGENCY GUIDELINES FOR REGISTRY-BASED STUDIES
European authorities including the European Medicines Agency (EMA) have highlighted the potential of disease registries to provide real-world evidence that can complement preclinical, clinical, and even post-marketing data especially for rare diseases (3). At the same time, however, they have also put forward clear standards for data collection and quality criteria for disease registries that aim to meet this goal.

European Medicines Agency Registry Standards for Data Collection
The ARCA Registry captures the key EMA data elements (3), including administrative information (e.g., site, contact dates, registry entry and exit dates, and reason for registry exit), patient data (e.g., age, sex, and alcohol as lifestyle factor), disease features (including diagnosis, disease duration, severity/staging, genetic information, and biochemical tests if appropriate), relevant comorbidities, and disease-related or relevant concomitant medical treatment (Table 2). The ARCA Registry is also enrolled in the European Directory of Registries of the European Rare Disease Registry Infrastructure (ERDRI.dor). As a constituent registry of the evolving Rare Neurological Disease Registry of the European Research Network on Rare Neurological Diseases (ERN-RND) (28), the ARCA Registry will provide the common data elements defined by the ERDRI (29), which adds more systematic coding of diagnosis (Orpha code), genetic diagnosis (HGVS) or phenotype (HPO), resources for research (e.g., biosampling and link to biobank), and a classification of disability (Table 3). As a cross-disease database, the ERN-RND registry collects general information on demographics, genetics, and phenotype, which allows the identification of centers that look after specific (and often genetically defined) patient groups. By contrast, as a disease-group specific database, the ARCA Registry collects complementary in-depth data to enable trial readiness in recessive ataxias. Both registries are well interconnected, as the common ERN dataset can be easily extracted from the ARCA Registry and imported into the ERN-RND registry. Through this link with the ERN-RND registry, the ARCA Registry also adopts the FAIR principles, ensuring that its data are findable, accessible, interoperable, and reusable between countries (30).

European Medicines Agency Standards for Data Quality
In line with the EMA standards of data quality (3), the ARCA Registry aims for consistency, completeness, accuracy, and timeliness. Consistency of data is facilitated by common standardized eCRFs, by clearly defined variables and selection of questions with binary outcomes, and by the implementation of clinical scales with high interrater reliability, especially SARA, INAS, and FARS stage or ADL (15,16,31). Completeness of data in core datasets is automatically checked online as the first step of a continuous database-embedded monitoring process; this has resulted in CRF completion rates of 95% (e.g., for clinical features of ARCAs) to 99% (e.g., for SARA or INAS). Following automated online control of data plausibility and consistency within and between different CRFs at the time of data entry, accuracy of data is subsequently controlled offline in the second monitoring step. Recruitment numbers including availability and completeness are regularly disseminated in systematic, standardized reports of networks that use the ARCA Registry.
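The automated completeness check described above can be sketched as a simple completion-rate calculation. The source does not specify how the registry computes completeness; the field names in `CORE_FIELDS`, the record layout, and the definition used here (filled required fields divided by all required fields) are hypothetical and serve only to illustrate the idea.

```python
from typing import Mapping, Optional, Sequence

# Hypothetical core-dataset field names; the real eCRF variables differ.
CORE_FIELDS = ("sara_total", "inas_count", "fars_stage")

Record = Mapping[str, Optional[float]]


def completion_rate(records: Sequence[Record],
                    required_fields: Sequence[str] = CORE_FIELDS) -> float:
    """Fraction of required fields that are filled across all records."""
    total = len(records) * len(required_fields)
    if total == 0:
        return 0.0
    filled = sum(
        1
        for record in records
        for field in required_fields
        if record.get(field) is not None  # missing key or None counts as empty
    )
    return filled / total


# Two invented visits; the second one lacks the INAS count.
visits = [
    {"sara_total": 12.0, "inas_count": 3, "fars_stage": 2.0},
    {"sara_total": 18.5, "inas_count": None, "fars_stage": 3.0},
]
rate = completion_rate(visits)
```

Here five of six required fields are filled, so `rate` equals 5/6. A per-field variant of the same calculation would give the kind of scale-specific figures quoted in the text.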

NETWORKS USING THE ARCA REGISTRY AS THEIR INFRASTRUCTURE BACKBONE
The ARCA Registry is being used not only by more than 30 single sites but also by several leading ARCA networks in Europe and worldwide.

German Autosomal Recessive Cerebellar Ataxias/Early-Onset Ataxia Network
The German network on ARCAs and EOAs, launched in 2013 by the German Center for Neurodegenerative Diseases (DZNE), comprises five major German ataxia sites (Tübingen, Bonn, Munich, Magdeburg, and Rostock). The network has established the first version of the ARCA Registry, which was co-hosted by the Ataxia Study Group. Every effort was made to ensure that the Registry is fully aligned in its data fields and database system with other major SCA and sporadic ataxia registries (32), likewise hosted by the Ataxia Study Group. Since then, the German ARCA/EOA network has contributed >400 subjects to the ARCA

PROSPAX
The network PROSPAX (An integrated multimodal PROgression chart in SPastic atAXias), launched in 2020 and funded by the European Joint Program on Rare Diseases (EJP RD), will establish a paradigmatic integrated trial-ready model of disease progression and mechanistic evolution in spastic ataxias. It builds on a rigorous, trial-like multicenter natural history study of the two flagship recessive ataxias ARSACS and SPG7, combining longitudinal clinician- and patient-reported, digital, and molecular outcomes for these spastic ARCAs. It unites all major European ARCA and HSP networks and includes Canadian ARCA centers (>7 centers) to run this transatlantic natural history study. PROSPAX utilizes a "spin-off" study registry version, which directly builds on the ARCA Registry, with a fully compatible pseudonymization procedure and eCRFs, and whose datasets will be integrated into the ARCA Registry (and equally the HSP registry) at the end of the study. PROSPAX also draws on all the other components of the multi-component ARCA database cluster described above (ARCA biomaterial database, GENESIS, and ARCA multi-study database). By sharing the same core eCRFs and front-end with the HSP Registry used by the TreatHSP network (33), this spin-off version of the ARCA Registry enables direct cross talk with the HSP registry and joint analysis with HSPs, which is of high importance given the large genetic, molecular, and clinical overlap between ataxias and HSPs (34).

ARCA Global
ARCA GLOBAL and its sister platform SCA GLOBAL together comprise the Ataxia Global Initiative (AGI). The AGI presents a worldwide multi-stakeholder (academia, industry, and patient organizations) platform coordinating and preparing all necessary steps for trial readiness in autosomal-dominant (SCA GLOBAL) and autosomal recessive (ARCA GLOBAL) ataxias (7,10). With the establishment of trial-ready cohorts and of cross-center harmonized clinician-reported outcome measures and PROMs as one of its key tasks, the AGI uses the ARCA Registry as one of its key registries. This reflects the fact that the ARCA Registry already captures all outcome measures that were stipulated by the AGI as the common core set of clinical outcome measures to be used by ataxia centers worldwide. Moreover, the AGI builds on the ARCA Registry as one of its major trial-readiness resources, as this registry readily allows data download and dataset preparation for further workup, e.g., by the Critical Path to Therapeutics for the Ataxias (CPTA) consortium of the Critical Path Institute (C-Path), which aims to prepare regulatory approval by the FDA and the EMA for clinical endpoints in genetic ataxias. In addition, the SPATAX network, which includes all types (i.e., not only autosomal recessive) of ataxias and HSPs, has contributed subsets of data to the ARCA Registry.

LIMITATIONS AND OUTLOOK
The ARCA Registry faces several limitations and open challenges that remain to be addressed:
• The sustainability of the ARCA Registry depends on strong commitment by the contributing centers as well as project-based funding, with fluctuations in patient recruitment, participating sites, and monitoring performance. Technical improvements and new software implementations often come along with variable latencies. Even the "core dataset" may exceed the possibilities of a clinical appointment in many ataxia centers, but further minimization of the core dataset would need to be carefully weighed against the minimal required data necessary to really help in preparing trial readiness as well as against registry standards put forward by public authorities. The timeliness of monitoring is governed by site-by-site batch monitoring, which provides the opportunity for focused local revision of data and files but also leads to periodic delays for those sites that had just been monitored.
• Patients with the same genetically defined ARCA are still dispersed in different ataxia registries, e.g., because of identification of novel autosomal recessive genes in sporadic late-onset ataxia patients (e.g., RFC1) who have so far been collected in a sporadic ataxia registry (e.g., the SPORTAX registry).
• The ARCA Registry does not cover all aspects of each ARCA disease, or may cover them with measures that are too broad for clinical trial design in a specific ARCA disease. While the global phenotype or the progression of ataxia as measured by the SARA score is assessed, more fine-grained motor features (e.g., walking speed) and especially a larger array of non-motor features (e.g., the cerebellar cognitive-affective syndrome) are not captured. Moreover, selected eCRFs and non-ataxia scales like the INAS were primarily implemented to systematically capture disease phenotypes but might show less responsiveness to change. Thus, for capturing the natural history of certain ARCAs, the ARCA Registry might need to be complemented by additional eCRFs. Registry spin-offs such as the PROSPAX registry, however, exemplify that the registry infrastructure can indeed be readily adapted to meet the needs of such natural history studies.
• Finally, recruitment into the ARCA Registry is still biased toward patients from Europe or of European descent, which leads to an underrepresentation of ARCAs with other ethnic/genetic as well as sociocultural backgrounds. Structural disadvantages (especially availability of local personnel and funding resources for data entry) and language barriers may hamper a more global dissemination of the ARCA Registry. The eCRFs, the registry software, and templates for an application to a local institutional review board are all available in English, but additional translations and country-specific adaptations may help to increase the scope.

CONCLUSIONS
The ARCA Registry (i) has enabled the harmonization of clinical outcomes across ataxia centers around the world; (ii) has demonstrated its capacity to act as a centralized database for genotype-phenotype and natural history studies in the >100 ARCAs, already exemplified for COQ8A-, RFC1-, and POLG-related ataxias; and (iii) aggregates the necessary large-scale longitudinal progression datasets for calculating sample sizes, modeling trial designs and randomization procedures, and running pharmacometric models simulating treatment effect sizes for anticipated clinical trials. Given its adoption by many international ARCA sites and networks; its GCP, GDPR, and EMA compliance; its web-based data capture; and its connections to a constantly growing multi-component ARCA database cluster, the ARCA Registry is well placed to become a global trial-readiness registry for ARCAs.