Assessment of movement disorders using wearable sensors during upper limb tasks: A scoping review

Background: Studies aiming to objectively quantify movement disorders during upper limb tasks using wearable sensors have recently increased, but there is wide variety in the described measurement and analysis methods, hampering standardization of methods in research and clinics. Therefore, the primary objective of this review was to provide an overview of sensor set-up and type, included tasks, sensor features and methods used to quantify movement disorders during upper limb tasks in multiple pathological populations. The secondary objective was to identify the most sensitive sensor features for the detection and quantification of movement disorders on the one hand and to describe the clinical application of the proposed methods on the other hand. Methods: A literature search using Scopus, Web of Science, and PubMed was performed. Articles needed to meet the following criteria: 1) participants were adults/children with a neurological disease, 2) (at least) one sensor was placed on the upper limb for evaluation of movement disorders during upper limb tasks, 3) comparisons were made between groups with/without movement disorders, sensor features before/after intervention, or sensor features and a clinical scale for assessment of the movement disorder, and 4) outcome measures included sensor features from acceleration/angular velocity signals. Results: A total of 101 articles were included, of which 56 researched Parkinson’s Disease. Wrist(s), hand(s) and index finger(s) were the most popular sensor locations. The most frequent tasks were finger tapping, wrist pro/supination, keeping the arms extended in front of the body and finger-to-nose. The most frequently calculated sensor features were mean, standard deviation, root-mean-square, ranges, skewness, kurtosis/entropy of acceleration and/or angular velocity, in combination with dominant frequencies/power of acceleration signals.
Examples of clinical applications were automation of a clinical scale or discrimination between a patient/control group or different patient groups. Conclusion: The current overview can support clinicians and researchers in selecting the most sensitive pathology-dependent sensor features and methodologies for detection and quantification of upper limb movement disorders and objective evaluations of treatment effects. Insights from Parkinson’s Disease studies can accelerate the development of wearable sensor protocols in the remaining pathologies, provided that there is sufficient attention to the standardisation of protocols, tasks, feasibility and data analysis methods.


Introduction
The execution of upper limb tasks requires fine-tuned coordination of multiple upper limb joints, which is often disturbed in individuals with movement disorders (Levin, 1996;Beer et al., 2000;Kukke et al., 2016). Movement disorders can be defined as "a neurological syndrome in which there is either an excess of movement or a paucity of voluntary and automatic movements" and are the consequence of lesions in the basal ganglia, cerebellum or thalamus. They are present in a variety of neurological diseases and can occur in every phase of the life cycle (Jankovic and Jankovic, 2021). Prevalence of movement disorders increases with age, up to 28% in a general population over 50 years old and 50% for individuals over 80 years old (Wenning et al., 2005). In several neurologic diseases, movement disorders are among the main symptoms of the disease. In childhood, neurologic movement disorders are most often associated with a diagnosis of dyskinetic cerebral palsy (CP) or with primary dystonias (i.e., inherited or idiopathic dystonias) with a prevalence of 25-50/100,000 and 15-30/100,000, respectively (Nutt et al., 1988;Phukan et al., 2011;Monbaliu et al., 2017). In individuals over the age of 50 years, the prevalence of primary dystonia increases to 732/100,000 (Müller et al., 2002). In the elderly, the most prevalent condition causing movement disorders is Parkinson's disease (PD), with a reported prevalence of one to two per 1,000 adults (Tysnes and Storstein, 2017).
Movement disorders lead to slower movement execution, increased movement variability and a decrease in functionality (van den Noort et al., 2017;Newman et al., 2017;Sanger, 2006;Lee et al., 2015a;Zhang et al., 2012). Both in early-onset and late-onset movement disorders, accurate evaluation is indispensable for the follow-up of the disease course, especially in progressive movement disorders, and to evaluate and optimize the effect of treatment strategies. Currently, the effect of an intervention program on upper limb function or the presence and/or severity of movement disorders is mostly evaluated using clinical assessment scales such as functional scales and movement disorder severity scales (Jackman et al., 2016;Umar et al., 2018;Cohen et al., 2021). The Unified Parkinson's Disease Rating Scale (UPDRS), the Movement Disorders Society revised version of this scale (MDS-UPDRS) and the Hoehn and Yahr scales are currently the most often used assessment scales in PD, whereas the Essential Tremor Rating Assessment Scale is used to rate the severity of essential tremor during nine functional tasks (Movement Disorder Society Task Force on Rating Scales for Parkinson's Disease, 2003;Goetz et al., 2008;Hoehn and Yahr, 1967;Elble et al., 2012). To evaluate the severity of ataxia, the Scale for the Assessment and Rating of Ataxia (SARA) is most often applied (Schmitz-Hübsch et al., 2006). In stroke, the Wolf Motor Function Test (WMFT) and Fugl-Meyer Assessment (FMA) are mainly used to evaluate motor function post-stroke (Movement Disorder Society Task Force on Rating Scales for Parkinson's Disease, 2003;Hoehn and Yahr, 1967;Fugl-Meyer et al., 1975;Wolf et al., 2001;Gladstone et al., 2002).
The Action Research Arm Test (ARAT), Box and Block test, Nine Hole Peg Test and Jebsen-Taylor Test evaluate hand function in multiple pathologies, among others multiple sclerosis (MS) and stroke, whereas the Monkey Box test was recently developed to evaluate bilateral motor function in Huntington's Disease (HD) (Platz et al., 2005;Bennasar et al., 2018;Repnik et al., 2018). For children with CP, the Melbourne Assessment is a validated measure for upper limb activity (Gilmore et al., 2010;Spirtos et al., 2011). Apart from upper limb activity evaluation scales, the severity of movement disorders such as dystonia can be evaluated with the Burke-Fahn-Marsden Dystonia Rating Scale (BFMDRS) or the Dyskinesia Impairment Scale (DIS) in children and adolescents with dyskinetic CP (Burke et al., 1985;Monbaliu et al., 2012).
A common drawback of all abovementioned activity and movement disorder severity assessment scales is that they have to be evaluated by clinicians through the use of standardized guidelines or definitions with respect to task execution or presence/severity of the movement disorder. This clinical judgement induces subjectivity, as not all clinicians may interpret a definition or guideline in exactly the same manner. Moreover, the attribution of scores by a clinician based on video recordings is time-consuming, especially if frequent monitoring is required to evaluate disease progression or the effect of an intervention.
In an effort to reduce the subjective aspect in the evaluation of movement disorders, motion analysis has been widely introduced as an alternative to objectify movement disorders, as well as to evaluate the effect of treatment interventions in PD (Agostino et al., 2003;Pang et al., 2020), CP (Kreulen et al., 2006;Butler and Rose, 2012;Simon-Martinez et al., 2020) and stroke (Lang et al., 2009;Alt Murphy et al., 2018;Cuesta-Gómez et al., 2019). While three-dimensional motion analysis is the gold standard in movement analysis, it requires a specially equipped, expensive laboratory, so patients need to visit the hospital or study center for study participation or assessment of rehabilitation.
With both the time-consuming aspect of clinical scoring and the location-restricted aspect of three-dimensional motion analysis as main drivers, multiple studies have recently attempted to automate clinical scales with the use of wearable sensors or inertial measurement units (IMUs). These devices are attractive because of their ease-of-use and portability, omitting the necessity for a standardized laboratory, which is particularly relevant for long-term follow-up or home-based measures for less mobile patients. IMUs measure linear acceleration and angular velocity of the segment they are placed on, whereas accelerometers measure only acceleration and gyroscopes measure only angular velocity. Specific features derived from acceleration and angular velocity measures can be used to characterize (pathological) movement patterns during multiple tasks or daily life activities. The use of wearable sensors for objective assessment has been previously discussed in PD (Maetzler et al., 2013), but this overview focused on all symptoms of PD, consequently providing very little information on specific upper limb tasks. Similarly, Tortelli and others discussed the use of portable digital sensors in HD, whereby the focus was mostly on the assessment of activity and gait (Tortelli et al., 2021). In dyskinetic CP, a recent review discussed instrumented measures for the assessment of dyskinetic CP symptoms, but this scope was not limited to IMUs and therefore less detailed on the topic (Haberfehlner et al., 2020). While these previous reviews provide much needed insights into the domain of each pathology, an overarching view of sensor protocols and features for the assessment of movement disorders during upper limb tasks could enhance standardisation of data collection. Such standardisation facilitates multi-centre studies, international collaborations and comparison of the characteristics of movement disorders between diseases.
Therefore, the primary objective of this review was to provide an overview of sensor set-up and type, included tasks, sensor features and methods that are used to evaluate movement disorders during upper limb tasks in multiple pathological populations. The secondary objective was to identify the most sensitive sensor features for the detection and quantification of movement disorders on the one hand and to describe the clinical application of the proposed methods on the other hand.

Search strategy
The full literature search was conducted following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines (Page et al., 2021). A literature search was performed up to July 2022 in three databases: Scopus, Web of Science, and PubMed. The following terms were used in "all fields": #1: sensor OR inertial measurement unit; #2: arm OR upper limb; #3: movement disorder.
Subsequently, all three databases were searched for #1 AND #2 AND #3.

Article screening
Articles (n = 990) retrieved from the literature search were extracted. An overview of the articles retained at each stage of the screening process can be found in the PRISMA flow diagram presented in Figure 1 (Page et al., 2021). Any duplicated articles, retrieved by more than one database, were removed by deduplication based on congruity in authors, title, and year of publication.
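The deduplication step described above can be illustrated with a minimal sketch. It assumes that each retrieved record is available as a dictionary with `authors`, `title` and `year` fields; these field names, the `normalise` helper and the example records are hypothetical and are not taken from the review, which does not describe the tooling that was used.

```python
import re

def normalise(text):
    """Lower-case and collapse punctuation/whitespace so trivial differences do not block a match."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()

def deduplicate(records):
    """Keep the first record for each (authors, title, year) key."""
    seen = set()
    unique = []
    for record in records:
        key = (normalise(record["authors"]), normalise(record["title"]), int(record["year"]))
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique

# Hypothetical records, as if exported from three databases.
records = [
    {"authors": "Author A; Author B", "title": "Example title one", "year": 2015, "source": "Scopus"},
    {"authors": "author a; author b", "title": "Example Title One.", "year": "2015", "source": "PubMed"},
    {"authors": "Author C", "title": "Example title two", "year": 2019, "source": "Web of Science"},
]
unique_records = deduplicate(records)
print(len(records), "->", len(unique_records))
```

Exact matching on a normalised key is a deliberately simple choice; in practice, near-duplicates with differing author spellings would probably still need manual checking.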
Unique articles (n = 903) were screened for inclusion by a researcher with experience in the field of upper extremity sensor measurements according to the criteria below in two consecutive stages: 1) title-abstract; and 2) full-text screening.
Articles were screened for inclusion along a set of pre-defined eligibility criteria for 1) the title-abstract and 2) the full-text screening stages. These criteria were designed in line with the PICO/PECO framework (Morgan et al., 2018), which clarifies the review objectives and inclusion criteria across four domains: (P) it was required that the participants were adults or children with a neurological disease subsequently leading to a movement disorder in (but not limited to) the upper limb. (I/E) a minimum of one wearable sensor was placed on the upper limb for the evaluation of movement disorders during the execution of an upper limb task. (C) Multiple comparisons were possible: 1) a group with movement disorders compared with a healthy group, 2) comparison of sensor features before and after an intervention or 3) comparison of sensor features with scores of a clinical scale. (O) Outcome measures needed to include sensor features derived from acceleration or angular velocity signals. Studies from the same authors who mentioned the exact same features in the same population as a study that was already included were excluded. Additionally, to meet the inclusion criteria, articles were required to be original research containing empirical data. Finally, only articles published after the year 2000 were included.

Data extraction
Relevant information from each included article was extracted in a custom-made Excel-based (Microsoft Office, Microsoft, Redmond, WA, United States) data extraction form. Information regarding target population, sensor type, number of sensors, location of sensor(s), included tasks, sensor features and statistical method was obtained to address objective 1. To address objective 2, the sensitivity and/or responsiveness of the sensor features were extracted for the articles that provided the contribution of individual sensor features. Finally, the clinical application of the proposed method was extracted.

General information
Of the 166 full-text articles screened for eligibility, 62 were finally included. Additionally, 39 articles were included from citations of screened articles, giving a total of 101 included articles. The full-text articles that were screened but excluded and the reasons for exclusion can be found in Supplementary Table S1.

Upper limb tasks
The upper limb tasks occurring in more than one study are listed in Table 1. Wrist pro/supination was included in 25 studies whereas finger tapping was included in 23 studies, with the majority being studies in PD patients. Keeping arms in front of the body was included in 23 studies in PD, tremor and MS, as well as finger-to-nose which was additionally included in one ataxia study. Drinking from a can/cup was included in 13 studies in PD, tremor, stroke and spasticity. Opening/closing of the hand was included in seven PD studies as well as writing/drawing which was used for PD and tremor patients. Eating was included in six PD studies as well as pouring water, which was used in PD, tremor and MS. Reaching/grasping to objects was included in five studies in stroke, ataxia and CP and teeth brushing and putting clothes on/off were both used four times in PD studies. In stroke, the Wolf Motor Function Test or parts of this clinical scale were included four times and two PD, one ataxia and one CP study measured activities in an unrestricted home environment. Combing hair, typing and folding laundry were included in three PD studies and forwards and sideways reaching in one CP and two stroke studies. The box and block test was included in one PD and one CP study and tasks from the ARAT were additionally included in PD and one CP study. Finally, the following tasks were included once: reaching sideways, the monkey box test (Bennasar et al., 2018), holding a weight with the wrist, wrist extension, wrist ab/adduction, flexion/extension, elbow flexion/extension and pro/supination (Chan et al., 2022), and following a bent wire shape with a wand loop. One study included wrist supination/flexion, hand behind back and wrist flexion/pronation (Zhang et al., 2012). In CP, one study included outwards reaching (Sanger, 2006), one included the drinking test, the bean bag test and the nine hole peg test, while Strohrmann et al. (…).
Table 2 provides an overview of the calculated sensor features in the time and frequency domains, as well as a formula or feature description when given in the original study. Supplementary Table S3 presents the sensor features grouped per pathology. As an easy and straightforward feature, execution time was often calculated for the upper limb tasks for stroke (Zhang et al., 2012;Repnik et al., 2018), MS (Carpinella et al., 2014), PD (van den Noort et al., 2017;Bonato et al., 2004;Yokoe et al., 2009;Djurić-Jovičić et al., 2016;Rabelo et al., 2017;di Biase et al., 2018), tremor (Ali et al., 2022), CP (Strohrmann et al., 2013) and ataxia. The frequency of movements was popular in multiple studies in PD, mostly in repetitive tasks such as finger tapping and pro/supination (Lee et al., 2015b;Cavallo et al., 2019).
Peak-to-peak and magnitude of angular velocity were additionally used in PD (Pulliam et al., 2018;di Biase et al., 2018;Hssayeni et al., 2021), whereas Repnik and others calculated a rotational jerk index for angular velocity values to evaluate hand rotation in stroke. Finally, a study in PD used the square root of the sum of squares of jerk signals and named this feature 'segment velocity' (Keijsers et al., 2003).
For signal smoothness, RMS of jerk was often used as a straightforward measure in PD (Parnandi et al., 2010) and stroke (Knorr, 2005;Hester, 2006), as well as a jerk metric for which multiple definitions were given, mostly RMS jerk normalized over time/peak velocity or mean jerk (Zhang et al., 2012;Lonini et al., 2018;Habets et al., 2021;Romano et al., 2021;Hester, 2006;Carpinella et al., 2014;den Hartog et al., 2022;Bai et al., 2021). Additionally, smoothness measures were also described as the difference between movement accelerometer readings and smoothed readings, number of movement units or number of speed peaks (Otten et al., 2015;Bai et al., 2021).
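As an illustration of the jerk-based smoothness features above, the following minimal sketch computes RMS jerk from a single-axis acceleration signal. Because the included studies use multiple definitions, the choices here are assumptions rather than any author's method: jerk is taken as the finite difference of acceleration, and the normalisation simply divides by movement duration. The function names and sample values are hypothetical.

```python
import math

def jerk(acceleration, fs):
    """Finite-difference jerk (m/s^3) from acceleration samples (m/s^2) recorded at fs Hz."""
    return [(b - a) * fs for a, b in zip(acceleration, acceleration[1:])]

def rms(values):
    """Root-mean-square of a sequence of numbers."""
    return math.sqrt(sum(v * v for v in values) / len(values))

def rms_jerk(acceleration, fs):
    """RMS of the jerk signal."""
    return rms(jerk(acceleration, fs))

def rms_jerk_per_second(acceleration, fs):
    """RMS jerk divided by movement duration (one possible normalisation, assumed here)."""
    duration = (len(acceleration) - 1) / fs
    return rms_jerk(acceleration, fs) / duration

# Hypothetical acceleration samples along one sensor axis, sampled at 10 Hz.
acc = [0.0, 0.5, 1.0, 1.5]
fs = 10.0
print(rms_jerk(acc, fs), rms_jerk_per_second(acc, fs))
```

Real recordings would normally be low-pass filtered before differentiation, since finite differences amplify sensor noise.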
Coefficient of variation was often used as a measure of variability or rhythm for different signals such as excursion angle (Heldman et al., 2011a;Espay et al., 2011;Tamás et al., 2016;Kwon et al., 2018), (angular) velocity (Lee et al., 2015b;Kwon et al., 2018), amplitude (Lee et al., 2015b;Djurić-Jovičić et al., 2016) and movement frequency (Lee et al., 2015b;Cavallo et al., 2019;Tran et al., 2020), while two studies in PD defined 'rhythm' via the STD of intervals of a finger tapping movement (Okuno et al., 2006) and any sequence of regularly occurring events (Martinez-Manzanera et al., 2016). Finally, a stroke study defined variability as the RMS error between a reference trial and a warped trial. Considering the geometrical structure of a non-linear time-series, Newman et al. included Higuchi's fractal dimension in children with CP.
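The variability features described above can be sketched for a finger tapping task as follows. The example assumes that tap onset times have already been detected from the sensor signal; the tap times are hypothetical, and using the sample standard deviation (rather than the population form) is an assumption, since the included studies do not share a single definition.

```python
import statistics

def intervals(event_times):
    """Durations between consecutive events, e.g. successive finger taps (s)."""
    return [b - a for a, b in zip(event_times, event_times[1:])]

def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (dimensionless)."""
    return statistics.stdev(values) / statistics.mean(values)

# Hypothetical tap onset times in seconds.
tap_times = [0.00, 0.50, 1.02, 1.49, 2.03]
tap_intervals = intervals(tap_times)

print("STD of tap intervals:", statistics.stdev(tap_intervals))
print("CoV of tap intervals:", coefficient_of_variation(tap_intervals))
```

The same `coefficient_of_variation` helper could be applied to per-cycle amplitude, excursion angle or peak velocity once those values have been extracted per movement cycle.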
With respect to orientation and rotational information, correlation between the different axes of the accelerometer or gyroscope was often included as a feature in PD (Lonini et al., 2018;Shawen et al., 2020;Habets et al., 2021), HD (Bennasar et al., 2018) and stroke (Hester, 2006). Additionally, the peak of the normalized cross-correlation from pairs of acceleration time series and the lag of first peak in autocorrelation acceleration were included in two PD studies (Patel et al., 2009;Cole et al., 2010). Concerning trajectories and travelled distances, multiple studies used different definitions for this feature. 3D hand trajectory and length of 3D trajectory (van Meulen et al., 2015) and path-length ratio were used in stroke, while the index of curvature (deviation from a straight line) was used in dyskinetic CP (Sanger, 2006). Elevation angle was included in a CP study, while in stroke, the similarity of hand trajectories was used (Repnik et al., 2018). Two studies in patients with ataxia used mean and standard deviation of Euclidean distance from the mean trajectory and curved and straight-line similarity analysis (Dominguez-Vega et al., 2021). In PD, Heldman et al. used a bradykinesia index, based on variability in time and amplitude of task execution, whereas Tamas et al. and Garza-Rodriguez and others quantified hypokinesia using velocity decrement, which is defined as a decrease in velocity between subsequent data parts (Garza-Rodríguez et al., 2020).
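To make the trajectory-based features more concrete, the sketch below computes a path-length ratio for a 3D hand trajectory. It assumes that the trajectory has already been reconstructed as a list of (x, y, z) positions, and it uses one common definition (travelled path length divided by the straight-line distance between start and end point); the included studies may define the ratio or the index of curvature differently. The trajectories are hypothetical.

```python
import math

def path_length(points):
    """Total travelled distance along consecutive 3D points."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))

def path_length_ratio(points):
    """Travelled path length divided by the straight-line start-to-end distance (1.0 = straight)."""
    straight = math.dist(points[0], points[-1])
    if straight == 0:
        raise ValueError("start and end point coincide; the ratio is undefined")
    return path_length(points) / straight

# Hypothetical hand positions in metres.
straight_reach = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.2, 0.0, 0.0)]
curved_reach = [(0.0, 0.0, 0.0), (0.2, 0.0, 0.0), (0.2, 0.2, 0.0)]

print(path_length_ratio(straight_reach))  # straight path
print(path_length_ratio(curved_reach))    # detour via a corner
```

A ratio close to 1 indicates a nearly straight movement, while larger values indicate a more curved or corrective path.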

Sensitivity and/or responsiveness of most prevalent sensor features
RMS of angular velocity was reported in 18 studies, with sensitivity results for 11 studies. In PD, Kwon et al. and Luksys et al. found significantly higher RMS angular velocity values for PD patients in comparison with controls, whereas two studies found a correlation of -0.78 between RMS angular velocity and clinical scores of the UPDRS (Jun et al., 2011;Kim et al., 2011;Kwon et al., 2018;Lukšys et al., 2018). Additionally, Heldman et al. found a correlation of -0.78 between RMS angular velocity values and the modified bradykinesia rating scale (Heldman et al., 2011a) and Salarian et al. found good correlation between RMS angular velocity values and the UPDRS bradykinesia subscore, as well as good correlation between RMS angular velocity of the roll axis and the tremor subscore of the UPDRS (Salarian et al., 2007). In patients with tremor, Spearman correlation between RMS angular velocity and tremor severity scores ranged from 0.19 (finger-to-nose) to 0.73 (keeping arms extended in front of the body) for López-Blanco et al. (López-Blanco et al., 2018) and between 0.41 and 0.70 for Kwon et al. (Kwon et al., 2020), whereas Heo et al. found lower RMS angular velocity values after electrical stimulation.
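The type of analysis summarised above, relating RMS angular velocity to a clinical score, can be sketched with the standard library alone. All numbers below are hypothetical and only illustrate the computation; the helper names are assumptions, and the rank-based (Spearman) coefficient is implemented as a Pearson correlation on average ranks.

```python
import math

def rms(samples):
    """Root-mean-square of a signal, e.g. gyroscope angular velocity in deg/s."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))

def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical data: one RMS angular velocity value and one clinical score per participant.
rms_values = [rms([120.0, -110.0, 115.0]), rms([80.0, -85.0, 90.0]), rms([40.0, -35.0, 45.0])]
clinical_scores = [0, 1, 3]
print(spearman(rms_values, clinical_scores))
```

In this toy data set a higher clinical score goes with a lower RMS angular velocity, so the coefficient is negative, in the same direction as the negative correlations reported for bradykinesia.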
Seventeen studies reported mean acceleration as a feature, but only two PD studies and one ataxia study discussed its sensitivity. Romano et al. found lower mean acceleration for PD patients in comparison with the control group, while Zwartjes et al. did not find significant differences between ON and OFF stimulation states of deep brain stimulation (Zwartjes et al., 2010;Romano et al., 2021). In patients with ataxia, Samotus et al. found lower mean acceleration after botulinum-toxin-A injections. Execution time was included in 17 studies, with reported sensitivity for 11 studies. Execution time significantly differed between different severity levels and between healthy controls and patients with stroke (Repnik et al., 2018) and MS (Carpinella et al., 2014;Carpinella et al., 2015) and between the paretic and non-paretic arm in children with unilateral CP. Execution time was significantly longer for PD patients in comparison with healthy controls (di Biase et al., 2018) and for patients with multiple system atrophy of parkinsonian type and progressive supranuclear palsy in comparison with healthy controls (Djurić-Jovičić et al., 2016). Furthermore, execution time was significantly different between the ON and OFF medication state in PD patients (van den Noort et al., 2017). In CP, execution time was one of the three features to best estimate upper limb performance in a regression analysis (Strohrmann et al., 2013).
The dominant frequency was included in 15 studies, but only Hoff et al. reported individual contributions of this feature, reporting that amplitude in the 1-4 Hz and 4-8 Hz frequency bands correlated with the modified Abnormal Involuntary Movement Scale (Hoff et al., 2001). Peak power was included in 12 studies, of which six discussed its individual sensitivity. Jun et al. reported a good correlation between peak power and clinical bradykinesia scores and Kim et al. reported decreasing peak powers with increasing UPDRS scale steps (Jun et al., 2011;Kim et al., 2011), while Makabe et al. reported increasing peak powers with increasing severity stages of the Hoehn and Yahr scale (Makabe and Sakamoto, 2000). Similarly, Summa et al. reported increases in peak power in ON vs. OFF medication state. In essential tremor, Heo et al. reported higher peak power after electrical stimulation and Kwon et al. reported high correlation between peak power and tremor severity scores.
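The frequency-domain features discussed above (dominant frequency, peak power and power within a frequency band) can be sketched with a naive discrete Fourier transform. The scaling of the power values (squared magnitude divided by the number of samples) is an assumption, as the included studies do not share one definition, and the 5 Hz test signal is synthetic. A real analysis would typically use an FFT library and windowing instead of this direct sum.

```python
import cmath
import math

def power_spectrum(signal, fs):
    """Return (frequency in Hz, power) pairs for the positive-frequency DFT bins of a mean-removed signal."""
    n = len(signal)
    mean = sum(signal) / n
    centred = [x - mean for x in signal]
    spectrum = []
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(centred))
        spectrum.append((k * fs / n, abs(coeff) ** 2 / n))
    return spectrum

def dominant_frequency(signal, fs):
    """Frequency and power of the bin with the highest power."""
    return max(power_spectrum(signal, fs), key=lambda pair: pair[1])

def band_power(signal, fs, low, high):
    """Summed power of all bins with low <= frequency < high."""
    return sum(p for f, p in power_spectrum(signal, fs) if low <= f < high)

# Synthetic 1 s signal: a 5 Hz oscillation sampled at 50 Hz (tremor-like, hypothetical).
fs = 50.0
signal = [math.sin(2 * math.pi * 5.0 * i / fs) for i in range(50)]

freq, peak_power = dominant_frequency(signal, fs)
print(freq, peak_power)
print(band_power(signal, fs, 1.0, 4.0), band_power(signal, fs, 4.0, 8.0))
```

With this synthetic signal, nearly all power falls in the 4-8 Hz band, which mirrors how band-limited amplitude or power can separate slower voluntary movement from faster oscillations.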
Sample entropy was included in 11 studies, but only two PD studies reported its sensitivity. Chelaru et al. found significantly higher entropy for dyskinetic PD patients in comparison with non-dyskinetic PD patients, as did Liu et al., who found a significant difference between PD patients and healthy controls and good correlation with UPDRS scores (Liu et al., 2016). RMS of angular displacement was included in 11 studies, of which ten reported sensitivity. Tamas et al. found significant differences in RMS amplitude before and after subthalamic stimulation and Espay et al. found significant differences between ON and OFF medication state in PD (Tamás et al., 2016). Kwon et al. found significantly lower RMS amplitudes for PD patients in comparison with controls and Jun et al. found decreasing angular displacement with increasing bradykinesia scores, but this was based on visual observation (Jun et al., 2011;Kwon et al., 2018). Chan et al. found higher values for angular displacement for patients with PD with tremor in comparison with essential tremor (Chan et al., 2022). Kim et al. additionally found a significant difference between PD patients and controls, whereas Heldman et al. found a correlation of −0.81 between RMS excursion angle and clinical scores (Heldman et al., 2011a). Delrobaei et al. found a higher tremor severity score (which was composed of the RMS values of angular velocity) for tremor-dominant PD patients in comparison with non-tremor dominant PD patients and good correlation between tremor severity score and UPDRS scores. In essential tremor, Kwon et al. and Chan et al. found correlations ranging from 0.29-0.66 and 0.80-0.93 respectively, between RMS angular displacement and tremor severity scores (Chan et al., 2018;Kwon et al., 2020). Energy and STD of acceleration were included in 10 studies, but none reported sensitivity.
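For readers unfamiliar with sample entropy, the following minimal sketch shows one standard way to compute it. The embedding dimension `m = 2` and the tolerance of 0.2 times the signal's standard deviation are commonly used defaults and are assumptions here, not parameters reported by the included studies. The quadratic-time loop is meant for illustration on short signals only, and both test signals are synthetic.

```python
import math
import random
import statistics

def sample_entropy(signal, m=2, r=0.2):
    """Sample entropy: -ln(A / B), where B and A count template pairs of length m and m + 1
    whose Chebyshev distance is within r times the (population) standard deviation."""
    n = len(signal)
    tolerance = r * statistics.pstdev(signal)

    def count_matches(length):
        # The same n - m starting points are used for both template lengths.
        templates = [signal[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):  # self-matches are excluded
                if max(abs(a - b) for a, b in zip(templates[i], templates[j])) <= tolerance:
                    count += 1
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        return float("inf")  # undefined for this signal length and tolerance
    return -math.log(a / b)

# Synthetic signals: a perfectly regular alternation and seeded Gaussian noise.
regular = [0.0, 1.0] * 20
rng = random.Random(0)
noisy = [rng.gauss(0.0, 1.0) for _ in range(100)]

print(sample_entropy(regular), sample_entropy(noisy))
```

A perfectly predictable signal yields an entropy of zero, whereas irregular signals yield higher values, which is the direction of the difference reported for dyskinetic versus non-dyskinetic PD patients.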
RMS of acceleration was included in 10 studies, but only van den Noort et al. discussed its specific contribution in PD patients, reporting increased RMS acceleration in ON vs. OFF medication state during a finger tapping and opening/closing of the hand task (van den Noort et al., 2017). Mean angular velocity was also included in 10 studies with six of them reporting sensitivity. In PD, three studies found lower mean angular velocity for PD patients in comparison with healthy controls (Lee et al., 2015b;Djurić-Jovičić et al., 2016;Romano et al., 2021), whereas one study additionally identified significant differences between ON/OFF DBS states (Salarian et al., 2007). Garza-Rodriguez et al. found lower angular velocity values for PD patients with higher clinical severity (Garza-Rodríguez et al., 2018). In patients with ataxia, Oubre et al. found significant differences between patients and healthy controls (Oubre et al., 2021).
Jerk metrics were calculated in nine studies with five reporting on their sensitivity. Romano et al. used the dimensionless jerk index as a jerk metric and found a significant difference between PD patients and healthy controls, while Habets et al. did not find a significant difference between ON and OFF medication state in PD patients (Romano et al., 2021). Carpinella et al. found a significantly higher jerk measure for patients with MS in comparison with healthy controls and a negative correlation between the jerk measure and ARAT score (r = −0.90) (Carpinella et al., 2014). In children with unilateral CP, Newman and others found a significantly higher normalised jerk index for the paretic arm in comparison with the non-paretic arm, but no correlation with the Melbourne Assessment Scale. In children with spasticity, the normalized jerk score improved significantly after botulinum-toxin A injections.
Coefficient of variation (CoV) was included in eight studies, where CoV of time and amplitude was mostly calculated to evaluate bradykinesia. Djuric-Jovicic and others found significant differences between PD patients and healthy controls for both CoV of time and amplitude, whereas Lee et al. found significant differences for CoV of speed, amplitude and frequency between PD patients and controls (Lee et al., 2015b;Djurić-Jovičić et al., 2016). Kwon et al. additionally found significant differences between PD patients and controls for the CoV of angles and velocity. Tamas et al. found that the coefficient of variation, also called 'rhythm', improved significantly after bilateral and contralateral subthalamic stimulation, whereas Espay et al. found significant differences between ON and OFF medication state for CoV in PD patients (Tamás et al., 2016). Spectral power was used in seven studies of which four reported sensitivity. Bravo et al. compared power spectral density (PSD) plots between PD patients and healthy controls and found both higher and lower PSD amplitude for PD patients in comparison with healthy controls, depending on the individual (Bravo, 2016). In patients with dystonia, Legros et al. found a decrease of the area under the spectrum curve after deep brain stimulation surgery (Legros et al., 2004). Ali et al. found higher PSD ratios for patients with essential tremor in comparison with healthy controls (Ali et al., 2022), whereas Heldman et al. found correlations from 0.77-0.83 between the logarithm of peak power and the UPDRS scores (Heldman et al., 2011b). The range of acceleration was additionally calculated in seven articles, but only two articles reported its sensitivity. Rabelo et al. found a significantly higher acceleration range for healthy controls in comparison with PD patients, while Habets et al. did not find a significant difference between ON and OFF medication state in PD patients (Habets et al., 2021). Approximate entropy was also included in seven studies, but only two PD studies included its sensitivity, where Liu et al. and Luksys et al. found significant differences between PD patients and a control group (Lukšys et al., 2018).
Range of angular displacement was calculated in six studies, but only four discussed its sensitivity. Djurić-Jovičić et al. reported a higher range for healthy controls in comparison with PD patients, whereas van den Noort et al. reported lower displacement in the ON vs. OFF medication state and improved amplitude in the ON compared to OFF state (van den Noort et al., 2017;Djurić-Jovičić et al., 2016). Romano et al. found significant differences between PD patients and healthy controls for wrist flexion and shoulder movements and Salarian et al. found significantly lower angular displacements at the level of the wrist for PD patients compared to healthy controls (Salarian et al., 2007;Romano et al., 2021). Energy of acceleration in the frequency domain and STD of acceleration were included in 11 articles, but all of them included these features as part of a feature set for machine learning, without discussing their individual contribution.

Frontiers in Robotics and AI

Discussion
The primary objective of this scoping review was to provide an overview of sensor set-up and type, included tasks, sensor features and statistical methods that are used to evaluate movement disorders during upper limb tasks in multiple pathological populations. We identified 101 studies in eight pathological conditions using wearable sensors placed on the upper limb during upper limb tasks and including at least one sensor feature based on linear acceleration or angular velocity. Of all included studies, 55% were studies in PD, 12% were studies with essential tremor patients, 11% were studies in stroke patients, 8% were studies in adults or children with ataxia, 6% were studies including participants with MS and 5% included children with CP. Adults with HD and spasticity and dystonia in children represented only 1% of the included studies. When comparing these numbers with the prevalence of the abovementioned conditions, an important imbalance emerges. Worldwide, approximately 101 million people are living post-stroke (Feigin et al., 2022), 25 million people live with essential tremor (Song et al., 2021), 17 million people live with CP (McIntyre et al., 2011), 10 million people are estimated to live with PD (Van Den Eeden, 2003;Okubadejo et al., 2006;Marras et al., 2018), approximately 0.2-3 million people live with ataxia, depending on the type (Musselman et al., 2014;Ruano et al., 2014) and 0.2 to 0.5 million people live with HD, depending on the geographical area (Crowell et al., 2021;Medina et al., 2022). While stroke is much more prevalent than PD or essential tremor, this ratio is not reflected in the number of available studies per condition. More surprisingly, although CP is the most prevalent neurological childhood condition included, its high prevalence does not correspond with the number of studies investigating the associated movement disorders using wearable sensors.
The current findings thus identify a major gap between the prevalence of a condition and insight into the related movement disorders. Especially for early-onset conditions such as CP, more insight into the disturbed movement patterns from an early age could benefit targeted therapy and long-term treatment management.
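The imbalance can be made concrete with a back-of-the-envelope calculation. The sketch below divides the share of included studies by the worldwide prevalence estimate for four of the conditions, using only the figures quoted above. The ratio itself ("percentage of studies per million people") is an illustrative construct of this example and not a measure reported in the review.

```python
# Share of included studies (%) and worldwide prevalence (millions of people),
# both taken from the figures quoted in the Discussion.
conditions = {
    "PD": (55, 10),
    "essential tremor": (12, 25),
    "stroke": (11, 101),
    "CP": (5, 17),
}


def share_per_million(share_pct, prevalence_millions):
    """Percentage of included studies per million people living with the condition."""
    return share_pct / prevalence_millions


ratios = {name: share_per_million(*values) for name, values in conditions.items()}

# Print the conditions from most to least studied relative to prevalence.
for name, ratio in sorted(ratios.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {ratio:.2f} % of studies per million people")
```

With these figures, PD receives roughly 50 times more study share per affected person than stroke, and CP ranks below essential tremor despite its early onset.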
The abundance of included PD studies reflects its more advanced state-of-the-art assessment in comparison with other pathological populations. These insights offer opportunities and learning experiences for clinicians and researchers aiming to bridge the gap between technology and clinical measures in the quantitative evaluation of movement disorders. Although widespread in research, IMU-based analysis of movement disorders is lacking in clinical practice in all populations, mainly due to the lack of validation of algorithms in real-world conditions (Del Din et al., 2021).
With respect to sensor type, IMUs containing both an accelerometer and a gyroscope were most often used, and a time-related trend was clearly visible in the included PD studies: between 2000 and 2010, all PD studies included either an accelerometer or a gyroscope, whereas after 2010, IMUs were almost exclusively used. This trend is presumably supported by technological advancements, allowing more sensors in a smaller device with longer battery life combined with more affordable prices for IMUs.
Sensor location, number of included sensors and upper limb tasks were discussed separately to provide a comprehensive overview. However, conclusions should be drawn from a combination of these settings as they are closely inter-related. For example, all but one of the nine studies that placed one sensor on the index finger included the finger tapping (Okuno et al., 2006;Heldman et al., 2011b;Hoffman and McNames, 2011;Kim et al., 2011;Tamás et al., 2016) or finger-to-nose task (Bravo, 2017), and of the nine studies that placed a sensor on the thumb and index finger, all included finger tapping (Yokoe et al., 2009;Heldman et al., 2011a;Espay et al., 2011;Lee et al., 2015b;Djurić-Jovičić et al., 2016;Liu et al., 2016;Summa et al., 2017;Li et al., 2020;Park et al., 2021). Finger tapping, finger-to-nose, wrist pro/supination and opening/closing hand were the only tasks included in studies with sensors solely on the index finger and/or thumb and all of those were in PD patients. Of all included studies, 68 placed a sensor on the wrist and/or forearm and 35 on the dorsal hand, but out of these 35, 22 placed a sensor on both the hand and wrist. The more proximal sensor placement of hand, wrist and forearm was used in all pathologies and in combination with more functional tasks such as drinking, writing and eating. Additionally, the four studies that measured activities in a home environment all placed sensors on the wrist, most likely due to the high comfort and ease of use of wrist-worn sensors (Griffiths et al., 2012;Habets et al., 2021;den Hartog et al., 2022). When selecting a specific sensor set-up, one should thus carefully consider whether the aim is to only automate a clinical scale, or to evaluate movement disorders during a range of functional tasks. For the latter, one sensor on the hand, wrist or forearm could be sufficient to maximise adherence and wide applicability.
The collection of upper limb tasks included in the selected studies reflects the insight that the choice of upper limb task is heavily dependent on the movement disorder. The high prevalence of finger tapping and wrist pro/supination in the PD studies follows from their presence in the Motor Examination part of the (MDS-)UPDRS (Goetz et al., 2008), whereas the finger-to-nose task and keeping arms extended in front of the body are part of both the (MDS-)UPDRS and the Essential Tremor Rating Assessment Scale (Elble et al., 2012). Both tasks are well-suited to quantify decrease and slowness of movements, corresponding with the clinical symptoms of hypokinesia and bradykinesia in PD. Since the (MDS-)UPDRS and Essential Tremor Rating Assessment Scale are well implemented in clinical practice, patients are often requested to perform these tasks in the presence of a neurologist, facilitating combination of this clinical appointment with research purposes. In stroke, the Wolf Motor Function Test was most popular, presumably because this scale is used in daily practice for the evaluation of upper extremity rehabilitation progress. An important consideration is that the aetiological differences between PD/tremor on the one hand and CP, stroke and dystonia on the other hand influence the potential of task execution. In CP and stroke, functional ability can be impaired to a level where execution of specific functional tasks is not possible, which requires a very different approach in comparison with PD or tremor, where most tasks can be executed but performance may be impaired. When the level of physical impairment prohibits the execution of specific tasks, one should focus on monitoring of the movement disorders during home-based activities such as powered mobility (e.g., joystick steering) or in rest positions in the case of severe CP or stroke (den Hartog et al., 2022).
In the case of severe functional impairment occurring in e.g. dystonia or spasticity, there are some extra challenges with respect to sensor adherence and reliability of data streams which need to be taken into account. Sensor fixation should be sufficiently tight in the case of severe movement disorders, to avoid sensor dislodgement and subsequent data loss. Of the studies in the current selection, only one included participants with such severe movement disorders that they were only evaluated during rest or powered mobility driving since other tasks were impossible (den Hartog et al., 2022). This specific study did not report any information on data loss apart from the fact that linear interpolation from adjacent time stamps was used in case of missing data. There are multiple reviews discussing the use of wearable sensors for the detection of motor symptoms in e.g. HD and PD, but none of those mention missing data or data loss in the included studies (Maetzler et al., 2013;Tortelli et al., 2021). To allow quality control, future studies measuring in natural environments and for longer durations should discuss missing data and data loss more thoroughly.
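As an illustration of what such reporting could look like, the sketch below fills gaps by linear interpolation between adjacent valid samples and returns the fraction of missing samples alongside the filled signal. It is not the procedure of any included study: the function name, the use of `None` to mark a lost sample, the assumption of uniform sampling and the choice to pad gaps at the edges with the nearest valid value are all assumptions made for this example.

```python
def fill_missing_linear(samples):
    """Linearly interpolate None entries of a uniformly sampled signal.

    Returns the filled signal and the fraction of samples that were missing.
    Gaps at the start or end are padded with the nearest valid value.
    """
    valid = [i for i, value in enumerate(samples) if value is not None]
    if not valid:
        raise ValueError("signal contains no valid samples")
    filled = list(samples)
    # Pad leading and trailing gaps (no extrapolation).
    for i in range(valid[0]):
        filled[i] = samples[valid[0]]
    for i in range(valid[-1] + 1, len(samples)):
        filled[i] = samples[valid[-1]]
    # Interpolate interior gaps between consecutive valid samples.
    for left, right in zip(valid, valid[1:]):
        step = (samples[right] - samples[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = samples[left] + step * (i - left)
    missing_fraction = 1 - len(valid) / len(samples)
    return filled, missing_fraction


# Made-up acceleration samples with lost packets marked as None.
acc_x = [0.0, None, None, 3.0, 4.0, None, 6.0, None]
filled, missing = fill_missing_linear(acc_x)
print(filled)
print(f"missing data: {missing:.1%}")
```

Reporting the missing fraction together with the interpolation rule would let readers judge how much of a derived feature rests on reconstructed rather than measured data.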
The secondary objective was to identify the most sensitive sensor features for symptom detection and quantification and describe the application of the proposed methods in clinical practice. Similar to the requested tasks, the derived sensor features were dependent on the movement disorder under investigation. Mean amplitude, movement/amplitude decrement and RMS, range and IQR of angular displacement were only used in PD studies and are hypothesized to correlate with the definition of hypokinesia (reduction in movement amplitude) in the (MDS-)UPDRS. Range and RMS of angular displacement can detect differences between PD and TD groups and quantify the severity of hypokinesia, implying that these features can be used in clinical practice as easily interpretable indicators of movement reduction. Velocity decrement and peak-to-peak, magnitude, IQR and mean of angular velocity were additionally only used in PD studies and are hypothesized to relate to the bradykinesia (slowing of movement) aspect in the (MDS-)UPDRS, emphasizing their clinical usefulness for early detection of bradykinesia symptoms (Garza-Rodríguez et al., 2018). Coefficient of variation (CoV) of both amplitude and velocity, as well as rhythm, were included to reflect the interruptions as described in the (MDS-)UPDRS. CoV values are easy to calculate and interpret and were shown to be sufficiently sensitive to discriminate between medication and stimulation states in PD patients from both finger- and wrist-worn sensors. This parameter could thus be implemented to evaluate objective intervention effects in large-scale medication or stimulation studies. In essential tremor and studies focusing on tremor in PD patients, occurrence and amplitude of peaks in specific frequency bands as well as power in these frequency bands were most often included, owing to the rhythmical aspect of tremor. However, the selected frequency bands were not always similar.
The 4-12 Hz frequency band was most often considered as tremor (Heldman et al., 2011b;Ali et al., 2022), while Heo et al. and Kwon et al. used 3-12 Hz (Heo et al., 2015;Kwon et al., 2020) and López-Blanco et al. used a high-pass filter with cut-off 4 Hz followed by a low-pass filter with cut-off 8 Hz (López-Blanco et al., 2018). These differences suggest that a solid definition of tremor frequency is required in order to standardize instrumented tremor quantification, to allow comparison of methodologies on a large-scale cross-sectional level and to facilitate data merging and sharing.
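The effect of the band definition can be illustrated on a synthetic signal. The sketch below computes the fraction of signal power inside a frequency band with a direct discrete Fourier transform; for real recordings an FFT-based spectral estimate would be the usual choice. The sampling rate, the signal components and the function name are assumptions of this example, and the 4-8 Hz band only approximates a high-pass/low-pass filter cascade as an ideal band.

```python
import cmath
import math


def band_power_fraction(signal, fs, f_low, f_high):
    """Fraction of mean-removed signal power between f_low and f_high (Hz)."""
    n = len(signal)
    mean = sum(signal) / n
    centred = [x - mean for x in signal]
    total = 0.0
    in_band = 0.0
    # Direct DFT over the positive frequencies (bin k lies at k * fs / n Hz).
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(centred))
        power = abs(coeff) ** 2
        freq = k * fs / n
        total += power
        if f_low <= freq <= f_high:
            in_band += power
    return in_band / total if total > 0 else 0.0


fs = 50  # Hz, hypothetical sampling rate
t = [i / fs for i in range(200)]  # 4 s of synthetic data
# Synthetic signal: 3.5 Hz oscillation, weaker 9 Hz component, 1 Hz voluntary movement.
acc = [math.sin(2 * math.pi * 3.5 * ti)
       + 0.5 * math.sin(2 * math.pi * 9 * ti)
       + 0.5 * math.sin(2 * math.pi * 1 * ti) for ti in t]

for band in [(4, 12), (3, 12), (4, 8)]:
    print(band, round(band_power_fraction(acc, fs, *band), 3))
```

For this signal the 3.5 Hz oscillation counts as tremor only under the 3-12 Hz definition, so the same recording yields very different "tremor power" depending on the chosen band.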
In pathologies not related to PD or tremor, path length or similarity of hand trajectories were often calculated. This was the case in stroke, dyskinetic CP and pathologies associated with spasticity, reflecting the importance of the impact of the movement disorder on reaching movements. The frequent use of sensor features such as smoothness and jerk metrics might reflect the effect of the location of the brain lesions on the smooth execution of functional tasks and its impact on daily-life activities in these pathologies. For clinical implications, it is important to acknowledge the clinical differences between 'rhythm' and 'jerk'. Rhythm is a self-constructed concept and its meaning is study-dependent, but the focus is on 'regularly reoccurring events' (Okuno et al., 2006;Martinez-Manzanera et al., 2016). Jerk measures, on the other hand, are always based on the first derivative of the acceleration and/or the second derivative of the angular velocity signal and focus on the jerky, unpredictable movements in the signal. Rhythm thus implies stable and/or recurrent patterns in signals, whereas jerk measures represent quite the opposite. This is an important distinction that reflects the clinical difference between rhythmic movement disorders such as tremor and arrhythmic movement disorders such as dystonia and choreoathetosis (Sanger et al., 2010).
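A minimal sketch of a jerk-based measure, assuming a uniformly sampled acceleration signal, is given below. Jerk is approximated by finite differences of acceleration, and its RMS is used as one possible summary value. The included studies used a variety of jerk and smoothness metrics, so this particular summary, the sampling rate and the synthetic signals are assumptions of the example.

```python
import math
import random


def jerk(acceleration, fs):
    """Jerk (first time derivative of acceleration) by forward differences."""
    return [(b - a) * fs for a, b in zip(acceleration, acceleration[1:])]


def rms(values):
    """Root-mean-square of a sequence."""
    return math.sqrt(sum(v * v for v in values) / len(values))


fs = 100  # Hz, hypothetical sampling rate
t = [i / fs for i in range(200)]
# Smooth synthetic movement: a 1 Hz sinusoidal acceleration profile.
smooth = [math.sin(2 * math.pi * ti) for ti in t]
# The same movement with small irregular disturbances (fixed seed for repeatability).
rng = random.Random(0)
irregular = [s + rng.uniform(-0.2, 0.2) for s in smooth]

print("RMS jerk, smooth:   ", round(rms(jerk(smooth, fs)), 2))
print("RMS jerk, irregular:", round(rms(jerk(irregular, fs)), 2))
```

The irregular signal yields a clearly higher RMS jerk even though its overall movement pattern is unchanged, which is the property that makes jerk measures candidates for arrhythmic movement disorders.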
The clinical application of the included studies varied from discrimination of groups to prediction of severity levels and was closely related to the method used to obtain this specific result. With respect to the discrimination of groups, the sensor features sufficiently sensitive to detect differences between a control group and pathological patients could be used for early detection of e.g. PD or MS symptoms, allowing for early intervention and possibly preventing rapid worsening of symptoms. For the prediction of severity levels, all PD studies correlated the sensor features to the (MDS-)UPDRS, the AIMS or the Hoehn and Yahr scale. In CP and stroke, sensor features were correlated with the Melbourne Assessment Scale and ARAT, whereas in another CP study, the Jebsen-Taylor Test, the Quality of Upper Extremity Skills Test (QUEST) and the Box and Blocks Test were included. When the clinical application was the (side) effect of intervention, six out of 16 studies used sensor features to assess dyskinesia in PD patients, as this is a well-known levodopa-induced motor complication (Jankovic, 2005). The clinical scales in PD and tremor rate symptom severity, while the Melbourne Assessment Scale, the ARAT, the Jebsen-Taylor Test, the Quality of Upper Extremity Skills Test and the Box and Blocks Test in CP and stroke mainly evaluate upper extremity function. The severity of the movement disorder in stroke and CP is often dependent on the location of the brain lesion, which was not researched in detail in the included studies and has not been fully elucidated to date in most movement disorders (Bansil, 2012). To this end, wearable sensors provide opportunities for detailed exploration of the connection between the location of the brain lesion and the aetiology and severity of movement disorders.
IMUs have mostly been used to assess upper limb use and for detection of activity periods in daily life in patients with PD and/or essential tremor (Nguyen et al., 2017;Pham et al., 2017;Serrano et al., 2017), CP (Braito et al., 2018;Beani et al., 2019;Ahmadi et al., 2020) or stroke (Biswas et al., 2015), but their application to quantify movement disorders in the upper limb is less extensive. Activity measures mostly focus on the amount of time that acceleration measures exceed a pre-defined threshold (e.g., Activity Index), which yields information about the quantity of movement, but not about the quality. To facilitate follow-up of intervention or long-term rehabilitation programs, a combined assessment of both movement quantity and quality can provide more insight into both the presence and severity of movement disorders. Ideally, long-term monitoring is executed in a home environment (i.e., low patient burden while collecting long-term data), while a contact moment to record pathology-related tasks in a standardized setting could be added to the study protocol since this allows more specific data analysis, e.g., through the presence of video recordings of the performed tasks.
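The threshold-based principle behind such activity measures can be sketched as follows. This is not the definition of the Activity Index or of any specific published measure. The function name, the use of the acceleration vector magnitude, the assumption that gravity has already been removed and the threshold value are all assumptions of this example.

```python
import math


def time_above_threshold(acc_xyz, fs, threshold):
    """Seconds during which the acceleration magnitude exceeds a threshold.

    acc_xyz is a list of (x, y, z) samples, assumed gravity-free and
    uniformly sampled at fs Hz.
    """
    magnitudes = [math.sqrt(x * x + y * y + z * z) for x, y, z in acc_xyz]
    n_active = sum(1 for m in magnitudes if m > threshold)
    return n_active / fs


# Made-up samples at 2 Hz; three of the six exceed the threshold.
samples = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 0.1, 0.0),
           (0.0, 0.0, 2.0), (0.1, 0.2, 0.0), (3.0, 0.0, 0.0)]
active_seconds = time_above_threshold(samples, fs=2, threshold=0.5)
print(f"active time: {active_seconds} s")
```

Because only the count of supra-threshold samples enters the result, a smooth reach and a jerky one of equal duration would score identically, which is why such measures describe quantity but not quality of movement.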
To maximise the use of wearable sensors for the quantification of upper limb movement disorders in clinical practice, one should acknowledge the differences in clinical symptoms between PD and tremor on the one hand and movement disorders such as dystonia and choreoathetosis on the other hand. In PD, the expression of bradykinesia, hypokinesia and tremor is standardized and relatively easily recognisable. The features implemented should thus embody this pattern, such as velocity decrement and mean/IQR of angular velocity for bradykinesia, movement decrement and range/RMS of angular displacement for hypokinesia and occurrence and amplitude of peaks in frequency bands for tremor. For dystonia and choreoathetosis, movement disorders known to be much less consistent, research should first focus on the search for sensor features capable of accurately discriminating between distinct movement disorders and on their ability to quantify severity. For this purpose, multi-centre studies are required, considering the low prevalence of individuals with dystonia and/or choreoathetosis.
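Several of the PD-related features named above are simple to compute once per-repetition amplitudes or a displacement signal are available. The sketch below shows RMS, range, coefficient of variation and a relative decrement. The decrement definition (relative drop from the first to the last repetition) is a hypothetical choice for this example, since the included studies defined decrement in different ways, and the amplitude values are made up.

```python
import math
import statistics


def rms(values):
    """Root-mean-square of a signal."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def value_range(values):
    """Difference between the largest and smallest sample."""
    return max(values) - min(values)


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    return statistics.stdev(values) / statistics.mean(values)


def relative_decrement(values):
    """Hypothetical decrement: relative drop from first to last repetition."""
    return (values[0] - values[-1]) / values[0]


# Made-up peak angular displacement (degrees) of five successive pro/supination cycles.
amplitudes = [40.0, 38.0, 35.0, 30.0, 26.0]

features = {
    "rms": rms(amplitudes),
    "range": value_range(amplitudes),
    "cov": coefficient_of_variation(amplitudes),
    "decrement": relative_decrement(amplitudes),
}
for name, value in features.items():
    print(f"{name}: {value:.3f}")
```

The same helper functions apply unchanged to angular velocity values, which is how the bradykinesia-related counterparts (mean or CoV of velocity, velocity decrement) could be obtained.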
For the quantification of spasticity, IMUs on the lower limb have been used to explore the relationship between maximal angular velocity and stretch velocity during passive stretches, but few studies focusing on upper limb measures with IMUs are available (Bar-On et al., 2014). When using IMUs for the instrumented assessment of spasticity, one should not only take into account active tasks but also passive fast and slow stretches to differentiate its neural and non-neural components as well as its velocity-dependent component. In the current review, only Bai et al. and Strohrmann et al. used IMUs to evaluate spasticity, and they included similar sensor features such as mean and standard deviation of acceleration and movement trajectory (Strohrmann et al., 2013;Bai et al., 2021). One extra challenge in spasticity may be range of motion restrictions, which can be evaluated using IMUs if sufficient passive movement is possible. If sensor placement on the hand or wrist is not possible due to severe positional deformities, the upper arm can be used as an alternative. Overall, IMUs have been scarcely used for the assessment of spasticity and the current review can serve as a facilitator to explore the different facets of spasticity using wearable sensors.
The use of these sensor features retrieved from one sensor on the hand, wrist or arm in combination with a home-based protocol to assess the effect of an intervention can greatly increase our understanding of the impact of current treatment management plans on the severity of upper limb movement disorders. The insights obtained for PD can accelerate the development of wearable sensor protocols in the remaining pathologies, provided that there is sufficient attention for the standardisation of protocols, tasks, feasibility and data analysis methods.

Conclusion and future directions
Wearable sensors offer a myriad of opportunities for the quantification of movement disorders in multiple pathologies, but the abundance of available information could threaten their usability. Our findings illustrate that there are many similarities between pathology-related sensor protocols and tasks, but the agreement is not yet sufficient to allow data pooling or international multi-centre studies. For this purpose, higher-level standardisation with respect to task selection and sensor feature extraction per pathology is strongly recommended. Although multiple sensors can provide a lot of information, researchers should think carefully about the balance between information gain and accessibility. One sensor on the index finger for PD or on the hand, wrist or forearm for other pathologies could be attached in a non-obstructive way, allowing for better adherence and less missing data due to e.g., battery loss. The current overview can support clinicians and researchers in selecting the most sensitive pathology-dependent sensor features and measurement methodologies for detection and quantification of upper limb movement disorders and for the objective evaluation of treatment effects.

Data availability statement
The original contributions presented in the study are included in the article/Supplementary Material; further inquiries can be directed to the corresponding author.

Author contributions
IV: Conceptualisation, methodology, data extraction, analysis, data curation, writing-original draft, visualisation. HH: Methodology, writing-reviewing and editing. JD: Writing-reviewing and editing. EV: Reviewing and editing. HF: Writing-reviewing and editing. KD: Writing-reviewing and editing. J-MA: Writing-reviewing and editing. EM: Conceptualisation, writing-reviewing and editing, supervision, project administration, funding acquisition.

IV received an FWO fellowship (Fonds Wetenschappelijk Onderzoek Vlaanderen, grant number 11C0222N). The funder had no involvement in study design, collection, analysis and interpretation of data, writing of the report, or in the decision to submit the article for publication.