Smoothness Metrics in Complex Movement Tasks

Smoothness is a main characteristic of goal-directed human movements. The suitability of approaches quantifying movement smoothness is dependent on the analyzed signal's structure. Recently, activities of daily living (ADL) have received strong interest in research on aging and neurorehabilitation. Such tasks have complex signal structures, and kinematic parameters need to be adapted. In the present study, we examined four different approaches to quantify movement smoothness in ADL. We tested the appropriateness of these approaches, namely the number of velocity peaks per meter (NoP), the spectral arc length (SAL), the speed metric (SM), and the log dimensionless jerk (LDJ), by comparing movement signals from eight healthy elderly (67.1a ± 7.1a) with eight healthy young (26.9a ± 2.1a) participants performing an activity of daily living (making a cup of tea). All approaches were able to identify group differences in smoothness (Cohen's d: NoP = 2.53, SAL = 1.95, SM = 1.69, LDJ = 4.19), three revealed high to very high sensitivity (z-scores: NoP = 1.96 ± 0.55, SAL = 1.60 ± 0.64, SM = 3.41 ± 3.03, LDJ = 5.28 ± 1.52), three showed low within-group variance (NoP = 0.72, SAL = 0.60, SM = 0.11, LDJ = 0.71), two showed strong correlations between the first and the second half of the task execution (intra-trial R2s: NoP = 0.22 n.s., SAL = 0.33, SM = 0.36, LDJ = 0.91), and one was independent of other kinematic parameters (SM), while three showed strong models of multiple linear regression (R2s: NoP = 0.61, SAL = 0.48, LDJ = 0.70). Based on our results, we make suggestions for the use of the examined smoothness measures. In total, the log dimensionless jerk proved to be the most appropriate in ADL, as long as trial durations are controlled.


INTRODUCTION
Despite the great importance of analyzing ecologically valid activities in clinical research and diagnostics, the quantification of activities of daily living (ADL) was typically limited to subjective scorings of videos (1,2) or timed trials (3)(4)(5), while a parametric quantification of movement quality was not feasible. With technologies like advanced motion-tracking devices, it has recently become more feasible to investigate human behavior in a natural, ecologically valid setting. In the performance of such tasks, goals can be achieved in various ways and actions cannot be precisely predicted (6). However, certain actions, like phases of inactivity, transporting, grasping, rotating, circling, or balancing, repeatedly appear in ADL. In kinematic analyses, smoothness is a main characteristic of goal-directed movements. It is suggested that the planning of movement dynamics is based on smoothness (7). Thus, deficits in planning are reflected in reduced movement smoothness. Consequently, the assessment of smoothness has become a central metric for the diagnosis and supervision of motor rehabilitation in neurological disease. Therefore, research, especially on upper-limb rehabilitation in stroke survivors, shows interest in quantifying smoothness (8)(9)(10)(11)(12)(13).
For different types of movement tasks, trajectories and corresponding velocity profiles differ in the signals' structures (14). When complex tasks like ADL are assessed, the signal can show peculiarities like phases of inactivity or different quantities of actions (6,15). This can lead to misestimations when approaches are used whose data processing was developed for less complex signals, which can be periodic (16,17) [brushing teeth, sawing, or hammering (18)] or of short lengths (8) [reaching or grasping movements (8,13)]. Common smoothness parameters can be of basically three different types: velocity-based parameters, acceleration-based parameters, and arc-length-based parameters. The present classification is based on the differentiation between velocity peak, different kinds of jerk, and spectral analysis metrics (12,17,19), whereas spectral analysis metrics follow the assumption that less smooth movements are more complex in terms of their frequency composition when approximating the original signal with a Fourier series [for an easy introduction to spectral analysis see (20)]. Various velocity-based parameters were suggested: the number of velocity peaks (6,21-23), the normalized average speed (12,24), the relative level of activity (6,21,24,25), or the peak-based composition of the velocity profile (12,26). Acceleration-based parameters examine the rate of change of movement acceleration and are usually forms of the jerk metric like the normalized jerk (27,28), the log dimensionless jerk (8), the normalized squared jerk (14), or the normalized mean jerk (12). The arc-length-based parameters measure movement smoothness by defining the complexity of the signal by the arc length of its profile, that is, the velocity profile (8) or the profile of its power spectrum (8,17).
So far, there were several attempts to compare and classify smoothness metrics, including investigations in stroke samples (8,12), a sample of patients with cerebral palsy (13), and of a general kind (17). However, this research focused mainly on reaching movements and did not examine the behavior of smoothness metrics when confronted with the complex signals of ADL performance in healthy elderly. It has been shown that movement smoothness is decreased even in healthy elderly (29-31), although not as drastically as in neurological patients.
In this study we assessed the validity of different smoothness parameters by comparing the hand movements of young and elderly participants in an ADL. We investigated to what degree the used parameters were sensitive, variable, and independent of general movement characteristics like velocities, trial durations, path lengths, or the activity level. Further, we examined whether the parameters quantify a general ability to produce smooth movements. ADL offer a way to examine behavior in the context of disease and aging in an ecologically valid way and should therefore be considered in research and clinical assessment (6,10,11,21,32). This study explores the behavior of different parameters empirically in order to give suggestions and an outlook on future developments.

METHODS
We compared the movements of the dominant hand of 8 healthy young adults (26.9a ± 2.1a) with the movements of the dominant hand of 8 healthy elderly participants (67.1a ± 7.1a). The ADL task was to unimanually prepare a cup of tea with milk and sugar (21). All participants were right-handed and each participant executed the task once. Ethical approval was obtained from the local ethics committee of the Medical Faculty of the Technical University of Munich. All subjects provided written informed consent.
The experimental set-up was similar to the one in Gulde et al. (21). Participants stood in front of a table with the following items placed on its surface in a semi-circular order from left to right: a container with room-temperature water, a milk carafe, a saucer for used tea-bags, an open container with tea-bags, an open container with sugar cubes, an open container with coffee powder (as a distractor item), and an empty kettle. In addition, a mug and a spoon were located at the front of the table.
Participants were asked to execute the task in a natural way without emphasis on speed.
The positional data of the dorsum of the hand were obtained by a Qualisys motion capturing system (Qualisys Inc., Gothenburg, Sweden) incorporating 5 Oqus cameras sampling at a frequency of 120 Hz. There were no gaps in the recordings. All post-processing was computed with MATLAB (R2017a, MathWorks, MA). After differentiation, the data were smoothed using a 0.1 s local regression filter ("loess") (33). This short smoothing window (12 frames) was chosen in order to preserve as much information as possible, to not corrupt the outcomes of spectral analysis, and to not fully eliminate noise, which can have a specific impact on jerk metrics (17). The time to boil the water, which depends on water temperature and filling level, was not excluded from the signal (typically no hand movements occur in this waiting period). This was done in order to provoke meaningless variance in the signal and the activity index (relative amount of time with hand movements).
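The preprocessing described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the MATLAB pipeline of the study: the function name `speed_profile` is hypothetical, and a Savitzky-Golay filter (an unweighted local polynomial regression) stands in for the "loess" filter, with a 13-frame window chosen here so that the window length is odd.

```python
import numpy as np
from scipy.signal import savgol_filter

FS = 120.0  # sampling frequency in Hz, as stated in the text


def speed_profile(positions, fs=FS, window=13, polyorder=2):
    """Return the smoothed tangential speed (m/s) of an (N, 3) position array in meters.

    Assumption: Savitzky-Golay smoothing replaces the loess filter of the study.
    """
    positions = np.asarray(positions, dtype=float)
    # Differentiate each coordinate with respect to time (finite differences).
    velocity = np.gradient(positions, 1.0 / fs, axis=0)
    # Tangential speed is the Euclidean norm of the velocity vector.
    speed = np.linalg.norm(velocity, axis=1)
    # Smooth after differentiation, in the order described in the text.
    return savgol_filter(speed, window_length=window, polyorder=polyorder)


# Usage: a hand moving along x at a constant 0.5 m/s for one second.
t = np.arange(120) / FS
pos = np.column_stack([0.5 * t, np.zeros_like(t), np.zeros_like(t)])
speed = speed_profile(pos)
```

For real recordings, the window length and polynomial order would have to be tuned, since they determine how much noise reaches the jerk-based metric.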
The smoothness parameters used were the spectral arc length (8), the (negated; see description) number of velocity peaks per meter (21), the speed metric (12), and the log dimensionless jerk (8). The spectral arc length is calculated from the arc length of the power spectrum of a Fourier transformation of the velocity signal (8).
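
$$\text{spectral arc length} = -\int_{0}^{\omega_c} \sqrt{\left(\frac{1}{\omega_c}\right)^{2} + \left(\frac{d\hat{V}(\omega)}{d\omega}\right)^{2}}\, d\omega, \qquad \hat{V}(\omega) = \frac{V(\omega)}{V(0)}$$
Calculation of the spectral arc length based on a velocity profile v, with [0, ωc] being the frequency band, V(ω) the Fourier magnitude spectrum of v, and V̂(ω) the spectrum normalized by its value at ω = 0 (8).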
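A discrete approximation of the spectral arc length can be sketched as follows. This is an illustrative sketch, not the implementation used in the study: the function name, the 20 Hz cut-off `fc`, and the zero-padding level are assumptions, since the text does not state these settings.

```python
import numpy as np


def spectral_arc_length(speed, fs, fc=20.0, pad_level=4):
    """Negative arc length of the normalized Fourier magnitude spectrum up to fc (Hz)."""
    speed = np.asarray(speed, dtype=float)
    # Zero-pad to a power of two for a finer frequency grid (assumed setting).
    nfft = int(2 ** (np.ceil(np.log2(len(speed))) + pad_level))
    freqs = np.fft.rfftfreq(nfft, d=1.0 / fs)
    magnitude = np.abs(np.fft.rfft(speed, n=nfft))
    magnitude = magnitude / magnitude[0]  # normalize by the DC component V(0)
    band = freqs <= fc
    f_norm = freqs[band] / fc  # frequency axis scaled to [0, 1]
    m_band = magnitude[band]
    # Sum the lengths of the straight segments between neighboring spectrum points.
    return -np.sum(np.sqrt(np.diff(f_norm) ** 2 + np.diff(m_band) ** 2))


fs = 120.0
t = np.arange(240) / fs
smooth = np.exp(-((t - 1.0) ** 2) / (2 * 0.2 ** 2))       # single bell-shaped movement
jerky = smooth * (1.0 + 0.5 * np.sin(2 * np.pi * 5 * t))  # same movement with a 5 Hz ripple
sal_smooth = spectral_arc_length(smooth, fs)
sal_jerky = spectral_arc_length(jerky, fs)
```

In this synthetic example the rippled profile adds a side lobe to the spectrum, so its value lies further from zero than that of the bell-shaped profile.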
For the number of velocity peaks per meter, all peaks of the velocity profile that exceed a prominence of 0.05 m/s are counted and divided by the traveled path length. The resulting number is negated, so that higher values indicate smoother movements (21).

$$\text{number of velocity peaks per meter} = -\frac{\text{peaks}}{\int_{t_1}^{t_2} v(t)\, dt}$$
Calculation of the number of velocity peaks per meter based on a velocity profile v and peaks being maxima with a prominence exceeding 0.05 m/s (21).
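The peak count can be sketched with SciPy's `find_peaks`, which supports a prominence threshold directly. The function name `peaks_per_meter` and the synthetic speed profile are hypothetical, and the path length is approximated here by a rectangle-rule integral of the speed, which is an assumption about the numerical details.

```python
import numpy as np
from scipy.signal import find_peaks


def peaks_per_meter(speed, fs, min_prominence=0.05):
    """Negative number of velocity peaks per meter of traveled path."""
    speed = np.asarray(speed, dtype=float)
    # Keep only local maxima whose prominence is at least the threshold (m/s).
    peaks, _ = find_peaks(speed, prominence=min_prominence)
    path_length = np.sum(speed) / fs  # rectangle-rule integral of speed (m)
    return -len(peaks) / path_length


fs = 120.0
t = np.arange(360) / fs
# Two submovements (peaks of 0.5 and 0.4 m/s) plus a ripple too small to count.
speed = (0.5 * np.exp(-((t - 1.0) ** 2) / (2 * 0.1 ** 2))
         + 0.4 * np.exp(-((t - 2.0) ** 2) / (2 * 0.1 ** 2))
         + 0.005 * (1 + np.sin(2 * np.pi * 10 * t)))
nop = peaks_per_meter(speed, fs)
```

The ripple has a peak-to-peak amplitude of 0.01 m/s and is therefore ignored by the prominence criterion, so only the two submovements are counted.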
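The speed metric is obtained by dividing the average velocity by the maximum velocity (12).

$$\text{speed metric} = \frac{\bar{v}}{v_{peak}}$$
Calculation of the speed metric based on a velocity profile v, with v̄ being the mean and v_peak the maximum velocity (12).
The log dimensionless jerk results from the natural logarithm of the integral of the squared jerk (the second time derivative of the velocity), multiplied by the trial duration to the power of three and divided by the squared peak velocity; the result is negated (8).

$$\text{log dimensionless jerk} = -\ln\left(\frac{(t_2 - t_1)^{3}}{v_{peak}^{2}} \int_{t_1}^{t_2} \left|\frac{d^{2}v}{dt^{2}}\right|^{2} dt\right)$$
Calculation of the log dimensionless jerk based on a velocity profile v with the time window t1 to t2 (8).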
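Both metrics can be sketched in a few lines of NumPy. The function names and the synthetic profiles are hypothetical, and the jerk is approximated with repeated finite differences, which is an assumption about the numerical differentiation rather than the procedure of the study.

```python
import numpy as np


def speed_metric(speed):
    """Mean speed divided by peak speed."""
    speed = np.asarray(speed, dtype=float)
    return speed.mean() / speed.max()


def log_dimensionless_jerk(speed, fs):
    """Negative natural log of the duration- and amplitude-normalized squared jerk."""
    speed = np.asarray(speed, dtype=float)
    dt = 1.0 / fs
    duration = len(speed) * dt
    # Jerk is the second time derivative of the velocity profile.
    jerk = np.gradient(np.gradient(speed, dt), dt)
    integral = np.sum(jerk ** 2) * dt  # rectangle-rule integral of the squared jerk
    return -np.log(duration ** 3 / speed.max() ** 2 * integral)


fs = 120.0
t = np.arange(240) / fs
smooth = np.exp(-((t - 1.0) ** 2) / (2 * 0.2 ** 2))       # single bell-shaped movement
jerky = smooth * (1.0 + 0.5 * np.sin(2 * np.pi * 5 * t))  # same movement with a 5 Hz ripple
sm_smooth = speed_metric(smooth)
ldj_smooth = log_dimensionless_jerk(smooth, fs)
ldj_jerky = log_dimensionless_jerk(jerky, fs)
```

Because the trial duration enters with the power of three, longer recordings of otherwise similar movements push the value further from zero, which matches the dependence on trial duration reported for this metric.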
Note that all parameters but the speed metric output negative values, and for every parameter values closer to zero represent smoother movements. These four parameters were considered prototypical representatives of the different classes of smoothness measures listed in the introduction. The speed metric was added because its computation strongly differs from the peaks metric, so that its behavior could not be inferred from it.
The smoothness parameters were compared between groups using t-tests for independent samples with α = 0.05. Effect sizes were calculated with Cohen's d (34). Sensitivity and within-group variability were analyzed on the basis of z-scores (reference: distribution of elderly subjects). Sensitivity was expressed as the mean z-score, and within-group variability was calculated as 1 − std(z-scores)/mean(z-scores) (with a value of 1 being perfectly stable and values below 0 being perfectly unstable). The z-standardization was performed in order to be able to compare the sensitivity between the different parameters. Multiple linear regression models were applied to check for independence from the following kinematic parameters: trial duration (21), path length (21), mean peak velocity (average of velocity peaks with a minimum prominence of 0.05 m/s) (21), and relative activity (the relative amount of the trial duration in which the hands rest, defined by a velocity below 0.05 m/s) (21). The critical variance inflation factor (VIF) was set to 5. To avoid a moderation by age, the multiple linear regression models were based on z-scores (within-group). Further, the trials were split into two halves (on the basis of trial duration), and the first half was compared to the second half of the trial (intra-individually) by correlational analyses. This can be considered an alternative form of test-retest reliability: by splitting the task into halves, we can examine the within-subject behavior over two different tasks, which can be considered an increase in the generalizability of the outcomes. Of course, this lacks the comparison between two different time points of the same task. Such a comparison, however, would be prone to learning effects (and their moderation by age) and thus to changes in movement smoothness in such a complex task, so the reliability of a classic test-retest approach would be questionable.
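The sensitivity and within-group variability measures can be sketched as follows. The function name is hypothetical and the use of the sample standard deviation (`ddof=1`) is an assumption, since the text does not state which estimator was used. The second part only re-derives the variability indices from the mean z-scores and standard deviations reported in the abstract.

```python
import numpy as np


def sensitivity_and_variability(young, elderly):
    """Mean z-score of the young group and the index 1 - std(z) / mean(z)."""
    young = np.asarray(young, dtype=float)
    elderly = np.asarray(elderly, dtype=float)
    # z-scores of the young participants referenced to the elderly distribution.
    z = (young - elderly.mean()) / elderly.std(ddof=1)
    sensitivity = z.mean()
    variability_index = 1.0 - z.std(ddof=1) / sensitivity
    return sensitivity, variability_index


# Consistency check with the reported summary values (mean z, SD of z).
reported = {"NoP": (1.96, 0.55), "SAL": (1.60, 0.64),
            "SM": (3.41, 3.03), "LDJ": (5.28, 1.52)}
indices = {name: round(1.0 - sd / mean, 2) for name, (mean, sd) in reported.items()}
```

The re-derived indices are 0.72, 0.60, 0.11, and 0.71, which agree with the within-group values reported for the four parameters.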
Additionally, the four parameters were correlated with each other in order to estimate whether they generally measure the same phenomenon. The effect sizes of the regressions were defined according to Cohen (34), with r > 0.1 being weak, r > 0.3 being moderate, and r > 0.5 being strong. The small sample size can affect the outcomes of the statistical tests, especially of the multiple linear regression models. The impact of variables can therefore be misestimated. The statistical power for all tests was determined post hoc with a critical power of 0.80 using G*Power (G*Power 3.1.9.2, 2014, HHU Düsseldorf, Germany).
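The intra-trial comparison and the effect-size labels described in this section can be sketched as follows. All function names are hypothetical, the label "negligible" for r ≤ 0.1 is an assumption that does not appear in the text, and the example takes the positive root of the reported R2 values because the sign of r is not recoverable from R2 alone.

```python
import numpy as np


def split_half(speed):
    """Split a speed profile into first and second half by trial duration."""
    speed = np.asarray(speed, dtype=float)
    mid = len(speed) // 2
    return speed[:mid], speed[mid:]


def r_squared(first_half_values, second_half_values):
    """Squared Pearson correlation between per-participant metric values."""
    r = np.corrcoef(first_half_values, second_half_values)[0, 1]
    return r ** 2


def strength(r):
    """Label a correlation using the thresholds stated in the text."""
    r = abs(r)
    if r > 0.5:
        return "strong"
    if r > 0.3:
        return "moderate"
    if r > 0.1:
        return "weak"
    return "negligible"  # assumed label, not defined in the text


# Labels for the intra-trial R2 values reported in the abstract.
reported_r2 = {"NoP": 0.22, "SAL": 0.33, "SM": 0.36, "LDJ": 0.91}
labels = {name: strength(np.sqrt(r2)) for name, r2 in reported_r2.items()}
```

In practice, a smoothness metric would be computed on each half returned by `split_half` for every participant, and the two resulting vectors would be passed to `r_squared`.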

RESULTS
The comparison of the groups revealed significant differences for all four parameters (Table 1), with Cohen's d ranging from 1.69 to 4.19. In all cases the young participants' movements were classified as being smoother. The sensitivity was measured by the z-scores of the young participants referenced to the elderly participants. Table 2 contains the resulting magnitudes of sensitivity and within-group variability for the different parameters. Sensitivity ranged from 1.60 to 5.28 and within-group variability from 0.11 to 0.72. The results of the comparison of the first half with the second half of each trial are given in Table 3, with R2s ranging from 0.22 to 0.91. Note that only the speed metric and the log dimensionless jerk both reached significance and delivered power estimates of at least 0.80. The multiple linear regression yielded significant models for the number of velocity peaks per meter, the spectral arc length, and the log dimensionless jerk. The models for the speed metric were all non-significant (Table 4). The correlations between the parameters were, except for log dimensionless jerk and speed metric, all significant and strong (Table 5).

DISCUSSION
In the present study, we analyzed movement smoothness, which is known as a highly characteristic aspect of task performance. Since measures of smoothness were typically established for simple continuous or discrete movements, we here analyzed the suitability of various measures for the evaluation of the complex activity of daily living of tea making. The comparison of the four smoothness parameters revealed that all of the methods were able to detect the differences in smoothness between young and elderly participants in the ADL of tea making. With mean z-scores between 1.60 and 5.28, all four parameters proved to be highly sensitive. Three of the parameters showed a within-group variance index of at least 0.6, meaning that within-group variability was low in the number of velocity peaks per meter, the spectral arc length, and the log dimensionless jerk, while it was very high in the speed metric. The intra-trial comparisons (first half vs. second half) further revealed that three of the parameters were significantly and strongly correlated between the two halves. Note that in one case the statistical power was lower than 0.80 (spectral arc length). High correlations between the two halves support a generalization beyond this specific ADL. By splitting, two different tasks were artificially created, and in case of high intra-trial correlations, the metric shows the capability to estimate the participant's general and not task-restricted movement smoothness. Lastly, the models of multiple linear regression revealed an impact of kinematic parameters on three of the parameters. All of the models were strong (r > 0.5), although the sample size was small with 16 participants. The small sample size could have led to missing a possible dependence of the speed metric on kinematic parameters.
Each of the four parameters was strongly connected with at least one other smoothness parameter (Table 5), leading to the assumption that the used parameters are basically measuring the same phenomenon. Table 6 provides an overview of the outcomes of the parameter analysis.
Of the four parameters, none proved to be fully suited for a general quantification of smoothness in the tested ADL, although the log dimensionless jerk did reveal good characteristics except for its very strong association with trial duration. The number of velocity peaks per meter was able to detect the group differences and showed high sensitivity and low within-group variance, but the correlation between the first and the second half of the trial was non-significant (although a trend was observed, p = 0.07) and the model of multiple linear regression revealed a strong dependence on the traveled path length and mean peak velocity. The impact of mean peak velocity on smoothness can be explained by the fact that smooth movements promote faster movements and by the change in the signal-to-noise ratio (17). The dependence on path length is surprisingly not an artifact of the calculation method of the parameter. The impact was negative, which means that the smoothness decreases with longer trajectories. Smaller movements, like repetitive approaches to the mug or hesitating and retrieving the hand, could explain this: They produce peaks with small movements and at the same time add (unnecessary) path length. This would make the parameter better suited for the comparison of groups with similar path lengths and a comparable number of actions. This is, for instance, given in stroke patients and age-matched controls, who revealed comparable path lengths and action steps in the ADL of tea making (6). The spectral arc length was also able to detect the group differences, proved to be highly sensitive, and revealed a low within-group variance, but showed a correlation in the intra-trial comparison with a power lower than 0.80 and revealed a strong dependence on kinematic parameters. Trial duration negatively impacted the model, resulting in less smooth movements in longer trials.
The basis could be that trial duration in both groups was strongly associated with the general motor capability; this has at least been reported in stroke patients (10). The model was also impacted by the mean peak velocity, as in the model for the number of velocity peaks per meter (with comparable βs of 0.60 and 0.66). This would make the parameter well suited for comparisons with equal trial durations and movement speeds (also see power of intra-trial correlation). Unimanual ADL performance with the dominant and the non-dominant hand would be a possibility, where young as well as elderly revealed no difference in mean peak velocity (21). The speed metric was able to detect the group differences, proved highly sensitive, and had a strong correlation in the intra-trial comparison, but revealed a very high within-group variance. It further proved to be independent of other commonly used kinematic parameters, although this could be due to the small sample size. It is advisable to use this parameter with caution, due to its high within-group variance. It appeared not to be suited for an analysis of the ADL. The log dimensionless jerk was able to detect the group differences, proved highly sensitive, and was strongly correlated in the intra-trial comparison. It further revealed a low within-group variance, but a strong dependence on the trial duration. The impact of trial duration was negative, as for the spectral arc length. The log dimensionless jerk appeared to be well suited for the analysis of ADL, as long as the variability in the trial duration is controlled for. Although trial duration and movement capacity can be associated in patient groups (10), there is evidence that trial durations in ADL are based on other factors like movement strategy or cognitive factors in healthy adults (21). Controlling for trial duration in the spectral arc length as well as the log dimensionless jerk therefore appears essential.

CONCLUSIONS
The analysis of the four smoothness parameters revealed that there is still the necessity for a novel, well-suited parameter for the analysis of movement smoothness in ADL. Still, three of the four parameters proved to deliver good estimates when controlling for certain aspects of an experiment. Since the sample size was relatively small, our findings have to be interpreted with care, although the statistical power was mostly high. In addition, we tested only one ADL task, and the question is how far our findings can be extended to ADL in general. However, certain patterns and combinations of actions, for instance phases of inactivity, grasping, or transporting, repeatedly appear in ADL. This is particularly true in an ADL that demands manual interactions with several objects like the one analyzed here. We therefore believe that our findings can be generalized to a broad class of ADL and draw the following recommendations for the use of smoothness parameters in ADL: The number of velocity peaks per meter needs comparisons with equal lengths of the trajectories and comparable quantities of performed actions in a task. The spectral arc length needs tasks with comparable trial durations and movement speeds. The log dimensionless jerk, having a very strong intra-trial correlation, very high sensitivity, low within-group variability, but a very strong dependence on trial duration, should deliver good estimates on a wide range of ADL tasks, as long as the trial durations are controlled for. Given these prerequisites, these three parameters (number of velocity peaks per meter, spectral arc length, and log dimensionless jerk) can deliver appropriate estimates of movement smoothness in complex motor tasks like ADL. Future research should examine the different sub-types of those measures and different ADL tasks to see if the behavior of the parameter class and the ADL (tea making) are generalizable.
The search for a universal smoothness parameter for ADL should have a high priority in neurorehabilitation research in order to assess motor capacity and supervise the rehabilitation process of patients suffering from neurological diseases like stroke, Parkinson's disease, or cerebral palsy. So far, comparisons of movement smoothness with control groups or in patient groups with high (kinematic) variability, and during the supervision of the rehabilitation process with changes in trial durations, mean peak velocities, or path lengths, are limited. A promising approach could be using wavelet transformation in order to estimate the complexity of the signal (analogous to the spectral arc length metric).

LIMITATIONS
There are clear limitations in this study. Although an ADL was examined, ecological validity was limited by the unimanual execution of the task. However, previous research has shown that the transfer from bimanual to unimanual performance has no interaction with age (21), and the unimanual execution controlled for variability in hand use in bimanual conditions. Considering the relatively small sample size, appropriate statistical power was still given. Further, the examination of a non-standardized, complex task with parameters that are fitted to quantify movement smoothness in discrete, single movement or cyclic tasks without further data processing like segmentation is a questionable approach considering validity and reliability. However, the results of this study strongly suggest that it is indeed possible to quantify movement smoothness in ADL by the existing parameter classes, although adaptations are still necessary. Another limitation is the small sample size. Therefore, some associations could have been underestimated (e.g., the dependence of the speed metric on other kinematic parameters). Still, the analyses were able to reveal certain associations and the power estimates were acceptably high. Last, an assessment of test-retest reliability was only partially possible, since in such a complex task fast adaptations through learning can be expected, as well as moderations of learning rates by age.