Parameters and Measures in Assessment of Motor Learning in Neurorehabilitation; A Systematic Review of the Literature

Upper limb function, essential for daily life, is often impaired in individuals after stroke and cerebral palsy (CP). For an improved upper limb function, learning should occur, and therefore training with motor learning principles is included in many rehabilitation interventions. Despite accurate measurement being an important aspect for examination and optimization of treatment outcomes, there are no standard algorithms for outcome measures selection. Moreover, the ability of the chosen measures to identify learning is not well established. We aimed to review and categorize the parameters and measures utilized for identification of motor learning in stroke and CP populations. PubMed, Pedro, and Web of Science databases were systematically searched between January 2000 and March 2016 for studies assessing a form of motor learning following upper extremity training using motor control measures. Thirty-two studies in persons after stroke and 10 studies in CP of any methodological quality were included. Identified outcome measures were sorted into two categories, “parameters,” defined as identifying a form of learning, and “measures,” as tools measuring the parameter. Review's results were organized as a narrative synthesis focusing on the outcome measures. The included studies were heterogeneous in their study designs, parameters and measures. Parameters included adaptation (n = 6), anticipatory control (n = 2), after-effects (n = 3), de-adaptation (n = 4), performance (n = 24), acquisition (n = 8), retention (n = 8), and transfer (n = 14). Despite motor learning theory's emphasis on long-lasting changes and generalization, the majority of studies did not assess the retention and transfer parameters. Underlying measures included kinematic analyses in terms of speed, geometry or both (n = 39), dynamic metrics, measures of accuracy, consistency, and coordination. There is no exclusivity of measures to a specific parameter. 
Many factors affect task performance and the ability to measure it, necessitating the use of several metrics to examine different features of movement and learning. The applicability of motor learning measures to clinical settings could benefit from a treatment-focused approach, which is currently lacking. The complexity of motor learning results in a variety of metrics being used to assess its occurrence, making it difficult to synthesize findings across studies. Further research is desirable for development of an outcome measures selection algorithm, while considering the quality of such measurements.


INTRODUCTION
Neurological disorders affect a significant number of people worldwide. Two common disorders are stroke and cerebral palsy (CP). Stroke occurs due to interruption of the blood supply to the brain or as a result of ischemia or bleeding (WHO, 2006), and has a prevalence of ∼795,000 new or recurrent events in the United States each year (Lloyd-Jones et al., 2010). CP is the most common neurodevelopmental motor disorder in children, which begins in early childhood and persists throughout the lifespan (Bax et al., 2005), with a prevalence of 2-2.5 per 1,000 live births (Himmelmann, 2013). A common problem experienced by these populations is impaired upper extremity function. About 70% of stroke survivors lose motor skills of the paretic arm and hand (Lloyd-Jones et al., 2010). Even mild impairment results in significant daily function limitations and has a negative impact on the quality of life (Lai et al., 2002; Nichols-Larsen et al., 2005). Thirty-five percent of children with CP are diagnosed with hemiplegia, with their upper limb usually more affected than the lower extremity (Wiklund and Uvebrant, 1991). Regaining optimal upper extremity function is essential for participation in daily life and, for this reason, is one of the goals of neurorehabilitation.
During rehabilitation, "a process of relearning how to move to carry out their needs successfully" (Carr and Shepherd, 1987), patients improve their activity either by developing compensatory strategies (i.e., generation of the motor task with alternative movement patterns) or by reacquiring the pre-lesion patterns, defined as recovery (Levin et al., 2009). Despite the difference in the underlying neuronal mechanisms of compensation and recovery (Tanaka et al., 2011), they both require learning (Kitago and Krakauer, 2013). Therefore, many therapeutic interventions apply motor learning principles, assuming these principles can enhance motor recovery and that permanent improvements in motor function can be achieved by training (Kitago and Krakauer, 2013). Motor learning was defined as "a set of internal processes associated with practice or experience, leading to a relatively permanent change in the capability for movement" (Schmidt, 1988). As these internal neural and cognitive processes can be neither directly observed nor measured at the behavioral level, motor learning can be estimated only by observing performance (Cahill et al., 2001; Schmidt and Wrisberg, 2008).
The questions of whether neurological patients are capable of learning and whether they have specific motor learning deficits are difficult to answer definitively because of the variety of motor tasks that rely on different learning processes, all associated with various functional and anatomical brain structures (Krakauer and Mazzoni, 2011; Kitago and Krakauer, 2013). Moreover, the heterogeneity of patients, some having additional impairments masking their learning abilities, can make it difficult to demonstrate learning abnormalities (Krakauer, 2006; Kitago and Krakauer, 2013).
Abbreviations: AMI, active movement index; ARAT, action research arm test; BAT, bilateral arm training; BBT, box and block test; CP, cerebral palsy; CIMT, constraint-induced movement therapy; CT, continuous tracking; CV, coefficient of variation; FMA, Fugl-Meyer motor assessment; FIM, functional independence measure; ICF, international classification of functioning, disability and health; MD, mean distance; MGA, maximum grip aperture; MS, movement smoothness; MT, movement time; MVT, maximal voluntary torque; NMT, normalized movement time; NMU, normalized number of movement units; nPL, normalized path length; PV, peak velocity; RMSE, root mean square error; RT, reaction time; SRTT, serial reaction time task; %TPV, percentage time to peak velocity; WMFT, Wolf motor function test.
Several systematic reviews looked into measurable parameters in the rehabilitation process. Huang and Krakauer (2009) reviewed studies that explored the change in rehabilitation outcome as a function of different aspects of the intervention, such as amount, type, timing, and intensity of practice, and their effects on post stroke rehabilitation. Because motor learning is compiled from various processes (Krakauer and Mazzoni, 2011), diverse parameters are used to assess different learning types and aspects. For evaluation of an intervention's efficacy it is important to choose the appropriate measure (Huang and Krakauer, 2009). Huang and Krakauer (2009) distinguished between the adaptation and motor skill learning processes, placing the learning at a higher level of the motor control hierarchy. Also, various clinical outcome measures of functional performance [e.g., the Action Research Arm Test (ARAT; Carroll, 1965;Lyle, 1981), the Wolf Motor Function Test (WMFT; Wolf et al., 1989), Functional Independence Measure (FIM; Hamilton et al., 1994) etc.] were compared to a measurement of impairment [the Fugl-Meyer Motor Assessment (FMA; Fugl-Meyer et al., 1975;Gladstone et al., 2002)] (Huang and Krakauer, 2009). Sivan et al. (2011) identified outcome measures utilized in robot-assisted exercise therapy in stroke patients. Measures were clustered based on which domain within the International Classification of Functioning, Disability and Health (ICF) they evaluate. From the ICF framework, the patient characteristics and the reliability and validity of measures, an algorithm for selection of outcome measures was suggested. Kinematic measures, FIM, FMA, and the WMFT were identified as suitable for use in robot trainings, each for a different ICF domain, severity of impairment and time since stroke (Sivan et al., 2011). Only stroke patients undergoing a robot therapy intervention were included. 
Neither the timing of performance evaluation during training nor the relation of these measures to motor learning was a focus of that review. On the other hand, Kantak and Winstein (2012) highlighted that performance during acquisition might be transient and influenced by independent factors. They suggested that implementation of retention or transfer tests, which measure lasting improvements in the motor execution of a skill, is essential in order to infer learning (Kantak and Winstein, 2012). Overall, these reviews focused little or not at all on the application of common measurements to understanding the motor learning process.
Currently, there are no standard procedures regarding the choice of outcome measures (Huang and Krakauer, 2009). Inaccurate deduction of learning, caused by inadequate metric selection, might for example suggest a failure of training, when in fact inaccurate choice of measure is at fault. Moreover, reliable assessment and understanding of patients' motor learning process may reveal the impaired component within the process, and therefore facilitate the development and selection of an adequate and specific treatment to enhance recovery (Kitago and Krakauer, 2013). The assessments and measures of motor learning should demonstrate sensitivity to relevant change and remain invariable when there is no change in function.
Our main goal was to review the different parameters and underlying measures used in available studies to assess and measure the occurrence of motor learning following an intervention. We aimed to categorize the different parameters depending on the type and process of learning they evaluate, and to present the characteristics of the different parameters. Understanding the timing, purpose, advantages, and disadvantages of parameters and measures will help both researchers and clinicians to better design studies, evaluate patients and treatment efficiency, compare interventions, and better understand the underlying mechanisms of recovery. To the best of our knowledge, no such comprehensive collection of information has previously been compiled.

Search Strategy
PubMed, Pedro, and Web of Science databases were searched for studies published between January 2000 and March 2016. For PubMed and Web of Science databases the following key words were searched: (1) learning OR motor learning OR motor control AND (2) stroke OR cerebral palsy AND (3) parameter OR measure OR assessment OR adaptation OR acquisition OR retention OR transfer AND (4) upper limb OR upper extremity OR arm OR hand AND (5) rehabilitation OR treatment OR training. Due to the considerable number of key words, the search strategy was modified for the Pedro database to match the database requirements. The following searches were performed: (1) cerebral palsy AND learning AND hand; (2) cerebral palsy AND motor control AND hand; (3) stroke AND learning AND hand; (4) stroke AND motor control AND hand. The key word "hand" was chosen to represent the upper limb studies, because an overview of the retrieved records showed that it captured most studies performed on the upper extremity.
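The five key word groups listed above combine as terms joined by OR within a group and groups joined by AND. The following is a minimal sketch of that structure only; the helper names and the quoting of multi-word terms are illustrative assumptions, and the resulting string is not claimed to be the exact query submitted to PubMed or Web of Science.

```python
# Key word groups copied from the search strategy described in the text.
# Helper names and phrase quoting are illustrative assumptions.
CONCEPT_GROUPS = [
    ["learning", "motor learning", "motor control"],
    ["stroke", "cerebral palsy"],
    ["parameter", "measure", "assessment", "adaptation",
     "acquisition", "retention", "transfer"],
    ["upper limb", "upper extremity", "arm", "hand"],
    ["rehabilitation", "treatment", "training"],
]


def quote(term):
    """Wrap multi-word terms in double quotes so they read as phrases."""
    return f'"{term}"' if " " in term else term


def build_query(groups):
    """OR the terms inside each group, then AND the groups together."""
    blocks = ["(" + " OR ".join(quote(t) for t in g) + ")" for g in groups]
    return " AND ".join(blocks)


query = build_query(CONCEPT_GROUPS)
print(query)
```

Keeping the groups as data makes it easy to derive the reduced searches used for the Pedro database by selecting one term per group.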

Study Selection
To be included in the review, a study had to: (1) involve either stroke survivors or persons with CP, (2) assess process or type of learning after an intervention with motor learning principles, (3) be applicable for rehabilitation, (4) use outcome measures relevant for motor control, (5) focus on the upper extremity, (6) be written in or translated to English. Studies were excluded from the review if the study: (1) did not focus on human subjects, (2) used only clinical outcome measures, (3) included an intervention of virtual reality, transcranial magnetic stimulation or of electric muscle stimulation.
Titles and abstracts of retrieved records were merged into one database on the reference management software, and duplicates were removed.
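As a rough illustration of the merging, duplicate removal, and criteria-based screening described in this section, the sketch below uses a hypothetical `Record` structure. The field names are assumptions made for the example, only a subset of the stated criteria is encoded, and the actual screening was performed manually by two researchers.

```python
from dataclasses import dataclass


@dataclass
class Record:
    # Hypothetical fields; real reference-manager exports differ.
    title: str
    population: str               # e.g. "stroke", "CP", "other"
    upper_extremity: bool
    human_subjects: bool
    only_clinical_measures: bool
    excluded_intervention: bool   # virtual reality, TMS, or electric muscle stimulation


def deduplicate(records):
    """Keep the first record for each case- and whitespace-normalized title."""
    seen, unique = set(), []
    for rec in records:
        key = " ".join(rec.title.lower().split())
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


def is_eligible(rec):
    """Apply a subset of the stated inclusion and exclusion criteria."""
    included = rec.population in {"stroke", "CP"} and rec.upper_extremity
    excluded = (not rec.human_subjects
                or rec.only_clinical_measures
                or rec.excluded_intervention)
    return included and not excluded


records = [
    Record("Reaching after stroke", "stroke", True, True, False, False),
    Record("reaching  after Stroke", "stroke", True, True, False, False),
    Record("VR training in CP", "CP", True, True, False, True),
]
eligible = [r for r in deduplicate(records) if is_eligible(r)]
print(len(eligible))
```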

Quality Assessment
Titles and potential abstracts were screened independently by two researchers (First and last authors). Titles that contained any of the exclusion criteria were excluded based on title only. Relevant full text articles and full texts of abstracts that were inconclusive regarding their relevancy were assessed, and studies that did not correspond with the inclusion criteria were excluded. Fitting articles were also extracted from reviews relevant to the topic and from full text article references. Data regarding studies' designs was extracted. All study designs of any methodological quality were included. Due to our objective to perform a comprehensive data collection of the various parameters and measures, we did not factor the strength of experimental evidence provided by the studies. In addition to studies that examined the efficacy of an intervention, we included studies that explored the feasibility of tools, hypotheses regarding mechanisms of learning and recovery and the implementation of mathematical models. In such studies, assessment of the methodological quality would yield no benefit, due to their different objectives. A narrative synthesis of the literature was performed.

Data Collection and Synthesis
First, all included articles were reviewed. Data was extracted and compiled from each study on: (1) the study's characteristics and methodology, (2) the grouping method, (3) type of intervention, (4) study protocol, and (5) outcome measures. We further gathered information regarding the characteristics of the outcome measures utilized to assess learning. Parameters utilized to infer learning were clustered according to the motor learning principle or form of learning that they assessed. Underlying measures for each parameter were defined as the measures used to quantify the parameter.

RESULTS
The database search retrieved 1,029 records. After removal of duplicates and eligibility assessment of the remaining articles, 32 studies were included in the review. Ten additional studies were identified by scanning the reference lists of relevant full text articles and of reviews relevant to the topic. In total, 42 studies were included in the review (Figure 1). Table 1 presents the study characteristics and methodological information of the 42 reviewed studies. Thirty-two of the studies addressed stroke patients, and 10 examined children or adolescents with CP. Our decision to focus on stroke and CP patients was due to their relatively high prevalence (Bax et al., 2005; Lloyd-Jones et al., 2010; Himmelmann, 2013). Also, they are a topic of considerable research because of their commonly affected motor skills of the upper extremity (Lai et al., 2002; Nichols-Larsen et al., 2005; Lloyd-Jones et al., 2010). Twenty-one studies trained and assessed learning of the paretic hand, seven studies of the less affected hand, and in 14 of the studies both hands were either trained, evaluated, or both. In 16 of the 35 studies that assessed motor learning by examining only the affected or both hands, the affected arm was reported to be supported against gravity; in the remaining studies the arm was not supported. Additionally, 17 studies reported minimizing compensatory strategies of the upper limb or trunk (by a belt, harness, etc.). Only two studies (Cirstea and Levin, 2007;Massie (Table 1).
The parameters utilized to infer learning varied across the studies ( Table 2) and included: adaptation, anticipatory control, after-effects, de-adaptation, performance, acquisition, reacquisition, retention, and transfer. Underlying variables used for measurement of the parameters included kinematic metrics in terms of timing, position, velocity, and acceleration, dynamic metrics of force generation, measures of accuracy, consistency, dexterity, and coordination. In all apart from three of the reviewed studies (Patton et al., 2006;Hemayattalab and Rostami, 2010;Sterpi et al., 2012), additional clinical measures were mentioned. In 17 of them clinical tests were performed only before the intervention, or before the beginning of the study for assessment of subjects' eligibility for participation (i.e., based on inclusion criteria). In 13 studies clinical measurements were taken before and after treatment, and in nine studies additional correlations were analyzed, or relationships examined, between the clinical and motor learning measures ( Table 2). Apart from three studies (Hemayattalab and Rostami, 2010;Geerdink et al., 2013;Hemayattalab et al., 2013), all of the reviewed studies performed kinematic analysis of some sort for measurement of the parameter. In 18 of them both the velocity and accuracy components were evaluated, whereas in nine of them only the accuracy component of the movement was tested; 12 studies examined only the velocity component ( Table 2). Six of the reviewed studies addressed measures of force generation (Raghavan et al., 2006;Chang et al., 2007;Colombo et al., 2008;Mawase et al., 2011;Bourke et al., 2015;Gilliaux et al., 2015).
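The study counts reported in these paragraphs can be cross-checked with simple arithmetic. The sketch below assumes that the listed subgroups are mutually exclusive, which the text implies but does not state explicitly; the dictionary labels are shorthand introduced for this example.

```python
TOTAL = 42  # studies included in the review

# Groupings that should account for all 42 studies.
tallies = {
    "population (stroke, CP)": [32, 10],
    "source (database search, reference lists)": [32, 10],
    "trained hand (paretic, less affected, both)": [21, 7, 14],
}

# Groupings reported for the 39 studies left after three exceptions.
subset_tallies = {
    "clinical measures (pre only, pre and post, correlated)": [17, 13, 9],
    "kinematics (velocity and accuracy, accuracy only, velocity only)": [18, 9, 12],
}

for label, counts in tallies.items():
    assert sum(counts) == TOTAL, label
for label, counts in subset_tallies.items():
    assert sum(counts) == TOTAL - 3, label

print("all tallies consistent")
```

The per-parameter counts in the abstract are not expected to sum to 42, because a single study can assess several parameters.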
The synthesized data provides an outline of the various parameters utilized in the studies for assessment of motor learning through an overview of each study. Parameters are emphasized in bold throughout the text. Measures used to assess the parameter are stated for each study, and underlined throughout the text.
[Table 1: study characteristics and methodological information of the 42 reviewed studies. Data are presented as mean ± SD unless specified otherwise.]

Assessment of Adaptation and Anticipatory Control
Six of the reviewed studies examined the adaptation process (Dancause et al., 2002;Takahashi and Reinkensmeyer, 2003;Patton et al., 2006;Scheidt and Stoeckmann, 2007;Masia et al., 2011;Bourke et al., 2015). Two studies assessed the predictive control (Raghavan et al., 2006;Mawase et al., 2011). Patton et al. (2006) and Takahashi and Reinkensmeyer (2003) evaluated stroke subjects performing robotic reaching movement training. A baseline phase without perturbations was followed by a learning phase with constant exposure to forces. Then, after-effects were measured by a catch trial phase that included intermittent removal of the force field. Finally, when forces were completely removed (i.e., washout period), de-adaptation was assessed (i.e., tendency of the after-effects to disappear). Patton et al. (2006) measured the initial direction error, testing the effectiveness of the shift of the early part of the movement, and the average shift in initial direction from the un-perturbed baseline trials to the after effects catch trials to establish the adaptation capacity. Takahashi and Reinkensmeyer (2003) also assessed the adaptation capacity of stroke subjects by the spatial reaching error from baseline. But the late correction (the distance from the maximal and final deviation from the reference path) was also measured, to assess the whole movement following applied and removed force fields. Performance improvement was assessed as the reduction in reaching error from baseline to final phase (Takahashi and Reinkensmeyer, 2003;Patton et al., 2006). Scheidt and Stoeckmann (2007) examined the adaptation to velocity dependent perturbations with pseudo-random magnitude, using measurement of the initial direction error, and the end point accuracy. 
The compensatory response acquired during training was measured by the hand movement onset speed, peak speed point, penultimate position point (i.e., the moment the speed dropped below 20% of its maximal value) and by the final position point. Dancause et al. (2002) explored error correction strategies following an unexpected spring-like load presented in 30% of elbow flexion movement trials. The strategies were identified by comparison between the angular positions and torques of the initial movement before the load was applied and the correction following the load. Bourke et al. (2015) assessed the corrective responses after unexpected external perturbations to the elbow or shoulder, while instructed to maintain their hand at a spatial goal. The response was quantified by the posture speed before the perturbation, deceleration time and maximal displacement after perturbation, return time to baseline position, end point error and by the joint velocity offset that represents the multi-joint coordination. Masia et al. (2011) explored the ability of children with CP to adapt to randomized center-out movements while performing reaching movements toward peripheral targets. For each reaching movement lateral deviation from straight line, acceleration peak, and peak and average speeds were measured at familiarization, force field adaptation, and washout phases. Directional analysis was performed for all measures to assess the anisotropy index (i.e., the roundness or flatness of the ellipse). The learning index was established as the degree of adaptation measured by the lateral deviations in force field and catch trials. Mawase et al. (2011) explored the predictive control of CP subjects using a grasp and lift task of a virtual object. A sequence of increasing weights appeared randomly within trials of random weights. 
The planning of grasp precision was assessed by measurement of the grip force at the beginning of the lifting task and by the vertical trajectory of the object estimating the motor command. The precision of grasp execution was assessed by measurement of the temporal coordination. Raghavan et al. (2006) also explored the planning of precision grasp and precision of grasp execution among stroke patients. The peak grip and load forces and the timing and efficacy of grip-load force coordination were measured, respectively.

Assessment of Performance and Skill Acquisition
Three studies assessed the performance of stroke patients performing a reaching movement task following an intervention (Wu et al., 2007; Caimmi et al., 2008; Durham et al., 2014). Wu et al. (2007) measured the RT, PV, movement time (MT), and the total displacement. Caimmi et al. (2008) measured the movement duration, end of movement angle, mean angular and target-approaching velocities, consistency of the target approaching, and the movement smoothness (MS). The study objectives were to examine whether kinematic analysis is a sensitive and reliable measure, and whether it can identify the mechanisms leading to improvement and quantify the functional improvement. Durham et al. (2014) measured the movement duration, PV, percentage time to peak velocity (%TPV), to peak deceleration, and to peak aperture, peak aperture size, peak elbow extension, and the MS. Wu et al. (2011) assessed the performance of stroke patients during tasks of pressing a desk bell and pulling a drawer by measurement of the NMT, NMU, PV, percentage of movement time where peak velocity occurred, and the MS. Christopher and Johnson (2014) examined the performance of stroke patients performing a drinking task by measuring time to completion and the MS. Aluru et al. (2014) evaluated the performance of stroke subjects by measuring the movement speed, electromyographic activity of wrist extension, activation of wrist extensor and flexor, and co-activation of antagonist muscles during a wrist flexion-extension task.
While Wu et al. (2007), Aluru et al. (2014), Caimmi et al. (2008), Christopher and Johnson (2014), and Durham et al. (2014) studied only the performance, which is an individual execution of a skill (Krakauer and Mazzoni, 2011; Kitago and Krakauer, 2013), four studies (Thaut et al., 2002; Colombo et al., 2008; Massie et al., 2009; Geerdink et al., 2013) explored the skill acquisition, which depends on extended practice of the performance (Krakauer and Mazzoni, 2011; Kitago and Krakauer, 2013). Colombo et al. (2008) and Thaut et al. (2002) assessed the performance change of stroke patients' reaching movements. Colombo et al. (2008), with the purpose of better understanding the learning mechanisms of stroke patients, measured the efficacy [the active movement index (AMI)], accuracy [mean distance (MD) from theoretical path], efficiency [normalized path length (nPL)], MS, MT, and the force control (error in orientation of force generation) of reaching movements. Massie et al. (2009) measured the trajectory variability, MT, and reach velocity of reaching movements, and the difference in compensatory strategies during treatment. Thaut et al. (2002) measured the movement durations, arm kinematics, variability of timing, reaching trajectories, and rhythmic synchronization. Geerdink et al. (2013) assessed the learning curve of manual dexterity of children with CP, to establish the time during the intervention when maximal effects were reached, using the box and block test (BBT), which counts the number of blocks that are transferred with a single hand from one compartment to another within 60 s.
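Several of the kinematic measures named in this section (MT, PV, %TPV, nPL, and MD) can be derived from a sampled hand path. The sketch below is one plausible way to compute them, under stated assumptions: the reference path is taken to be the straight line from the first to the last sample, speed is estimated from finite differences, and the function name `path_metrics` is hypothetical. The reviewed studies may define these measures differently.

```python
import math


def path_metrics(xy, t):
    """Compute MT, PV, %TPV, nPL and MD for one 2-D hand path.

    xy: list of (x, y) positions; t: list of matching time stamps in seconds.
    Assumes the reference path is the straight line from first to last sample.
    """
    step_len = [math.dist(p, q) for p, q in zip(xy, xy[1:])]
    dt = [b - a for a, b in zip(t, t[1:])]
    speed = [d / h for d, h in zip(step_len, dt)]

    mt = t[-1] - t[0]                       # movement time
    pv = max(speed)                         # peak velocity
    i_peak = speed.index(pv)
    t_peak = (t[i_peak] + t[i_peak + 1]) / 2  # midpoint of the fastest step
    pct_tpv = 100 * (t_peak - t[0]) / mt    # percentage time to peak velocity

    (x0, y0), (x1, y1) = xy[0], xy[-1]
    cx, cy = x1 - x0, y1 - y0
    chord_len = math.hypot(cx, cy)
    npl = sum(step_len) / chord_len         # travelled length / straight length

    # Perpendicular distance of each sample from the straight reference line.
    perp = [abs((x - x0) * cy - (y - y0) * cx) / chord_len for x, y in xy]
    md = sum(perp) / len(perp)              # mean distance from reference path

    return {"MT": mt, "PV": pv, "%TPV": pct_tpv, "nPL": npl, "MD": md}


metrics = path_metrics([(0, 0), (1, 1), (2, 0)], [0.0, 1.0, 2.0])
print(metrics)
```

A perfectly straight movement gives nPL = 1 and MD = 0, so larger values of either indicate a less direct path.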

Assessment of Transfer and Retention
Retention and transfer tests differ in the information they provide about the obtained learning. The transfer test assesses a skill that was not practiced, whereas the retention test examines the trained task after a time interval (Schmidt and Lee, 2004; Kantak and Winstein, 2012).
Six studies explored whether trained movement generalized to untrained movement using transfer tests (Dipietro et al., 2007, 2009, 2012; Senesac et al., 2010; Krebs et al., 2012; Kitago et al., 2015). Dipietro et al. (2007, 2009, 2012) and Krebs et al. (2012) examined the generalization of trained reaching movements to untrained circle drawing movements by measurement of the axes ratio metric (i.e., indication for shoulder-elbow coordination). Additionally, the studies measured speed profiles, MS, and measures of submovements [i.e., discrete ballistic movements that are a part of a more complex movement (Rohrer et al., 2004)] (not all measures were used in all studies, see Table 2). Dipietro et al. (2007) also examined changes in flexor-extensor abnormal synergies by measuring the joint angles correlation metric (independence of elbow and shoulder movements), orientation (best fitting line of hand path), and the major and minor axes of the drawn ellipse. Senesac et al. (2010) assessed the spatial generalization of improved proximal inter-joint coordination to two untrained reaching tasks, one spatially similar and the other different. Reach end-point kinematics (hand path curvature, time to peak velocity, PV, MS, and acceleration) were measured. Kitago et al. (2015) measured the reaching trajectories and their quality (i.e., trajectory analysis), MT, directional error, MS, and end-point accuracy to assess the transfer of trained goal-directed reaching movements to untrained out-and-back straight movements. Two studies suggested that motor recovery after stroke and motor habilitation of children with CP resembles a motor learning model more than an adaptation model, as the trained movements generalized to the untrained movements Krebs et al., 2012).
While the previously discussed studies implemented transfer tests assessing the generalization from trained to untrained movements (Dipietro et al., 2007, 2009; Senesac et al., 2010; Krebs et al., 2012; Kitago et al., 2015), Sterpi et al. (2012) implemented a transfer test that assessed, in stroke subjects, the generalization of movements trained in a certain workspace (reaching movements to form a path of a square) to a different workspace (within and outside the square). The AMI, MT, MD (error of movement accuracy), nPL (error of movement efficiency), and the MS were measured. Gilliaux et al. (2015) examined the performance of children with CP by measuring the amplitude and coefficient of variation (CV) of straightness for a reaching-as-far-as-possible task, the speed index metric for a reaching-toward-a-target task, and the CV of jerk and speed metrics for square-drawing and circle-drawing tasks. In addition, BBT and a wide range of clinical and functional measures of activity and participation were performed (see Table 2). As some of the tasks included movements that were not trained, the described measures were also included as part of the transfer parameter (Table 2). Schaefer et al. (2013) measured the performance during a feeding task by measuring the number of successful repetitions, defined as spooning and transferring at least one bean from one cup to another. Transfer was assessed by sorting (BBT, spatiotemporally similar), dressing (spatiotemporally different), and dual tasks. The dual task conditions assessed whether automaticity transferred across tasks, measured by the difference between the reported and correct number of times a letter was heard in a sequence of letters.
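The coefficient of variation (CV) mentioned above expresses trial-to-trial variability relative to the mean. A minimal sketch follows, assuming the sample standard deviation and a percentage scale; the reviewed studies do not state here which convention they used, and the function name is hypothetical.

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean, as a percentage."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return 100 * statistics.stdev(values) / abs(mean)


# Hypothetical per-trial values of some metric (e.g. peak speed).
cv = coefficient_of_variation([10, 12, 14])
print(round(cv, 2))
```

Because the CV is unitless, it allows variability to be compared across metrics with different scales, such as jerk and speed.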
Four studies examined the effect of different feedback frequencies on motor skill learning (Cirstea and Levin, 2007; Hemayattalab and Rostami, 2010; Hemayattalab et al., 2013; Burtner et al., 2014). Burtner et al. (2014) assessed the performance accuracy and consistency of an elbow extension-flexion reversal movement, which were assessed separately at acquisition on the first day, and at retention (without feedback) and reacquisition (with feedback) on the second day. Accuracy was measured using the root mean square error (RMSE) (i.e., the average difference between the goal movement trajectory and the participant's response), and consistency by the variability of the RMSE. Hemayattalab and Rostami (2010) and Hemayattalab et al. (2013) evaluated new motor skill learning of children with CP in darts and bean-bag throwing tasks. The accuracy of scores was measured by the proximity to a center-peripheral target at acquisition and at retention, 3 days after practice. Cirstea and Levin (2007) examined the performance and retention following 1 month of training of pointing movements to the contralateral workspace by measurement of the angular motions of the elbow and shoulder joints, the elbow-shoulder interjoint coordination, and the amount of anterior and rotational displacement of the trunk. A transfer test included pointing movements toward an ipsilateral target. Molier et al. (2011) examined the effect of position resistance feedback, provided when a deviation from a predefined path occurred, during training of three reaching tasks (moving hand; making a curve; lifting hand to shelf). The average use of the feedback was calculated, and the difficulty level was established by measurement of the reached height and diameter of the predefined path. In addition, elbow and shoulder joint excursions, positions, and coordination were measured during a task of circular arm movements, and the maximal voluntary torque (MVT) was measured in an isometric strength task.
The parameters that were utilized to assess learning were not directly specified in the study. Therefore, as the circular movements were not trained, we placed the measures under the transfer principle (see Table 2). The measured feedback frequency and difficulty level during training correspond with the performance change, which may also be regarded as the acquisition parameter.  aimed to determine the feasibility of implementing kinematic measurements (MT, PV, absolute initial directional error, path curvature, systematic error, number of submovements) of arm reaching and wrist pointing tasks, and of clinical measures, for understanding the motor recovery process within 2 weeks after CIMT (i.e., retention). Chang et al. (2007) assessed the performance and retention of stroke subjects after reaching training by measuring the PV, %TPV, MT, normalized jerk score, and limb muscle strength. Chen et al. (2014) assessed the performance and retention at 3 and 6 months by examining the speed and dexterity during 8 object manipulation tasks (Bruininks, 1978), by functional ability measures (see Table 2), and by kinematic analysis [RT, NMT, MS, PV, maximum grip aperture (MGA), and the percentage of movement where MGA occurs] during a reach-to-grasp task. Casadio and Sanguineti (2012) examined the performance change of stroke patients over practice of a robot-assisted arm extension task. Performance measures (i.e., speed, precision, and smoothness), retention rate (dependence of voluntary control on previous trials), learning rate (dependence of voluntary control of next trials on current trial), assistance rate, noise (voluntary control not accounted for by learning), and vision bias were measured. Retention was examined by estimating the correlation between the retention rate during performance and the percentage change in the FMA 3 months after the rehabilitation trial.
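The RMSE-based accuracy and consistency measures described in this section for Burtner et al. (2014) can be sketched as follows. This is an illustration under assumptions: the goal trajectory and each response are sampled at the same time points, consistency is taken as the sample standard deviation of per-trial RMSE, and the function names are hypothetical.

```python
import math
import statistics


def rmse(target, response):
    """Root mean square error between a goal trajectory and one response."""
    if len(target) != len(response):
        raise ValueError("target and response must have the same length")
    squared = [(r - g) ** 2 for g, r in zip(target, response)]
    return math.sqrt(sum(squared) / len(squared))


def accuracy_and_consistency(target, trials):
    """Mean RMSE across trials (accuracy) and its spread (consistency)."""
    errors = [rmse(target, trial) for trial in trials]
    return statistics.mean(errors), statistics.stdev(errors)


goal = [0.0, 1.0, 2.0]                       # hypothetical goal trajectory
trials = [[0.0, 1.0, 2.0], [1.0, 2.0, 3.0]]  # hypothetical responses
accuracy, consistency = accuracy_and_consistency(goal, trials)
print(accuracy, consistency)
```

Computing the two values separately for acquisition, retention, and reacquisition blocks would mirror the time points at which the study assessed them.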

Motor Sequence Learning
Five studies assessed motor sequence learning of stroke patients (Boyd and Winstein, 2001, 2004, 2006; Pohl et al., 2006; Orrell et al., 2007). Two of them included practice of the serial reaction time task (SRTT), which requires a movement to press one of four targets when cued. When the correct key is pressed, the next cue is delivered (Boyd and Winstein, 2001; Orrell et al., 2007). Boyd and Winstein (2001) assessed learning by the change in median reaction time over practice, whereas Orrell et al. (2007) assessed the median response time both during performance and at retention. Orrell et al. (2007) also implemented two transfer tests, one that changed the motor sequence and one that changed the required movement from index finger only to whole arm movement. Pohl et al. (2006) examined the performance change of stroke patients during practice of an implicit motor learning task with random and repeated sequences, using the mean response time and the CV of response time. To examine the performance following practice, subjects were requested to perform the sequence practiced in the repeated conditions once again. In one study (Boyd and Winstein, 2004) subjects practiced the continuous tracking (CT) task, which included tracking the vertical path of a target cursor. The middle third of each tracking trial was repeated, whereas the first and last thirds were random. Reduction in tracking errors, measured by the RMSE, and spatial-temporal accuracy were used to assess performance change over practice and retention (Boyd and Winstein, 2004). In the final study subjects practiced both the SRTT and the CT task (Boyd and Winstein, 2006). Learning was inferred from the median response time and the RMSE, respectively, at performance and retention.
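The three summary measures named in this section (median response time, coefficient of variation of response time, and RMSE of tracking) can be illustrated with a minimal Python sketch. It uses the standard textbook definitions rather than the exact computations of any reviewed study, and all data values, variable names, and function names are hypothetical.

```python
import math
import statistics


def rmse(target, response):
    """Root mean square error between two equal-length trajectories."""
    if len(target) != len(response):
        raise ValueError("trajectories must have the same number of samples")
    squared = [(t - r) ** 2 for t, r in zip(target, response)]
    return math.sqrt(sum(squared) / len(squared))


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (unitless)."""
    return statistics.stdev(values) / statistics.mean(values)


# Hypothetical response times (seconds) from an early and a late practice block.
early_block_rt = [0.82, 0.91, 0.78, 0.88, 0.95]
late_block_rt = [0.61, 0.64, 0.59, 0.66, 0.60]

# Hypothetical target and cursor positions sampled during one tracking trial.
target_path = [0.0, 1.0, 2.0, 1.0, 0.0]
cursor_path = [0.0, 0.8, 2.3, 1.1, -0.2]

# Performance change over practice: drop in median response time.
performance_change = statistics.median(early_block_rt) - statistics.median(late_block_rt)

print("median RT change (s):", round(performance_change, 3))
print("CV of late-block RT:", round(coefficient_of_variation(late_block_rt), 3))
print("tracking RMSE:", round(rmse(target_path, cursor_path), 3))
```

Computing the same quantities on a delayed block without further practice would correspond to a retention test, and computing them on a new sequence or effector would correspond to a transfer test.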

DISCUSSION
The purpose of this review was to identify the different parameters and the variety of measures utilized in the literature to assess motor learning, and to categorize them based on the learning type and process they examine, while taking into consideration the patients' features and the studies' methodology and intervention. Over the past 15 years, 42 studies were identified as directly assessing the learning process of persons following stroke or with CP following different interventions. The studies varied in the parameters and measures utilized to infer motor learning. A sensible selection of outcome measures is an important part of planning an intervention. The questions we raised concerned the studies' methods for selecting the relevant parameters and metrics, the differences between the measures, and the information each metric can provide.

Measures of Adaptation vs. Skill Learning
Adaptation was assessed in the reviewed studies using dynamic perturbations by induced force fields during reaching movements (Dancause et al., 2002; Takahashi and Reinkensmeyer, 2003; Patton et al., 2006; Scheidt and Stoeckmann, 2007; Masia et al., 2011). Despite the variability in methodological designs between the studies (e.g., amount of practice in terms of number of trials and repetitions, utilized measures, etc.), all induced a change in environment and assessed the resulting change in behavior. The manifestation of the adaptation parameter, in all studies, was evaluated by some measurement of the extent and/or speed with which the performance returned to the pre-perturbation level and of the reduction in sensory-prediction errors. Changes in the after-effects reflect the update of the internal model (Krakauer, 2006; Huang and Krakauer, 2009). Robots were described as suitable for recording movement data (e.g., position, velocity, and joint torques), which allows quantitative, reliable measurement of kinematics and dynamics during recovery (Huang and Krakauer, 2009).
The utilized measures and the methods of their implementation can affect the obtained results. For example, Patton et al. (2006) concluded that stroke survivors preserved the ability to adapt, whereas Takahashi and Reinkensmeyer (2003) found their adaptive capability to be reduced. Their different results may be explained by the fact that Patton et al. (2006) utilized a metric that measured only the early part of the movement, whereas Takahashi and Reinkensmeyer (2003) assessed the whole movement. Evaluation of the anticipatory control should include a time limit for the movements performed. The limit's purpose is to minimize online corrections and focus on deficits in the feedforward mechanism. For example, Raghavan et al. (2006) estimated the predictive control of children with CP by estimating the motor command during the first 70 ms of a grasp and lift task. Without the time limit, the prolonged time they needed to receive sensory feedback for grip force generation would not have been identified (Raghavan et al., 2006). In contrast, the metric used by Patton et al. (2006) cannot serve as a measure of feedforward control error, as it does not consider the time between the feedforward control and movement initiation (Patton et al., 2006).
It is interesting to note that, with few exceptions, only studies that explored the anticipatory control utilized measures of force generation, while other studies used only kinematic measures. The exceptions are four of the reviewed studies, which did include force components among their measurements (Chang et al., 2007; Colombo et al., 2010; Bourke et al., 2015; Gilliaux et al., 2015). Takahashi and Reinkensmeyer (2003), who did not measure movement dynamics, suspected that the rate of force development explains the impaired anticipatory control of the paretic arms. Measurement of the force generation may clarify whether the deficit in the anticipatory control is due to an inability to form internal models or to implement them (Takahashi and Reinkensmeyer, 2003). Scheidt and Stoeckmann (2007) found that stroke patients adapt similarly to healthy individuals; however, they may require more practice, as prior error had more influence on their subsequent movement. Dancause et al. (2002) added that severely affected individuals might require more practice, as they required more trials to diminish errors than mildly affected individuals. It can be inferred that the rate at which errors decrease across trials and the influence of the prior error on the next movement can be used as measures to evaluate the amount of practice an individual will require.
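As an illustration of the two quantities mentioned above (the rate at which errors decrease across trials and the influence of the prior error on the next movement), the following Python sketch fits a simple exponential decay to per-trial errors and regresses each error on the preceding one. The exponential model, the function names, and the simulated data are assumptions made for illustration only; they are not the models used in the reviewed studies.

```python
import math


def _least_squares(xs, ys):
    """Return (intercept, slope) of the ordinary least-squares line y = a + b*x."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two or more paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def fit_exponential_decay(errors):
    """Fit error[n] = e0 * exp(-rate * n) by linear regression on log(error).

    Assumes strictly positive error magnitudes. Returns (e0, rate).
    """
    if any(e <= 0 for e in errors):
        raise ValueError("errors must be strictly positive")
    trials = list(range(len(errors)))
    intercept, slope = _least_squares(trials, [math.log(e) for e in errors])
    return math.exp(intercept), -slope


def lag_one_dependence(errors):
    """Slope of the regression of error[n + 1] on error[n]."""
    _, slope = _least_squares(list(errors[:-1]), list(errors[1:]))
    return slope


# Simulated, noise-free errors (hypothetical units) decaying at a known rate.
simulated_errors = [10.0 * math.exp(-0.2 * n) for n in range(20)]
initial_error, decay_rate = fit_exponential_decay(simulated_errors)
print("estimated initial error:", round(initial_error, 3))
print("estimated decay rate per trial:", round(decay_rate, 3))
print("lag-one dependence:", round(lag_one_dependence(simulated_errors), 3))
```

Under this assumed model, a smaller decay rate or a stronger lag-one dependence would suggest that more practice trials are needed to reach a given error level.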
Adaptation may be an initial ingredient in a motor learning model (Bastian, 2008), and was implied to be a form of implicit learning that requires no awareness (Krebs et al., 2001). However, skill acquisition often requires conscious awareness and practice of the performance (Krakauer and Mazzoni, 2011; Kantak and Winstein, 2012). Performance curves indicating change throughout practice can be established for a variety of measures. Geerdink et al. (2013) used the performance curve of manual dexterity improvement to evaluate the point in time at which maximal effects are achieved, finding that age affects the speed of dexterity gain. It can be inferred from these results that the learning curve can be used to study and evaluate the maximal effect of various interventions, which might later help to individualize the timing of training so that maximal potential is reached.
None of the studies that examined the adaptation parameter assessed the persistence of the after-effects. Measures of performance can be implemented at different times throughout the training, but can also be used at a time interval after the last practice to assess retention. As performance might be affected by transient factors, such as feedback, attention, and fatigue, performance measures may be limited to the acquisition phase. Therefore, a retention test is preferable for inferring learning, as it assesses long-lasting changes indicating constancy of the level of performance achieved at acquisition and the strength of the motor memory (Schmidt and Lee, 2004; Kantak and Winstein, 2012).
Another important aspect of learning is the extent to which what was learned during practice generalizes outside of practice settings. Generalization is assessed with transfer tests and reflects the flexibility of the motor memory (Schmidt and Lee, 2004; Kantak and Winstein, 2012). According to the motor learning theory of Fitts and Posner (1967), which presents a three-stage model of learning (cognitive, associative, and autonomous), achieving transfer (automatization) and retention of a skill is necessary to indicate that learning has occurred (Cano-de-la-Cuerda et al., 2015). It was suggested that when the trained and untrained tasks share similar neural demands, generalization to the untrained task is increased (Sainburg and Wang, 2002; Schmidt and Lee, 2004; Shadmehr, 2004). This implies that the transfer test should encompass and examine motor control requirements similar to those of the training. The transfer parameter is essential, since the aim of rehabilitation is to grant the patient optimal function and independence, which is only possible when improvements generalize to real-life situations.
Some prominent theories of motor learning require a comprehensive analysis of movement that includes many of the measurements used in the reviewed studies (Cano-de-la-Cuerda et al., 2015). However, most of these studies did not meet the requirements for inferring learning according to these theories. For example, parameters and measures implemented during tasks with restricted degrees of freedom contradict Bernstein's model of motor learning (Bernstein, 1967), which requires the learner to increase the degrees of freedom as an indication of learning. Similarly, many of the studies evaluated the efficiency and consistency of the movement without testing transfer, and vice versa. Both are essential to demonstrate learning according to Gentile's theory of motor learning (Gentile, 1972). Therefore, studies that explicitly aim to demonstrate learning must adhere to a theory and take it into consideration when designing the study.

Measures of Recovery vs. Compensation
Improved activity can arise from recovery of impairment, development of compensatory movements, or both. It is necessary to differentiate between impaired execution and impaired motor learning (Boyd and Winstein, 2004). The implemented settings and outcome measures of a study may affect the interpretation of the results. Patton et al. (2006) supported the hand against gravity, which could decrease the effect of the impairment in movement execution (due to weakness, spasticity, etc.) and result in detection of adaptive capability. In contrast, Takahashi and Reinkensmeyer (2003) did not support the hand, which might have led to an apparently poor adaptive capability among subjects, when in fact execution impairment increased due to gravity.
The perspective presented in a study by Nourrit-Lucas et al. (2013), who explored long-term retention in neurologically intact subjects, might also be relevant to studies performed on neurologically impaired individuals. The authors suggested that the experimental settings in which learning is assessed affect the measurements and the conclusions drawn regarding the learning process. Performance variables usually assess simple tasks with few degrees of freedom. They are defined as parameters measuring performance in terms of speed and accuracy, representing the outcome of the behavior with respect to the goal of the task (Nourrit-Lucas et al., 2013). A more complex task has a larger number of degrees of freedom, and learning can be explored by coordination variables measuring the spatiotemporal functional organization between body segments in terms of phase relations (Kelso, 1997; Nourrit-Lucas et al., 2013). In most of the reviewed studies, simple tasks were examined, among other reasons to decrease the probability of compensation. Moreover, in approximately half of the discussed studies the arm was supported against gravity, reducing the degrees of freedom even further. Retention tests assess the same coordinative pattern: an improvement is expected due to the motor plan that was established during practice, whereas a transfer test would examine the adaptability of the motor program. Therefore, transfer tests are more suitable for examining the coordinative measures (Nourrit-Lucas et al., 2013). The coordinative improvement can represent the reacquisition of the pre-lesion patterns, addressed as recovery (Levin et al., 2009). None of the reviewed studies assessed the coordination of a more complex task.
Kinematic analysis of various sorts was performed for measurement of parameters by all but three of the reviewed studies (Hemayattalab and Rostami, 2010; Geerdink et al., 2013; Hemayattalab et al., 2013). Subramanian et al. (2010) separated the kinematic variables into measures of motor performance and measures of movement quality. Two of the discussed performance measures that were often utilized in the reviewed studies to assess the parameters of motor skill learning are measures of accuracy and velocity of movement. Reduction in errors of these measures indicates improvement of performance (Krakauer, 2006; Schmidt and Wrisberg, 2008). However, in motor execution of a task there is a relationship between movement speed and accuracy, and depending on task requirements, the accuracy or speed component can be prioritized (Fitts, 1954; Kitago and Krakauer, 2013). If only one of them is assessed, an improvement does not necessarily indicate an improved skill. For example, a subject can make more errors as speed increases, or slow down for a more accurate movement (Kitago and Krakauer, 2013). The movement quality measures include the configuration of the examined limb and measurement of compensatory movements, and were suggested as able to distinguish between recovery and compensation (Subramanian et al., 2010). Almost half of the reviewed studies restricted compensatory movements, preventing examination of the movement quality required in real-life settings. How, then, can complex motor skills be assessed in terms of qualitative organization for neurologic patients with a variety of heterogeneous impairments that potentially mask learning? In some of the reviewed studies the learning following a treatment was inferred by examination of the less affected limb (Table 1). This may distinguish motor execution impairments due to hemiparesis, which may mask motor learning, from deficient motor learning (Boyd and Winstein, 2004).
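The point that speed and accuracy must be read together can be made concrete with a short Python sketch. The classification rule, the 5% tolerance, and all names and values below are hypothetical choices for illustration; they are not taken from the reviewed studies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Performance:
    movement_time_s: float  # lower means faster
    error_mm: float         # lower means more accurate


def classify_change(pre, post, tolerance=0.05):
    """Label a pre/post change using both speed and accuracy.

    A measure counts as better or worse only if its relative change
    exceeds the (assumed) tolerance. Baseline values must be positive.
    """
    def direction(before, after):
        change = (after - before) / before
        if change < -tolerance:
            return "better"
        if change > tolerance:
            return "worse"
        return "same"

    outcomes = {
        direction(pre.movement_time_s, post.movement_time_s),
        direction(pre.error_mm, post.error_mm),
    }
    if "better" in outcomes and "worse" in outcomes:
        return "trade-off"   # a gain in one measure was paid for by the other
    if "better" in outcomes:
        return "improved"
    if "worse" in outcomes:
        return "declined"
    return "unchanged"


baseline = Performance(movement_time_s=1.0, error_mm=10.0)
faster_but_less_accurate = Performance(movement_time_s=0.8, error_mm=14.0)
faster_same_accuracy = Performance(movement_time_s=0.8, error_mm=10.0)

print(classify_change(baseline, faster_but_less_accurate))
print(classify_change(baseline, faster_same_accuracy))
```

A study that recorded only movement time would score both hypothetical follow-ups as improvements, whereas the combined rule separates a genuine gain from a shift along the speed-accuracy trade-off.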
Many studies evaluate the performance ability of their subjects using functional tests (Kitago and Krakauer, 2013). Functional tests often include more complicated movements with multiple degrees of freedom, resemble real-life activities, and are categorized under the "activities" domain of the ICF classification (Sivan et al., 2011). However, clinical measures do not consider the quality of the movement, and therefore do not measure a decrease in impairment or a return to normal motor control. Therefore, functional improvement in clinical measures, but not in measures of impairment (such as the FMA and kinematic analysis), can be attributed to compensatory strategies. Kinematic variables were suggested to be valid for differentiating compensation from recovery and for measurement of upper limb impairment (Subramanian et al., 2010). Huang and Krakauer (2009) reviewed rehabilitation strategies and outcome measures for impairment versus function. They suggested that in the acute and sub-acute stages of recovery, rehabilitation treatment should focus on the impairment level while refraining from compensatory adjustments, and only after a certain level of improvement is achieved, focus on functional performance. It can be inferred that the measure of assessment should also be fitted to the stage and timing within the recovery process, and should be able to distinguish between them. Movement quality measures can be sensitive and useful for complementing clinical assessment (Subramanian et al., 2010). Moreover, Sivan et al. (2011) suggested that when evaluating a rehabilitation program it is important to measure each domain within the ICF classification. While most clinical tests were classified as evaluating the "activities" domain, most of the measures that we reviewed, such as kinematic analysis and the BBT, were placed under assessment of the body functions domain.
This difference in classification might explain the absent or low correlation between the measures in some of the studies, as they might assess different features of learning. This implies that the different features captured by each measure should be taken into consideration when assessing motor learning.

Reliability and Validity of Outcome Measures
Validity and reliability are important components of meaningful behavioral research. Validity represents the degree to which a study measures what it intends to measure, and reliability is the consistency of the results (Forzano and Gravetter, 2009). Therefore, both are needed in order to infer whether the parameters and measures used in a study provide an accurate representation of the change during and after practice.
Kinematic movement quality measures were found to be valid and sensitive for recognizing upper limb impairments of stroke patients performing pointing and reach-to-grasp tasks (Subramanian et al., 2010). However, as mentioned previously, some of the reviewed studies did not examine quality measures, due to restriction of the upper limb during movement and of compensatory movements, or due to measurement of solely motor performance measures.
While the validity and reliability of the clinical measures were mostly stated in the reviewed articles, these were less often addressed for the motor learning parameters and laboratory-based measures. Some studies mentioned the rationale behind their motor control outcome measures. For example, Gilliaux et al. (2015) and Durham et al. (2014) measured kinematics described as previously established and as sensitive, respectively. Geerdink et al. (2013) used the BBT because of its feasibility, validity, and reliability (Jongbloed-Pereboom et al., 2013). Other studies focused on examining the strength of their measures. Bourke et al. (2015) found good to excellent reliability coefficients for task performance measures. Kinematic analysis was described as sensitive for motor control measurement (Chen et al., 2012) and for evaluation of motor recovery (Caimmi et al., 2008). Kitago et al. (2015) suggested that their method of analyzing reaching kinematics, based on functional principal component analysis (Yao et al., 2005; Goldsmith et al., 2013), is more sensitive than measures such as end-point accuracy or PV because it can additionally evaluate the entire trajectory of movement.
Across the reviewed studies, not all parameters were equally studied, and the reasoning for parameter selection was addressed in only one study. The validity and reliability of the assessed parameters were also not addressed. Kantak and Winstein (2012) suggested that retention and transfer tests, rather than solely the performance parameter, should be implemented to rule out transient performance changes and detect motor learning. In this review, performance was most often assessed, in 24 of the reviewed studies, whereas transfer was evaluated in 14 and retention in only eight of the studies. Retention was examined following a time interval in all studies, but the length of the retention interval varied. Kantak and Winstein (2012) categorized retention into immediate and delayed (i.e., implemented at least 24 h after practice), and suggested that different retention intervals can yield different conclusions regarding the obtained learning.

Review's Limitations
This review has several limitations. Firstly, we focused only on studies of the upper limb. As the role and function of the lower limb differ from those of the upper limb, it is important to further address the metrics utilized for assessment of the lower extremity in relation to their place within the motor learning model. Secondly, only studies assessing CP and stroke subjects were included. Important information regarding the parameters and underlying measures might have already been studied in other populations and settings that were excluded from the review. It is also possible that we missed additional relevant studies when searching the databases, despite the multiple key words, due to inconsistency in terminology. Furthermore, the included studies were very heterogeneous in their methodology, motor tasks, and measures. We categorized the identified measures according to the evaluated parameters. We could not synthesize the findings across the studies to a greater extent due to the high variability of the measures between the studies. Due to the variability in study design and the narrative nature of this review, we did not assess the methodological quality of the studies. Despite the broad utilization of kinematic analysis in the reviewed studies and despite reports of validity and reliability of some measures, we could not evaluate the appropriateness of the selection of measures or the quality of measurements. An additional drawback is that we examined the evidence for reliability and validity only within the reviewed studies, possibly missing other studies whose sole purpose was to establish these features rather than examine the outcome of a treatment. We suggest that a review with that scope in mind should include an additional search of the databases with a focus on the reliability, validity, sensitivity, and specificity of the implemented measures, while taking into consideration the variety in the characteristics of studies.
This may serve to establish an algorithm for the selection of outcome measures.

CONCLUSIONS
Motor learning is fundamental for improvement of affected motor skills following brain lesion. When designing rehabilitation training, the selection of appropriate outcome measures for patient evaluation is of great importance. In this review, we described the diverse parameters and measures utilized by studies for identification of motor learning in stroke and CP patients.
We overviewed the literature and differentiated the parameters based on the type of motor learning they assess. There is an agreement throughout the studies regarding the meaning and implication of the parameters. However, not all parameters are equally studied. Despite long-lasting effects of practice and generalization of training to real-life movements being a fundamental part of the motor learning theory, the majority of the reviewed studies did not assess the retention and transfer parameters.
Similar metrics can be utilized to measure and quantify different parameters, without evident exclusivity of measures to a specific parameter. Different metrics, and the timing and method of their implementation, might yield inconsistent results regarding a patient's ability to learn. Utilization of solely clinical metrics, or the lack of movement quality measurements, might only estimate the learning of compensatory movements rather than recovery. Additionally, restriction of compensatory movements might result in learning of solely simple movements, with little generalization to real-life functional movements as a consequence. The necessity to differentiate between recovery and compensation, and between learning and execution deficiencies, suggests that a combination of measures assessing different features of learning and function might provide more accurate information about patients' abilities and progress. There is no consensus about the relations between clinical and motor learning measures, and the measures described in this study are seldom available in clinical settings. Therefore, we suggest that a joint effort by researchers and practitioners should focus on translating research findings into feasible clinical practices.
To conclude, we have performed a comprehensive data collection of measures used to infer motor learning. While the extent and variability of the measurements produced a raw descriptive review, it should be used as a stepping stone, first toward qualitative comparison of measurements. Ultimately, we hope it will lead toward an algorithm for outcome measure selection according to population, intervention, and concordance with the different motor learning theories, a tool that is currently lacking. Researchers setting out to examine motor learning should address the entire set of parameters described in motor learning theory, and clinicians should know which measurements are the most valid and reliable for assessing treatment progress. Until such an algorithm is developed, intelligible study design and reasoned measurement selection can both improve current studies and generate the data required to develop the algorithm for future research.

AUTHOR CONTRIBUTIONS
NS and SB decided on the review's subject, selected the search terms, and performed the literature search. NS wrote the review and prepared the tables and figures. SB and IM supervised, commented on, and critically revised the manuscript. All authors substantially contributed to this review and all approved the final version.

FUNDING
This work was partially supported by the Helmsley Charitable Trust through the Agricultural, Biological and Cognitive Robotics Initiative of Ben-Gurion University of the Negev, Israel, and by a Master's degree scholarship from the Faculty of Health Sciences at Ben-Gurion University of the Negev, Israel.