Machine Learning Approaches for Motor Learning: A Short Review

Machine learning approaches have seen a considerable number of applications in human movement modeling but remain limited for motor learning. Motor learning requires that motor variability be taken into account and poses new challenges because the algorithms need to be able to differentiate between new movements and variation in known ones. In this short review, we outline existing machine learning models for motor learning and their adaptation capabilities. We identify and describe three types of adaptation: parameter adaptation in probabilistic models, transfer and meta-learning in deep neural networks, and planning adaptation by reinforcement learning. To conclude, we discuss challenges for applying these models in the domain of motor learning support systems.


Introduction
The use of augmented feedback on movements enables the development of interactive systems designed to facilitate or improve motor learning. Such systems, which we call motor learning support systems, require capturing and processing movement data and returning augmented feedback to the users. These systems have primarily been investigated in rehabilitation (e.g. motor recovery after injury [1]) and in other contexts that involve motor learning, such as dance pedagogy [2] or entertainment [3].
Motor learning support systems require modeling human movements while taking into account the underlying learning mechanisms. While computational models have been proposed for simple forms of motor learning [4], modeling the processes found in more complex skill learning remains challenging. Indeed, the need for computational advances linked to motor learning has recently been raised in the field of neurorehabilitation [5,6].
We believe that a data-driven strategy, using machine learning, represents a complementary approach to analytical models of movement learning. While recent results in machine learning have shown impressive advances in movement modeling, for example in gesture and action recognition or movement prediction [7], it remains difficult to apply such approaches to motor learning support systems. In particular, computational models must meet specific requirements in order to address the different variability mechanisms induced by motor adaptation and motor skill learning. The former, motor adaptation, is the process by which the motor system adapts to perturbations in the environment [8]. Adaptation tasks take place over hours or days and do not involve learning a new motor policy. The latter, motor skill acquisition, involves learning a new control policy, including novel movement patterns and shifts in speed-accuracy trade-offs [9,10]. Complex skills are typically learned over months or years [11,12]. Thus, these models have to account both for fine-grained changes in movement execution arising from motor adaptation mechanisms and for more radical changes in movement execution due to skill acquisition mechanisms. This poses several challenges for the computational models, which should adapt over time to the variations of human motor abilities and expertise.
This article offers a short review of machine learning based movement modeling (Section 2). It then surveys different adaptation approaches, from adapting model parameters to transfer learning, where the model can be generalized to different movement contexts (Section 3). We believe that these adaptation mechanisms are core components of the successful motor learning support systems that we envision, as discussed in Section 4.

Machine Learning based Movement Modeling
Computational modeling of human movements using data-driven machine learning has proven successful for many tasks in animation, movement and action recognition, and robotics. The modeling strategies differ largely according to the application context and task, the type of device used for capturing movements, and the amount of data available.

Probabilistic movement models
Probabilistic approaches to movement modeling have a long history, in particular generative models such as Gaussian Mixture Models, Hidden Markov Models and Dynamic Bayesian Networks for movement generation and recognition [13,14], which have recently gained interest in contexts with limited amounts of data. More advanced techniques, such as Gaussian Processes, have been explored in movement manifold learning.

Learning shallow models
Gaussian Mixture Models (GMMs) have been used in robotics to learn movement trajectory models from a few demonstrations given by a human teacher, and were shown to be efficient for rapid task adaptation and task generalization [15,16]. GMMs can learn the parameters of a motion primitive model (e.g. dynamical motion primitives [17,18]), as proposed by [19,20,21]. In particular, [16] showed that task-parametrized GMMs that integrate multiple coordinate systems can be relevant for adapting the movement to the characteristics of the task. In user-centered design of movement-based interactions, GMMs have been used for soft recognition of conducting gestures [22] and for continuous motion-sound mapping, by adapting the auditory feedback to an individual user's movements [23].
When tasks require encoding the dynamics and temporal evolution of the movement, generative sequence models such as Hidden Markov Models (HMMs) can be used, as in gesture recognition [24,25] and movement generation [26,27]. HMMs can be combined with Gaussian Mixture Regression to model dynamical systems for robot skill acquisition [28]. Extensions of HMMs include Hidden Semi-Markov Models, which provide explicit duration distributions over the hidden states [29]. They have been applied to movement analysis in music performance [30] and to template-based real-time gesture segmentation and recognition [31]. Movement primitives can also be learned incrementally using Hierarchical Hidden Markov Models [32]. The method relies on unsupervised movement segmentation and uses the Kullback-Leibler divergence to automatically extract new primitives [32,33].
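As an illustration of how HMMs can serve gesture recognition, the following minimal sketch scores an observation sequence under two toy HMMs with the forward algorithm and returns the most likely gesture. It is not taken from any of the cited works: the gesture names ("rise", "fall"), the two-state left-to-right topology, the binary observation symbols and all probabilities are hypothetical values chosen for illustration.

```python
def forward_likelihood(obs, start, trans, emit):
    """Return P(obs | model) for a discrete HMM using the forward algorithm."""
    n_states = len(start)
    # Initialisation: probability of starting in each state and emitting obs[0].
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    # Recursion: propagate through the transitions, then weight by the emission.
    for symbol in obs[1:]:
        alpha = [
            sum(alpha[p] * trans[p][s] for p in range(n_states)) * emit[s][symbol]
            for s in range(n_states)
        ]
    # Termination: sum over the final states.
    return sum(alpha)


# Hypothetical left-to-right transition matrix shared by both toy gestures.
LEFT_TO_RIGHT = [[0.7, 0.3], [0.0, 1.0]]

# Hypothetical gesture models over two quantised movement symbols (0 and 1).
GESTURE_MODELS = {
    "rise": {"start": [1.0, 0.0], "trans": LEFT_TO_RIGHT,
             "emit": [[0.9, 0.1], [0.1, 0.9]]},
    "fall": {"start": [1.0, 0.0], "trans": LEFT_TO_RIGHT,
             "emit": [[0.1, 0.9], [0.9, 0.1]]},
}


def classify(obs):
    """Return the name of the gesture model with the highest likelihood."""
    scores = {name: forward_likelihood(obs, **model)
              for name, model in GESTURE_MODELS.items()}
    return max(scores, key=scores.get)
```

For instance, `classify([0, 0, 1, 1])` selects the "rise" model. Practical systems typically use continuous (e.g. Gaussian) emissions and log-probabilities to avoid numerical underflow on long sequences.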

Learning movement manifold
Manifold learning in human movement modeling is motivated by the inherent correlations in human limb movement. Movement data often lie near a nonlinear manifold that has a lower dimensionality than the input data. Gaussian Processes (GPs) [34], an extension of the multivariate Gaussian distribution to an infinite-dimensional function space, can be used to learn the mapping from the latent space to the observed human pose data. Seminal works in the field are the Gaussian process latent variable model (GPLVM) [35] and the Gaussian process dynamical model (GPDM) [36], which have been applied to human gait modeling.
Recent studies have looked at learning more expressive manifolds whose structure can represent a wider scope of human gaits [37,38]. For instance, [38] proposed stacking Gaussian Process layers in order to encompass more human movement diversity. In the same line of work, deep versions of Gaussian processes have been shown to account for movements performed by two subjects at the same time, finding a latent space common to both movements [39].
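To make the latent-to-pose mapping concrete, the sketch below computes the posterior mean of a Gaussian Process that maps a one-dimensional latent coordinate to a single pose dimension. It is a simplification under stated assumptions: the latent coordinates are taken as given (a GPLVM would also optimise them), the kernel is a squared-exponential with a fixed length scale, and the data (a sinusoidal "knee angle" over a gait-phase-like latent) are hypothetical.

```python
import math


def rbf(a, b, length_scale=1.0):
    """Squared-exponential kernel between two scalar latent points."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def gp_posterior_mean(latent_train, pose_train, latent_query, noise=1e-6):
    """Posterior mean of a zero-mean GP: k(q, Z) (K + noise * I)^-1 y."""
    gram = [
        [rbf(a, b) + (noise if i == j else 0.0)
         for j, b in enumerate(latent_train)]
        for i, a in enumerate(latent_train)
    ]
    weights = solve(gram, pose_train)
    return [
        sum(rbf(q, a) * w for a, w in zip(latent_train, weights))
        for q in latent_query
    ]


# Hypothetical data: latent coordinate and one pose dimension (a joint angle).
latent = [0.0, 1.0, 2.0, 3.0, 4.0]
knee_angle = [math.sin(z) for z in latent]
reconstruction = gp_posterior_mean(latent, knee_angle, latent)
```

With a small noise term, the posterior mean passes very close to the training poses; querying latent values between the training points yields interpolated poses.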

Deep neural networks
Deep neural networks have been shown to be suitable for learning rich spatio-temporal representations [40]. Spatial representation accounts for human body interdependence, while temporal representation accounts for non-linear dynamics as observed in human movement control and learning [41].

Recurrent Neural Networks
Recurrent Neural Networks (RNNs) are a specific set of neural networks that are able to process input data sequences through recurrent connections between their neural activations at successive time steps. In the past decade, extended RNN architectures have been applied to the problem of human movement modeling, with the objective of generating movement or predicting a movement sequence from a small number of frames [42,43,44,45,46,47]. Typical RNN architectures are Long Short-Term Memory (LSTM) networks, with encoder-decoder layers when video input is considered [42]. [45] then proposed a 3-layer LSTM architecture where the outputs are processed through Dropout (randomly dropping output connections [48]) to avoid drift in movement prediction. [49] showed that motion modeling can be improved by adding a higher-level graph structure to the RNN. [46] added residual layers to the RNN, improving movement prediction, while [50] advocated for leveraging movement velocity in the model. Finally, long-term prediction has been shown to be efficiently handled by adding attention mechanisms to RNNs [51].

Alternative deep learning approaches
Alternative approaches to RNNs include temporal convolutions. [52] proposed convolutional sequence-to-sequence learning for human movement generation and prediction. This approach uses a convolutional model to learn both spatial and temporal structure at the same time, which is usually not possible with RNN-like architectures. Convolutional sequence-to-sequence learning comes from machine translation [53], and was first applied to human movement in [52]. A hierarchical extension has recently been proposed [54]. The technique has also been applied to the special case of ski jumps by athletes [55].
Neural network based models have shown impressive results in movement prediction and generation. However, their training usually relies on large datasets, and the trained models do not extrapolate well to drastic variations of the inputs.

Adaptation in Movement Modeling
Adaptation in movement modeling is required in contexts where the movement to analyze or generate differs significantly from the instances used for training. This typically occurs when the movement varies over time, as in motor learning.

Parameter adaptation
Parametric movement models are characterized by a set of trained parameters. One approach for adapting such models to new conditions is to frame the adaptation problem as a regression task. For example, a regression model can be learned between some contextual parameters (linked to the task) and the movement model parameters. [16] proposed such an approach in robotics to adapt the robot movement parameters when new target coordinates are set for the robot arm. The underlying model is a GMM trained from a few human movement demonstrations. In the context of movement-based interaction, [23] proposed a similar adaptation process where the inputs are the movement model parameters and the outputs are sound synthesis parameters. The underlying model is an HMM. They showed that adaptation is effective only for sufficiently small variations.
Model parameters can also be adapted online through specific mechanisms. For example, [56] proposed a tracking approach based on particle filtering to update state parameters characterizing learned trajectories (for example, scale and orientation in the case of 2D gestures). [32] proposed a method to iteratively train an HMM for gesture recognition and generation. This is made possible by regularization and fast optimization procedures. Considering neural networks, [57] proposed training an RNN-based movement model offline and adapting the last-layer parameters through recursive least squares. Overall, parameter adaptation has mostly been used as a fine-grained adaptation mechanism on top of pre-trained models. The typical use cases are learning by demonstration (in human-robot interaction) and personalization (in human-machine interaction).
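The regression view of parameter adaptation can be illustrated with a minimal sketch. It assumes a single scalar context (a target position) and a single scalar movement-model parameter (for instance, the mean of a Gaussian component at the end of the trajectory), linked by a linear least-squares fit. The function names and the data are hypothetical; the cited works use richer models such as task-parametrized GMMs.

```python
def fit_linear(contexts, params):
    """Least-squares fit of param = slope * context + intercept."""
    n = len(contexts)
    mean_c = sum(contexts) / n
    mean_p = sum(params) / n
    cov = sum((c - mean_c) * (p - mean_p) for c, p in zip(contexts, params))
    var = sum((c - mean_c) ** 2 for c in contexts)
    slope = cov / var
    return slope, mean_p - slope * mean_c


def adapt_parameter(regression, new_context):
    """Predict the movement-model parameter for an unseen context."""
    slope, intercept = regression
    return slope * new_context + intercept


# Hypothetical demonstrations: target positions (m) and the endpoint mean
# fitted by the movement model for each demonstration.
targets = [0.10, 0.20, 0.30, 0.40]
endpoint_means = [0.12, 0.22, 0.32, 0.42]

regression = fit_linear(targets, endpoint_means)
adapted_mean = adapt_parameter(regression, 0.25)  # unseen target position
```

A linear map of this kind can only be expected to hold near the demonstrated contexts, which is consistent with the observation above that adaptation is effective only for sufficiently small variations.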
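The last-layer adaptation idea can be sketched with a standard recursive least squares (RLS) update applied to a linear read-out on top of frozen features. This is a generic textbook RLS, not the specific procedure of [57]: the feature function `frozen_features`, the forgetting factor, the initial covariance scale and the simulated data stream are all assumptions made for illustration.

```python
import math


class RecursiveLeastSquares:
    """Online least-squares estimate of linear read-out weights."""

    def __init__(self, dim, forgetting=1.0, init_scale=1000.0):
        self.weights = [0.0] * dim
        # Inverse correlation matrix, initialised to a large multiple of I.
        self.P = [[init_scale if i == j else 0.0 for j in range(dim)]
                  for i in range(dim)]
        self.forgetting = forgetting

    def predict(self, features):
        return sum(w * f for w, f in zip(self.weights, features))

    def update(self, features, target):
        dim = len(features)
        p_phi = [sum(self.P[i][j] * features[j] for j in range(dim))
                 for i in range(dim)]
        denom = self.forgetting + sum(f * v for f, v in zip(features, p_phi))
        gain = [v / denom for v in p_phi]
        error = target - self.predict(features)
        self.weights = [w + g * error for w, g in zip(self.weights, gain)]
        # P stays symmetric, so features^T P equals (P features)^T.
        self.P = [[(self.P[i][j] - gain[i] * p_phi[j]) / self.forgetting
                   for j in range(dim)] for i in range(dim)]
        return error


def frozen_features(x):
    """Hypothetical stand-in for the frozen hidden layer of a trained network."""
    return [math.tanh(x), 1.0]


# Simulated stream from a new user whose read-out weights are [2.0, -0.5].
rls = RecursiveLeastSquares(dim=2)
for i in range(21):
    x = -1.0 + 0.1 * i
    rls.update(frozen_features(x), 2.0 * math.tanh(x) - 0.5)
```

After the stream, `rls.weights` is close to `[2.0, -0.5]`. Setting `forgetting` slightly below 1 discounts older samples, which may be useful when the movement keeps drifting over time.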

Transfer and meta-learning
Transfer and meta-learning aim to tackle the problem of adapting a pre-trained model to new tasks, unseen during training. One of the goals of transfer learning is thus to solve the problem of insufficient training data on a given task for a given complex model (such as a deep neural network).

Transfer learning
Transfer learning has gained interest in deep learning, where very large datasets are required for training, which makes the approach costly and inapplicable where only limited data are available. Transfer learning in deep neural networks typically involves three strategies [58]: embedding learning learns embeddings from the source domain that provide good discrimination between instances of different classes; few-shot learning focuses on the case where there are very few instances (typically fewer than 20 instances per class) in the target domain; weight transfer uses the weights trained on a source domain as an initialization point for a second network to be trained on the target domain.
Transfer learning for movement modeling has primarily used embedding learning. [59] trains a convolutional auto-encoder on a large motion capture dataset to learn embeddings of human motion. Embeddings are vector-like entities that usually have lower dimensionality than the original inputs. Then, based on a small dataset, a mapping is trained from the embeddings to high-level movement parameters, such as the movement trajectory in the Cartesian frame. A similar approach is proposed by [60] to learn stereotypical motor movements (SMM) and transfer them to classification for the diagnosis of autism spectrum disorder. [61] uses a similar approach for inter- and intra-user adaptation of a gesture recognition system. Finally, [62] uses a deep convolutional network to test the transfer of EMG-based movement recognition. They showed that transfer learning (as embedding learning) systematically improved the classification accuracy.
In summary, transfer learning has been shown to address the problem of limited data in movement analysis and to increase classification accuracy. Most pre-trained models are based on convolutional layers. The other models introduced in Section 2 have not yet been tested in task transfer.

Meta-learning
Meta-learning designates the ability of a system to learn how to learn by being trained on a set of tasks (rather than a single task), so that it learns faster (with fewer examples) on unseen tasks. Meta-learning is related to transfer learning, and in particular to few-shot learning.
Meta-learning of movement skills was proposed in robotics and human-robot interaction to efficiently train robot actions from one or a few demonstrations. [63] proposed a one-shot imitation learning algorithm where a regressor is trained against the output actions to perform the task, conditioned on a single demonstration sampled from a given task. This approach is close to the regression-based technique presented in Section 3.1, but formalised on a set of tasks. Considering raw input images, a more general approach has been proposed in [64,65]. The adaptation process relies on the model-agnostic meta-learning (MAML) algorithm [66]. The algorithm trains the model parameters (mapping observations to actions) on a task sampled from the set of tasks, then propagates the optimized parameters to other (randomly sampled) tasks. This procedure leads to faster adaptation to new tasks.
The MAML method has also found application in human motion forecasting [67]. In this case, the goal is to predict a sequence of movement frames as a continuation of a given sequence of movement frames. The authors used MAML to obtain a predictor that is able to adapt rapidly to new tasks. They also leverage the available large datasets to drive the optimization from a few frames.
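The MAML idea discussed above can be illustrated on a deliberately small problem. The sketch below meta-learns the initial value of a single scalar weight for one-dimensional linear regression tasks, so that one gradient step on a task's support set yields a low loss on its query set. It makes several simplifying assumptions that are not from the cited works: the model is `y = w * x`, every meta-update uses all tasks rather than a random sample, the data are noise-free, and the learning rates and task slopes are hypothetical. Because the model is scalar, the second-order term of the meta-gradient can be written explicitly.

```python
def loss(w, data):
    """Mean squared error of the scalar model y = w * x."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)


def grad(w, data):
    """Derivative of the loss with respect to w."""
    return sum(2 * x * (w * x - y) for x, y in data) / len(data)


def hessian(data):
    """Second derivative of the loss with respect to w (constant in w)."""
    return sum(2 * x * x for x, _ in data) / len(data)


def inner_step(w, support, inner_lr):
    """Task-specific adaptation: one gradient step on the support set."""
    return w - inner_lr * grad(w, support)


def maml_train(tasks, inner_lr=0.1, outer_lr=0.5, steps=100):
    """Meta-learn an initial weight that adapts well after one inner step."""
    w = 0.0
    for _ in range(steps):
        meta_grad = 0.0
        for support, query in tasks:
            adapted = inner_step(w, support, inner_lr)
            # Chain rule through the inner step: d(adapted)/dw = 1 - lr * H.
            meta_grad += grad(adapted, query) * (1.0 - inner_lr * hessian(support))
        w -= outer_lr * meta_grad / len(tasks)
    return w


def make_task(slope):
    """Hypothetical task: noise-free samples of y = slope * x."""
    support = [(x, slope * x) for x in (1.0, 2.0)]
    query = [(x, slope * x) for x in (0.5, 1.5)]
    return support, query


meta_w = maml_train([make_task(s) for s in (1.0, 2.0, 3.0)])

# One-step adaptation to an unseen task, from the meta-learned weight and from zero.
new_support, new_query = make_task(2.5)
loss_from_meta = loss(inner_step(meta_w, new_support, 0.1), new_query)
loss_from_zero = loss(inner_step(0.0, new_support, 0.1), new_query)
```

In this toy setting, the meta-learned weight settles at the centre of the training slopes, and a single adaptation step from it gives a lower query loss on the unseen task than the same step taken from zero.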

Adaptation through reinforcement learning
Most approaches to imitation learning rely on a supervised paradigm where the model is fully specified from demonstrations without subsequent self-improvement [68]. To ensure good task generalization, imitation learning requires a significant number of high-quality demonstrations that provide variability while ensuring high performance. Alternative successful approaches to movement learning in robotics have relied on Reinforcement Learning (RL) to endow robots with the ability to learn from experience rather than demonstration [69]. While RL can achieve impressive performance, the learning process is often very slow and can lead to unnatural behavior. A growing body of research investigates the combination of these two paradigms to improve the models' adaptation to new tasks, making the learning process more efficient and improving the generalization of the tasks from few examples.
Demonstrations can be integrated in the RL process in various ways. One approach consists in initializing RL training with a model learned by imitation [69]. A second strategy consists in deriving cost functions from demonstrations, for instance using inverse reinforcement learning [70,71]. Building upon the success of Generative Adversarial Networks in other fields of machine learning, Generative Adversarial Imitation Learning (GAIL) has been proposed as an efficient method for learning movement from expert demonstrations [72]. In GAIL, a discriminator is trained to distinguish expert trajectories (considered optimal) from policy trajectories produced by a generator that is trained to fool the discriminator. This approach was then extended to reinforcement learning through self-imitation, where optimal trajectories are defined by previous successful attempts [73]. Several extensions of the adversarial learning framework were proposed to improve its stability or to handle unstructured demonstrations [74,75]. Recently, [76] proposed simultaneous imitation and reinforcement learning through a reward function that combines GAIL and RL. In comparison with GAIL or RL alone, the evaluation shows that the combination learns faster and reaches better performance.
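A reward that mixes an adversarial imitation signal with a task reward might be sketched as follows. The exact formulation used in [76] is not given in this review, so this is only one plausible form under explicit assumptions: the discriminator output `expert_prob` is read as the probability that a state-action pair comes from the expert, the imitation reward is `-log(1 - expert_prob)` (one common convention among several), and the two terms are blended linearly with a hypothetical coefficient `mix`.

```python
import math


def imitation_reward(expert_prob, eps=1e-8):
    """Imitation reward from a discriminator output in [0, 1].

    Assumes expert_prob is the probability that the state-action pair
    was produced by the expert; it is clipped for numerical stability.
    """
    p = min(max(expert_prob, eps), 1.0 - eps)
    return -math.log(1.0 - p)


def combined_reward(task_reward, expert_prob, mix=0.5):
    """Linear blend of the imitation reward and the environment reward."""
    return mix * imitation_reward(expert_prob) + (1.0 - mix) * task_reward
```

With `mix=0.0` the agent receives the task reward only (plain RL), and with `mix=1.0` it receives the imitation reward only; intermediate values let demonstrations shape the behaviour while the task reward still drives self-improvement.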

Discussion and Perspectives
This paper surveyed the different approaches to human movement modelling using machine learning, divided between probabilistic models and deep neural networks, as well as the adaptation mechanisms proposed to address limited training data and improve model generalisation. We now discuss how adaptive movement models can be used in support of motor learning, understood as motor adaptation and motor skill acquisition.
First, motor adaptation mechanisms involve variations of an already-trained skill in response to external perturbations. Computationally, motor adaptation can be seen as an optimisation process that learns and cancels external effects in order to return to baseline [78]. Accounting for these underlying variations requires rapid mechanisms and robust statistical modeling. Parameter adaptation techniques for probabilistic models (Section 3) might be an approach to handle the movement variability induced by such processes, but they remain to be comprehensively evaluated. In addition, recent works highlighting the stochastic nature of trial-to-trial motor variability [41] could be used to improve existing algorithms. More generally, handling the complex statistics of motor variability in parameter adaptation algorithms represents a promising research direction.
Second, accounting for more dramatic changes in movement patterns (as induced by motor skill learning) might require computational adaptation that involves re-training procedures. Transfer and meta-learning describe the adaptation of high-capacity movement models to new tasks, and could account for variability induced by skill learning. Many challenges remain. One difficulty is to assess to what extent transferring a given model to new motor control policies would cause the model to forget past skills. For instance, it was found that movement models relying on deep neural networks might be subject to catastrophic forgetting [79]. Also, meta-learning algorithms such as MAML [66] are currently not suited to adapting to several new tasks. Self-imitation mechanisms could help to generalise to a wider set of tasks. These approaches remain to be experimentally assessed in motor learning contexts.
Finally, a last challenge that we want to raise in this paper concerns the continuous evolution of motor variation patterns. Motor execution may vary continuously over time, due to skill acquisition and morphological changes. Accounting for such open-ended tasks may require new forms of adaptation, such as continuous online learning, as proposed by [80]. We think that this is a promising research direction with high potential impact in motor learning applications.
In closing, to be integrated in motor learning support systems, the aforementioned machine learning approaches should be combined with adaptation mechanisms that aim to generalise models to new movements and new tasks efficiently. We do not advocate for adaptive machine learning solely as a means of explaining motor learning processes. Rather, we propose adaptation procedures that can account for variation patterns observed in behavioural data, leading to performance improvements in motor learning support systems.