Machine Learning Approaches for Motor Learning: A Short Review

Caramiaux, Baptiste; Françoise, Jules; Liu, Wanyu; Sanchez, Téo; Bevilacqua, Frédéric

doi:10.3389/fcomp.2020.00016

MINI REVIEW article

Front. Comput. Sci., 29 May 2020

Sec. Human-Media Interaction

Volume 2 - 2020 | https://doi.org/10.3389/fcomp.2020.00016

This article is part of the Research Topic: Artificial Intelligence and Human Movement in Industries and Creation

Baptiste Caramiaux¹*

Jules Françoise²

Wanyu Liu¹,³

Téo Sanchez¹

Frédéric Bevilacqua³

¹Université Paris-Saclay, CNRS, Inria, LRI, Gif-sur-Yvette, France
²Université Paris-Saclay, CNRS, LIMSI, Orsay, France
³STMS IRCAM-CNRS-Sorbonne Université, Paris, France

Machine learning approaches have seen a considerable number of applications in human movement modeling but remain limited for motor learning. Motor learning requires that motor variability be taken into account and poses new challenges because the algorithms need to be able to differentiate between new movements and variation in known ones. In this short review, we outline existing machine learning models for motor learning and their adaptation capabilities. We identify and describe three types of adaptation: Parameter adaptation in probabilistic models, Transfer and meta-learning in deep neural networks, and Planning adaptation by reinforcement learning. To conclude, we discuss challenges for applying these models in the domain of motor learning support systems.

1. Introduction

The use of augmented feedback on movements enables the development of interactive systems designed to facilitate motor learning. Such systems, which we refer to as motor learning support systems, require that movement data be captured and processed and that augmented feedback be returned to the users. These systems have primarily been investigated in rehabilitation [e.g., motor recovery after injury (Kitago and Krakauer, 2013)] or in other contexts that induce motor learning, such as dance pedagogy (Rivière et al., 2019) or entertainment (Anderson et al., 2013).

Motor learning support systems model human movements, taking into account the underlying learning mechanisms. While computational models have been proposed for simple forms of motor learning (Emken et al., 2007), modeling the processes at play in more complex skill learning remains challenging. Motor learning usually refers to two types of mechanisms: motor adaptation and motor skill acquisition. The former, motor adaptation, is the process by which the motor system adapts to perturbations in the environment (Wolpert et al., 2011). Adaptation tasks take place over a rather short time span (hours or days) and do not involve learning a new motor policy. The latter, motor skill acquisition, involves learning a new control policy, including novel movement patterns and shifts in speed-accuracy trade-offs (Shmuelof et al., 2012; Diedrichsen and Kornysheva, 2015). Complex skills are typically learned over months or years (Anders Ericsson, 2008; Yarrow et al., 2009).

The need for computational advances in motor learning research has recently been pointed out in the field of neurorehabilitation (Reinkensmeyer et al., 2016; Santos, 2019). We believe that data-driven strategies using machine learning represent a complementary approach to analytical models of movement learning. Recent results in machine learning have shown impressive advances in movement modeling, such as action recognition or movement prediction (Rudenko et al., 2019). However, it is still difficult to apply such approaches to motor learning support systems. In particular, computational models must meet specific adaptation requirements in order to address the different variability mechanisms induced by motor adaptation and motor skill learning. These models have to account for both fine-grained changes in movement execution arising from motor adaptation mechanisms and more radical changes in movement execution due to skill acquisition mechanisms.

We propose a short review of the adaptation capabilities of machine learning applied to movement modeling. The objective of this review is not to be exhaustive but rather to provide an overview of recent publications on machine learning that we found significant for motor learning research. We believe that such an overview is currently missing and can offer novel research perspectives, targeting primarily researchers in the field of motor learning and behavioral sciences. In order to build the review presented in this paper, we focused on recent articles (typically <10 years old). At the time of writing (the end of 2019), we queried four online databases (Google Scholar, PubMed, arXiv, ACM Digital Library) by combining the following keywords: “Human Movement,” “Motor Model,” “Modeling/Modelling,” “Tracking,” “Control,” “Synthesis,” “Movement Generation,” “Movement Prediction,” “On-line Learning,” “Adaptation,” “Gesture Recognition,” “Deep Learning,” and “Imitation Learning.” We then compiled the papers in a spreadsheet and selected among them based on the type of model adaptation, the modeling technique, the field, and the input data considered. We summarize the review in Table 1 and identify three adaptation categories in machine learning-based human modeling: (1) Parameter adaptation in probabilistic models, (2) Transfer and meta-learning in deep neural networks, and (3) Planning adaptation by reinforcement learning. We present the selected papers according to the type of adaptation and discuss their use in motor learning research.

Table 1. Summary of the papers selected in our short survey, classified according to the type of adaptation involved in machine learning-based movement modeling.

2. Parameter Adaptation in Probabilistic Models

Research in movement recognition and generation has, for a long time, used parametric probabilistic approaches, such as Gaussian Mixture Models (GMM), Hidden Markov Models (HMM), or Dynamic Bayesian Networks (DBN). These models are characterized by a set of trained parameters that can be adapted during execution, either by providing new examples during the interaction or adapting the model parameters online according to the characteristics of the task.

GMMs have been used in robotics and HCI to learn movement trajectory models from a few demonstrations given by a human teacher (Calinon et al., 2007). In robotics, Calinon (2016) proposed such an approach to adapt the robot movement parameters when new target coordinates are set for the robot arm. The underlying model is a GMM trained from a few human movement demonstrations. In the context of movement-based interaction, Françoise et al. (2016) proposed a one-shot user adaptation process where the input movement associated with a sequence of sound synthesis parameters can be estimated from a single demonstration in order to retrain the underlying GMM. They showed that user-adapted feedback can support the consistency of movement execution but that the adaptation process is efficient only for limited movement variations. Sarasua et al. (2016) used GMMs for soft recognition of conducting gestures that can adapt easily to user idiosyncrasies. The GMM-based mapping is learned from gesture demonstrations performed while listening to the desired musical rendering. The model is able to interpolate between demonstrations but cannot account for dramatic input variations. When tasks require that the dynamics and temporal evolution of the movement be encoded, generative sequence models, such as HMMs, have been applied to gesture recognition from a few examples (Françoise and Bevilacqua, 2018) as well as movement generation (Tilmanne et al., 2012). Such adaptation techniques are often efficient when variations remain small in comparison with the overall movement dynamics.
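As a rough illustration of how a trajectory model can be learned from a few demonstrations, the sketch below fits a mixture of joint Gaussians over (time, position) and retrieves the expected position with Gaussian mixture regression. It is not the implementation used in the cited works: for brevity the components are obtained by simple time binning rather than EM, the trajectory is a synthetic sine wave, and the function names `fit_components` and `gmr` are hypothetical.

```python
import numpy as np

def fit_components(t, x, n_components=8):
    """Fit one joint Gaussian over (t, x) per time bin (binning stands in for EM)."""
    edges = np.linspace(t.min(), t.max(), n_components + 1)
    components = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (t >= lo) & (t <= hi)
        data = np.stack([t[mask], x[mask]], axis=1)
        components.append((mask.mean(), data.mean(axis=0), np.cov(data, rowvar=False)))
    return components

def gmr(components, t_query):
    """Gaussian mixture regression: expected position x given time t."""
    num = den = 0.0
    for prior, mu, cov in components:
        # Responsibility of this component for the query time
        resp = prior * np.exp(-0.5 * (t_query - mu[0]) ** 2 / cov[0, 0]) / np.sqrt(cov[0, 0])
        # Conditional mean of x given t under this joint Gaussian
        cond_mean = mu[1] + cov[0, 1] / cov[0, 0] * (t_query - mu[0])
        num += resp * cond_mean
        den += resp
    return num / den

rng = np.random.default_rng(0)
t = np.tile(np.linspace(0.0, 1.0, 100), 3)                   # three synthetic demonstrations
x = np.sin(2.0 * np.pi * t) + rng.normal(0.0, 0.05, t.size)  # noisy 1-D trajectory
model = fit_components(t, x)
prediction = gmr(model, 0.25)   # the noise-free trajectory equals 1.0 at this time
```

Retraining on a new demonstration amounts to calling `fit_components` again; because the regression blends locally linear pieces, it interpolates between what was demonstrated but has no basis for extrapolating to very different movements.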

Another approach, proposed by Caramiaux et al. (2015), consists of tracking probability distribution parameters representing input movement variations from a set of gesture templates. Tracking uses particle filtering, which updates state parameters representing movement variations (such as scale, speed, or orientation). The method can account for large, slow variations. However, the tracking method does not learn the structure of the gesture variations and forgets previously observed states.
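The tracking idea can be sketched with a bootstrap particle filter that follows a single variation parameter, here the amplitude scale of an observed gesture relative to a template. This is a simplified illustration under our own assumptions (one-dimensional signal, random-walk dynamics, Gaussian observation noise, hypothetical function name `track_scale`), not the algorithm of Caramiaux et al. (2015).

```python
import numpy as np

def track_scale(template, observed, n_particles=500, drift=0.02, obs_noise=0.1, seed=0):
    """Bootstrap particle filter estimating a time-varying amplitude scale.

    Assumes observed[t] = scale[t] * template[t] + Gaussian noise, with the
    scale following a random walk. Returns the posterior-mean scale per frame.
    """
    rng = np.random.default_rng(seed)
    particles = rng.uniform(0.5, 2.0, n_particles)   # initial scale hypotheses
    estimates = []
    for ref, obs in zip(template, observed):
        # Predict: diffuse each scale hypothesis with a small random step
        particles = particles + rng.normal(0.0, drift, n_particles)
        # Update: weight particles by the Gaussian likelihood of the observation
        log_w = -0.5 * ((obs - particles * ref) / obs_noise) ** 2
        w = np.exp(log_w - log_w.max())
        w /= w.sum()
        estimates.append(float(np.sum(w * particles)))
        # Resample: keep hypotheses in proportion to their weights
        particles = particles[rng.choice(n_particles, n_particles, p=w)]
    return np.array(estimates)

t = np.linspace(0.0, 2.0 * np.pi, 200)
template = np.sin(t) + 2.0                                   # hypothetical gesture template
noise = np.random.default_rng(1).normal(0.0, 0.1, t.size)
observed = 1.5 * template + noise                            # same gesture, performed larger
scale_estimates = track_scale(template, observed)
```

Only the current particle set is kept from one frame to the next, which is why this kind of tracker follows slow variations well but retains no memory of previously observed states.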

Finally, parametric probabilistic models can be trained online to account for new movement classes. Kulić et al. (2008, 2012) proposed an HMM-based iterative training procedure for gesture recognition and generation. The method relies on unsupervised movement segmentation, from which it automatically extracts existing and new primitives (using Kullback-Leibler divergence). This strategy enables both the fine-grained adaptation of existing motor primitives and the extension of the vocabulary of motor skills. However, unsupervised segmentation remains difficult for complex gestures, and the learning remains cumulative, with an ever-growing vocabulary rather than a continuous adaptation to motor learning. Other online strategies for segmentation with adaptive behavior are described in Kulić et al. (2009).
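The decision between "variation of a known primitive" and "new primitive" can be illustrated with a divergence threshold. The sketch below is a deliberate simplification: each segment is summarized by a single Gaussian instead of an HMM, the symmetric Kullback-Leibler divergence is computed in closed form, and the threshold value and the function names are hypothetical.

```python
import numpy as np

def gaussian_kl(mu0, cov0, mu1, cov1):
    """Closed-form KL(N0 || N1) between two multivariate Gaussians."""
    k = mu0.size
    inv1 = np.linalg.inv(cov1)
    diff = mu1 - mu0
    _, logdet0 = np.linalg.slogdet(cov0)
    _, logdet1 = np.linalg.slogdet(cov1)
    return 0.5 * (np.trace(inv1 @ cov0) + diff @ inv1 @ diff - k + logdet1 - logdet0)

def assign_segment(segment, primitives, threshold=5.0):
    """Return the index of the closest known primitive, or register a new one."""
    mu = segment.mean(axis=0)
    cov = np.cov(segment, rowvar=False) + 1e-6 * np.eye(segment.shape[1])
    # Symmetric KL divergence to every primitive in the current vocabulary
    dists = [gaussian_kl(mu, cov, m, c) + gaussian_kl(m, c, mu, cov) for m, c in primitives]
    if dists and min(dists) < threshold:
        return int(np.argmin(dists))
    primitives.append((mu, cov))      # the vocabulary grows with each novel segment
    return len(primitives) - 1

rng = np.random.default_rng(0)
primitives = []
labels = [
    assign_segment(rng.normal(0.0, 1.0, (200, 2)), primitives),  # first segment
    assign_segment(rng.normal(0.0, 1.0, (200, 2)), primitives),  # same statistics
    assign_segment(rng.normal(5.0, 1.0, (200, 2)), primitives),  # clearly shifted
]
```

A fuller version would also update the matched primitive with the new segment; that step is omitted here to keep the novelty test in focus.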

In summary, parametric adaptation enables fine-grained adaptation to task variations and restricted input movement variations. The typical use case is learning by demonstration (in human-robot interaction), or personalization (in human-machine interaction).

3. Transfer and Meta-learning in Deep Neural Networks

Transfer and meta-learning are techniques aiming to accelerate and improve the learning procedures of complex computational models, such as Deep Neural Networks (DNN). The objective is to adapt pre-trained DNN efficiently to new tasks or application domains that are unseen during training. This research is based on the literature on deep learning applied to movement modeling, which typically involves large datasets and benchmark-driven tests. The most popular approaches of this kind are Recurrent Neural Networks (RNN) (Fragkiadaki et al., 2015; Mattos et al., 2015; Alahi et al., 2016; Ghosh et al., 2017; Martinez et al., 2017; Kratzer et al., 2019; Wang and Feng, 2019), and Temporal or Spatio-temporal Convolutional Neural Networks (CNN) (Gehring et al., 2017; Li et al., 2018, 2019; Zecha et al., 2018).

3.1. Transfer Learning

Transfer learning adapts a pre-trained model on a source domain to new target tasks. Several strategies exist (Scott et al., 2018). Transfer learning for movement modeling mainly relies on embedding learning: movement features (or embeddings) are learned from the source domain, providing well-shaped features for the target domain.

Movement embeddings are learned from large movement datasets. A first strategy involves one-dimensional convolutions over the time domain (Holden et al., 2016; Rad and Furlanello, 2016). Rad and Furlanello (2016) proposed learning embeddings with temporal convolutions in order to improve diagnostic classification of autism spectrum disorder from inertial sensor data. The benefits of transfer learning were assessed on two datasets collected from the same participants three years apart. In another context, Holden et al. (2016) made use of transfer learning to synthesize movements from high-level control parameters that are easily configurable by human users. Based on pre-trained movement embeddings from motion capture data, a mapping between high-level parameters and these embeddings can be efficiently learned according to the user's needs.

Spatio-temporal convolutions can also be used to extract movement embeddings. Kikui et al. (2018) used this approach for inter- and intra-user adaptation of a gesture recognition system using photo-reflective sensor data from a headset. They showed that transfer learning improves accuracy when the number of examples per class is low (fewer than six examples per class). Also for classification, Côté-Allard et al. (2019) showed that embedding learning systematically improved the classification accuracy of EMG-based movement data; in particular, they found that embedding learning using CNNs on Continuous Wavelet Transform (CWT) features gave the best results.

Finally, RNNs can also be used to learn movement embeddings, although this is not the most common approach. In the context of human-robot interaction, Cheng et al. (2019) trained an RNN-based movement model offline and then adapted the last-layer parameters online with a recursive least squares algorithm. The goal is to adapt the robot control command to human behavior in real time.
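Adapting only the output layer of a pre-trained network reduces to an online linear regression on the frozen features, which recursive least squares (RLS) solves one sample at a time. The sketch below shows a textbook RLS update with a forgetting factor. The random `tanh` feature map merely stands in for a pre-trained network, and the class name and hyperparameters are assumptions rather than details taken from Cheng et al. (2019).

```python
import numpy as np

class RLSLastLayer:
    """Recursive least squares for a linear output layer y = W @ phi."""

    def __init__(self, n_features, n_outputs, forgetting=0.99, init_cov=100.0):
        self.W = np.zeros((n_outputs, n_features))
        self.P = init_cov * np.eye(n_features)   # inverse feature-correlation estimate
        self.lam = forgetting                     # values below 1 discount old samples

    def update(self, phi, y):
        """Update W from one (features, target) pair; returns the prediction error."""
        P_phi = self.P @ phi
        gain = P_phi / (self.lam + phi @ P_phi)
        error = y - self.W @ phi                  # error before the update
        self.W += np.outer(error, gain)
        self.P = (self.P - np.outer(gain, P_phi)) / self.lam
        return error

rng = np.random.default_rng(0)
A = rng.normal(size=(5, 5))             # frozen hidden layer (hypothetical stand-in)
W_true = rng.normal(size=(2, 5))        # output weights of the behavior to be tracked
layer = RLSLastLayer(n_features=5, n_outputs=2)
for _ in range(1000):
    phi = np.tanh(A @ rng.normal(size=5))    # features from the frozen layers
    layer.update(phi, W_true @ phi)          # online adaptation of the last layer only
```

Each update costs a few matrix-vector products, which is what makes this form of adaptation compatible with real-time control.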

In summary, transfer learning of movement features has been proposed (1) to enable interactive movement generation or (2) to improve classification performance. Several problems remain to be addressed, especially in the context of motor learning. First, it is unclear how the model architecture and the size of the training set of the transfer task affect the approach. Second, the extent to which successive transfers would provoke dramatic forgetting of previously transferred tasks remains unexplored.

3.2. Meta-learning

Meta-learning designates the ability of a system to learn how to learn by being trained on a set of tasks (rather than a single task), so that it learns faster (with fewer examples) on unseen tasks. Meta-learning is close to transfer learning, but, while transfer learning aims to use knowledge from a source application domain in order to improve or accelerate learning in a target application domain, meta-learning improves the learning procedure itself in order to handle various application domains.

Meta-learning of movement skills has been proposed in robotics and human-robot interaction to efficiently train robot actions from one or a few demonstrations. Duan et al. (2017) proposed a one-shot imitation learning algorithm where a regressor is trained against the output actions to perform the task, conditioned by a single demonstration sampled from a given task. This approach is close to the regression-based technique previously presented in section 2 but is formalized on a set of tasks. For example, the task could be to train the robot arm to stack a variable number of physical blocks among a variable number of piles. The evaluation relies on tests with demonstrations that were either seen or unseen during training. Their results showed that the robot performed equally well with seen and unseen demonstrations.

Adaptation through meta-learning in motor learning has also been investigated with the model-agnostic meta-learning (MAML) method (Finn et al., 2017a), which allows faster weight adaptation to new examples representing a task. Finn et al. (2017b) and Yu et al. (2018) extended the MAML approach for one-shot imitation learning by a robotic arm. Finn et al. (2017b) first demonstrated that vision-based policies can be fine-tuned from one demonstration. They conducted experiments using two types of tasks (pushing an object and placing an object on a target) on both a simulated and a real robot using video-based input data. Their results outperformed previous results (see for instance Duan et al., 2017) in terms of the number of demonstrations needed for adaptation. Yu et al. (2018) then addressed the problem of one-shot learning of motor control policies with domain shift. Their experiments on simulated and real robot actions showed good results on tasks, such as pushing, placing, and picking-and-placing objects.
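The core of MAML is a two-level update: an inner gradient step adapts the parameters to one task, and an outer step improves the shared initialization by differentiating through that inner step. The toy sketch below applies this to linear regression, where the second-order term is available in closed form. The task distribution, learning rates, and function names are illustrative assumptions and are unrelated to the robotic experiments cited above.

```python
import numpy as np

def loss_grad(theta, X, y):
    """Gradient of the mean squared error for the linear model y = X @ theta."""
    return 2.0 * X.T @ (X @ theta - y) / len(y)

def maml_step(theta, tasks, inner_lr=0.1, outer_lr=0.05):
    """One MAML meta-update over tasks given as (X_support, y_support, X_query, y_query)."""
    meta_grad = np.zeros_like(theta)
    for Xs, ys, Xq, yq in tasks:
        adapted = theta - inner_lr * loss_grad(theta, Xs, ys)   # inner adaptation step
        hessian = 2.0 * Xs.T @ Xs / len(ys)                      # exact for a quadratic loss
        # Chain rule through the inner step: d(adapted)/d(theta) = I - inner_lr * hessian
        jacobian = np.eye(theta.size) - inner_lr * hessian
        meta_grad += jacobian @ loss_grad(adapted, Xq, yq)
    return theta - outer_lr * meta_grad / len(tasks)

rng = np.random.default_rng(0)
center = np.array([2.0, -1.0])     # hypothetical structure shared by all tasks

def sample_task(n=10):
    """Draw a regression task whose weights scatter around the shared center."""
    w = center + rng.normal(0.0, 0.3, 2)
    Xs, Xq = rng.normal(size=(n, 2)), rng.normal(size=(n, 2))
    return Xs, Xs @ w, Xq, Xq @ w

theta = np.zeros(2)
for _ in range(500):
    theta = maml_step(theta, [sample_task() for _ in range(5)])
```

After meta-training, `theta` sits close to the center of the task distribution, so a single inner step on a handful of examples is enough to fit a new task. With neural networks the Jacobian is obtained by automatic differentiation instead of the closed form used here.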

The MAML method has also found applications in human motion forecasting (Gui et al., 2018), for which large amounts of annotated motion capture data are typically needed. The authors proposed an approach that combines MAML and model regression networks (Wang and Hebert, 2016; Wang Y.-X. et al., 2017), allowing a good generic initial model to be learned and enabling efficient adaptation to unseen tasks. They showed that the model outperforms baselines with five examples of motion capture data of walking.

4. Adaptation Through Reinforcement Learning

Reinforcement Learning (RL) enables robotic agents to acquire new motor skills from experience, using trial-and-error interactions with their environment (Kober et al., 2013). Contrary to the imitation learning approaches discussed in section 2, where expert demonstrations are used to train a model encoding a given behavior, RL relies on objective functions that provide feedback on the robot's performance.

Most approaches to imitation learning rely on a supervised paradigm where the model is fully specified from demonstrations without subsequent self-improvement (Billard et al., 2016). To ensure good task generalization, imitation learning requires a significant number of high-quality demonstrations that provide variability while ensuring high performance. While RL can achieve impressive performance, the learning process is often very slow and can lead to unnatural behavior. A growing body of research investigates the combination of these two paradigms to improve the models' adaptation to new tasks, making the learning process more efficient and improving the generalization of the tasks from a few examples.

Demonstrations can be integrated in the RL process in various ways. One approach consists of initializing RL training with a model learned by imitation (Kober et al., 2013), typically from demonstrations given by a human teacher. These demonstrations are used to generate initial policies for the RL process, enabling robots to rapidly learn to perform tasks such as reaching, ball-in-a-cup, playing pool, or manipulating a box. A second strategy consists of deriving cost functions from demonstrations, for instance using inverse reinforcement learning (Kolter et al., 2008; Finn et al., 2016). Finn et al. (2016) showed that using 25–30 human demonstrations (by direct manipulation) to learn the cost function was sufficient for the robot to learn how to perform dish placement and pouring tasks.

Building upon the success of Generative Adversarial Networks in other fields of machine learning, Generative Adversarial Imitation Learning (GAIL) has been proposed as an efficient method for learning movement from expert demonstrations (Ho and Ermon, 2016). In GAIL, a discriminator is trained to discriminate between expert trajectories (considered optimal) and policy trajectories generated by a generator that is trained to fool the discriminator. This approach was then extended to reinforcement learning through self-imitation, where optimal trajectories are defined by previous successful attempts (Guo et al., 2018). Several extensions of the adversarial learning framework were proposed to improve its stability or to handle unstructured demonstrations (Hausman et al., 2017; Wang Z. et al., 2017). These recent approaches have been evaluated on a standard set of tasks using simulated environments, in particular OpenAI Gym MuJoCo (Todorov et al., 2012), including continuous control tasks, such as inverted pendulums, 4-legged walk, and humanoid walk.

Recently, Zhu et al. (2018) proposed simultaneous imitation and reinforcement learning through a reward function that combines GAIL and RL. Zhu et al. (2018) evaluated their approach on several manipulation tasks (such as lifting and stacking blocks, clearing a table, and pouring) with a robot arm. Demonstrations were performed using a 3D controller, the training was done in a simulated environment, and the tasks were performed by a real robot arm. In comparison with GAIL or RL alone, the evaluation shows that the combination learns faster and achieves better performance.

5. Discussion

This paper reviews three types of adaptation in machine learning applied to movement modeling. In this section, we discuss how adaptive movement models can be used to support motor learning, including both motor adaptation and motor skill acquisition.

First, motor adaptation mechanisms involve variations of an already-trained skill. Computationally, motor adaptation can be seen as an optimization process that learns and cancels external effects in order to return to baseline (Shadmehr and Mussa-Ivaldi, 1994). Accounting for these underlying variations requires rapid mechanisms and robust statistical modeling. Probabilistic model parameter adaptation (section 2) appears to be a good candidate for understanding the movement variability induced by motor adaptation processes. However, while motor adaptation has been widely studied, very little is known about the statistical structure of motor adaptation, particularly trial-to-trial motor variability (Stergiou and Decker, 2011). Transfer learning could also be used: pre-trained models (RNNs or CNNs) that capture some structure of movement parameters (i.e., low-dimensional subspaces of the parameter space) can be adapted online for fine-grained variations. Here, open questions concern how such variations can account for structural learning in motor control (Braun et al., 2009).
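The description of motor adaptation as a process that progressively cancels an external perturbation can be made concrete with a single-rate, error-driven state-space model, a common simplification in the motor adaptation literature. The sketch below is only an illustration under our own assumptions: the retention and learning-rate values are arbitrary, the perturbation schedule is synthetic, and the function name is hypothetical.

```python
import numpy as np

def simulate_adaptation(perturbation, retention=0.98, learning_rate=0.2):
    """Trial-by-trial adaptation: the internal state learns to cancel the perturbation."""
    state = 0.0
    errors = []
    for p in perturbation:
        error = p - state                  # residual perturbation felt on this trial
        errors.append(error)
        # Partial retention of the previous compensation plus an error-driven correction
        state = retention * state + learning_rate * error
    return np.array(errors)

# Baseline trials, then a constant perturbation, then washout
perturbation = np.concatenate([np.zeros(20), np.ones(100), np.zeros(50)])
errors = simulate_adaptation(perturbation)
```

In this simulation the error jumps when the perturbation is introduced, decays as the compensation builds up, and changes sign during washout (the aftereffect). A movement model meant to support motor adaptation has to track this kind of gradual, trial-to-trial drift.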

Second, more dramatic changes in movement patterns, as induced by learning new motor skills, might require computational adaptation that involves re-training procedures. Transfer and meta-learning (section 3) involve the adaptation of high-capacity movement models to new tasks and could be used in this context. One difficulty is to assess to what extent transferring a given model to new motor control policies would induce the model to forget past skills. For instance, it was found that movement models relying on deep neural networks might lead to catastrophic forgetting (Kirkpatrick et al., 2017). Also, meta-learning algorithms, such as MAML (Finn et al., 2017a), are currently not suitable for adaptation to several new motor tasks. Self-imitation and reinforcement mechanisms (section 4) could help to generalize to a wider set of tasks. A current challenge is to learn suitable action selection policies. Although exploration-exploitation is known to be central in motor learning (Herzfeld and Shadmehr, 2014), it is still unclear what process drives action selection in the brain (Carland et al., 2019; Sugiyama et al., 2020). These approaches still need to be experimentally assessed in a motor learning context.

Finally, the last challenge that we want to raise in this paper regards the continuous evolution of motor variation patterns. Motor execution may vary continuously over time due to skill acquisition and morphological changes. Accounting for such open-ended tasks may require new forms of adaptation, such as continuous online learning, as proposed by Nagabandi et al. (2018). We think that this is a promising research direction, raising the central question of computation and memory in motor learning (Herzfeld et al., 2014).

In closing, to be integrated into motor learning support systems, the aforementioned machine learning approaches should be combined with adaptation mechanisms that aim to generalize models to new movements and new tasks efficiently. We do not advocate adaptive machine learning solely as a means of explaining motor learning processes. Rather, we propose adaptation procedures that can account for variation patterns observed in behavioral data, leading to performance improvements in motor learning support systems.

Author Contributions

BC made the first draft. All authors contributed to the manuscript, adding content, and revising it critically for important intellectual content.

Funding

This research was supported by the ELEMENT project (ANR-18-CE33-0002) and the ARCOL project (ANR-19-CE33-0001) from the French National Research Agency.

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

Alahi, A., Goel, K., Ramanathan, V., Robicquet, A., Fei-Fei, L., and Savarese, S. (2016). “Social LSTM: human trajectory prediction in crowded spaces,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (Las Vegas, NV), 961–971. doi: 10.1109/CVPR.2016.110