Edited by: Rava A. Da Silveira, Ecole Normale Supérieure, France
Reviewed by: Robert C. Wilson, Princeton University, USA; Gianluigi Mongillo, Paris Descartes University, France
*Correspondence: Alaa A. Ahmed, Neuromechanics Laboratory, Department of Integrative Physiology, University of Colorado, Boulder, 1725 Pleasant St., UCB 354, Boulder, CO 80309-0354, USA e-mail:
This article was submitted to the journal Frontiers in Computational Neuroscience.
This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
Risk frames nearly every decision we make. Yet, remarkably little is known about whether risk influences how we learn new movements. Risk-sensitivity can emerge when there is a distortion between the absolute magnitude of an outcome (its actual value) and how much an individual values that outcome (its subjective value). In movement, this translates to the difference between a given movement error and its consequences. Surprisingly, how movement learning can be influenced by the consequences associated with an error is not well-understood. It is traditionally assumed that all errors are created equal, i.e., that adaptation is proportional to an error experienced. However, not all movement errors of a given magnitude have the same subjective value. Here we examined whether the subjective value of error influenced how participants adapted their control from movement to movement. Seated human participants grasped the handle of a force-generating robotic arm and made horizontal reaching movements in two novel dynamic environments that penalized errors of the same magnitude differently, changing the subjective value of the errors. We expected that adaptation in response to errors of the same magnitude would differ between these environments. In the first environment, Stable, errors were not penalized. In the second environment, Unstable, rightward errors were penalized with the threat of unstable, cliff-like forces. We found that adaptation indeed differed. Specifically, in the Unstable environment, we observed reduced adaptation to leftward errors, an appropriate strategy that reduced the chance of a penalizing rightward error. These results demonstrate that adaptation is influenced by the subjective value of error, rather than solely the magnitude of error, and therefore is risk-sensitive. In other words, we may not simply learn from our mistakes; we may also learn from the value of our mistakes.
Effective movement relies largely on adaptation: the process of correcting control from movement to movement, which, critically, is driven by movement error (Topka et al.,
A distortion between the magnitude (actual) and the subjective value of an error suggests a risk-sensitive decision-making process in the brain (Bernoulli,
Adaptation is frequently studied in humans by exposing them to novel dynamic environments (Lackner and Dizio,
We sought to determine if we could alter adaptation patterns by manipulating the subjective value of movement errors with identical magnitude. If adaptation to errors with identical magnitude differs with the subjective value of the error, this would quantitatively demonstrate that adaptation is risk-sensitive and influenced by the subjective value of error.
To examine the influence of the subjective value of movement error on adaptation, we created a task in which we modulated the subjective value associated with a movement error of a given magnitude. Participants made reaching movements while holding the handle of a force-generating robot arm (Figure
The first dynamic environment, Stable, was a velocity-dependent force field, which perturbed their reaching movements and required participants to compensate in order to reach the target. The force field pushed the handle to the left, away from the target which was directly ahead. To reach the target accurately, participants had to push to the right to effectively cancel out the perturbation. The perturbation generated by the robot changed in magnitude but not direction from trial-to-trial. This is a well-studied paradigm where results have consistently demonstrated that healthy adults adapt in a manner proportional to movement error magnitude (Shadmehr and Mussa-Ivaldi,
In the second dynamic environment, Unstable, movement errors in the right half of the screen were heavily penalized, thereby altering the subjective value of an error of a given magnitude relative to the Stable environment. The Unstable environment was identical to the Stable environment, except that we imposed a boundary on the right side of the screen, simulating a virtual cliff. Errors to the right of this boundary would lead to instability: large rightward perturbing forces that participants could not compensate for within that trial (Figure
If adaptation was influenced by an individual's subjective value of the error, then the relationship between the magnitude of error experienced on one trial, and the adaptation observed on the following trial in response to that error would differ between movement environments. To predict the nature of this difference, we developed a risk-sensitive model of movement adaptation by building upon a commonly-used model of movement adaptation that predicts proportional adaptation to movement error (Thoroughman and Shadmehr,
Here,
In a risk-sensitive formulation, the amount of adaptation can depend on the subjective value associated with a given error:
While this function can theoretically take a variety of forms, we will use a simple piecewise linear function to differentially weight rightward and leftward errors, similar to functions used to describe increased sensitivity to positive vs. negative rewards (Niv et al.,
Clearly, when α = β, rightward and leftward errors are valued equally and there is no distortion between the magnitude and subjective value of an error. Effectively, we are back to (2), which can now be described as risk-neutral adaptation. However, in the Unstable phase, the addition of a cliff on the right creates a subjective error-value function that explicitly penalizes rightward errors more than leftward errors such that α > β > 0. It can now be seen how a distortion between the magnitude and subjective value of an error will manifest as risk-sensitivity in the learning process which arises from sensitivity to outcome variance. The non-linear transformation of movement errors results in asymmetric learning from rightward and leftward errors. This asymmetry will lead to outcome variance being penalized or favored.
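The asymmetric weighting described above can be illustrated with a minimal simulation sketch. It is not the authors' implementation: the update rule, the learning rate, the set of gains other than −40, and the numerical values of α and β are all assumptions chosen for illustration. Rightward errors are taken as positive and leftward errors as negative.

```python
import random


def subjective_value(error, alpha, beta):
    """Piecewise-linear subjective value of a signed error.

    Rightward errors (error > 0) are weighted by alpha,
    leftward errors (error <= 0) by beta.
    """
    return alpha * error if error > 0 else beta * error


def simulate_adaptation(gains, alpha, beta, rate=0.2):
    """Trial-by-trial adaptation to a sequence of perturbation gains.

    Assumed (hypothetical) update rule: the error on each trial is the
    uncompensated part of the perturbation, and the internal estimate is
    corrected in proportion to the subjective value of that error.
    """
    estimate = 0.0
    errors = []
    for gain in gains:
        error = gain - estimate          # negative = leftward, in gain units
        errors.append(error)
        estimate += rate * subjective_value(error, alpha, beta)
    return errors


# Illustrative random gain sequence (values other than -40 are assumptions).
random.seed(0)
gains = [random.choice([-40, -30, -20, -10, 0]) for _ in range(400)]

neutral = simulate_adaptation(gains, alpha=1.0, beta=1.0)  # alpha = beta
averse = simulate_adaptation(gains, alpha=2.0, beta=1.0)   # alpha > beta > 0

mean_neutral = sum(neutral) / len(neutral)
mean_averse = sum(averse) / len(averse)
```

In this sketch, setting α > β makes the simulated learner correct rightward errors more strongly than leftward ones, so its average error shifts leftward relative to the risk-neutral case (α = β).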
The risk-sensitive model described above was used to simulate adaptation to a random sequence of perturbation gains and the results are shown in Figure
In the present experiment, our independent variable is subjective value, which is modulated by the experimental environment: Stable vs. Unstable. Therefore, we compared adaptation to identical error magnitudes
Twenty healthy right-handed participants were recruited for the study. Nine participants performed the main experiment (mean ±
The task involved making horizontal reaching movements while seated and grasping the handle of a robotic arm (Interactive Motion Technologies, Shoulder-Elbow Planar Robot). The position of the handle was provided on each trial in the form of a yellow, circular cursor on a computer screen in front of the participants (Figure
The experimental protocol consisted of four phases: Baseline, Stable, Unstable, and Washout (Figure
In the above equation,
The divergent force field was present only in the region to the right of the white line, and was accompanied by an audiovisual cue that the participants had crossed the line (a bright red screen with black text indicating that the participants had crossed the “cliff”). While no explicit instructions were given regarding the presence of the white line, participants inevitably crossed the line, which resulted in experiencing the cliff-like dynamics. The divergent field was only present for the initial 50 trials of the Unstable phase. After the initial 50 trials the divergent field was removed, leaving only the audiovisual cue to indicate that the participants had crossed the line. For the remaining trials in the Unstable phase, participants were not adapting in the presence of a true instability, but rather the threat of instability. Importantly, the distance to the cliff was specifically chosen to be greater than the majority of movement errors, such that participants rarely crossed the cliff but were merely alerted to its presence. The Unstable phase was followed by the Washout phase, which consisted of 50 null (no force) trials to wash out adaptation to the novel dynamics.
When participants are exposed to environments that are unstable throughout the workspace, they adapt by increasing joint stiffness through muscle coactivation (Burdet et al.,
Robot handle position, handle velocity, and robot generated force were recorded at 200 Hz. Movement error was calculated as the perpendicular displacement from a line connecting the start and target circles. Movement error was measured early in the reaching movement (5 cm in the y-direction from the center of the home circle, Figure
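The error measure described above (perpendicular displacement from the start-to-target line, sampled 5 cm into the reach) could be computed along the following lines. This is a sketch under stated assumptions: positions are (x, y) pairs in cm, rightward displacement is taken as positive, and the function names and example coordinates are hypothetical.

```python
import math


def perpendicular_error(point, start, target):
    """Signed perpendicular distance (cm) of `point` from the start-target line.

    Positive values lie to the right of the direction of travel,
    negative values to the left.
    """
    dx, dy = target[0] - start[0], target[1] - start[1]
    px, py = point[0] - start[0], point[1] - start[1]
    length = math.hypot(dx, dy)
    # 2D cross product of the point offset with the movement direction.
    return (px * dy - py * dx) / length


def error_at_depth(trajectory, start, target, depth=5.0):
    """Error at the first sample that is at least `depth` cm ahead of start (in y).

    Returns None if the trajectory never reaches that depth.
    """
    for point in trajectory:
        if point[1] - start[1] >= depth:
            return perpendicular_error(point, start, target)
    return None


# Hypothetical reach toward a target straight ahead of the home position.
start, target = (0.0, 0.0), (0.0, 10.0)
trajectory = [(0.0, 0.0), (0.5, 2.5), (1.0, 5.0), (0.5, 7.5)]
early_error = error_at_depth(trajectory, start, target)
```

With 200 Hz sampling, the first sample at or beyond the 5 cm depth is a close approximation to the exact crossing point; interpolating between samples would be a straightforward refinement.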
To test our predictions we analyzed adaptation in three different ways, and compared adaptation between phases. Although our predictions are based on the relationship between movement error and adaptation, it is difficult to control for movement error experimentally. Instead, we controlled for trial gain, as a proxy for movement error, and thereby ensured an equal number of trials for each gain. It is not unreasonable to do so, as previous studies have shown a linear relationship between gain and error (Fine and Thoroughman,
Adaptation was calculated based on the error observed in each trial, as a function of the gain experienced on the previous trial. Because the gain on each trial influenced the movement error on that trial, it was necessary to normalize errors before quantifying adaptation. Movement errors were normalized to the average error for the gain experienced on that trial. In order to quantify the influence of the gain on the current trial,
Adaptation to the gain,
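As a rough illustration of the gain-based analysis, the sketch below normalizes each error by the average error for the gain on that trial and then groups the normalized error by the gain on the previous trial. Treating "normalized" as subtraction of the per-gain mean is an assumption, as are the function names and the example data; the paper's exact definition of adaptation may differ.

```python
from collections import defaultdict


def mean_by_key(pairs):
    """Average the values in (key, value) pairs separately for each key."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for key, value in pairs:
        sums[key] += value
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


def adaptation_by_previous_gain(gains, errors):
    """Mean normalized error on trial n, grouped by the gain on trial n - 1.

    Assumption: normalization subtracts the mean error observed for the
    gain experienced on the same trial.
    """
    mean_error = mean_by_key(zip(gains, errors))
    normalized = [e - mean_error[g] for g, e in zip(gains, errors)]
    return mean_by_key(zip(gains[:-1], normalized[1:]))


# Hypothetical gains (Ns/m) and movement errors (cm) for four trials.
example_gains = [-10, -20, -10, -20]
example_errors = [-1.0, -3.0, -0.5, -2.0]
result = adaptation_by_previous_gain(example_gains, example_errors)
```

Removing the per-gain mean isolates the part of the error on a trial that is not explained by the perturbation on that same trial, which is the part attributable to what happened on the trial before.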
We also sought to calculate adaptation in another manner, to confirm that the results were not dependent upon our specific definition of adaptation, provided in (8). To do so, we used the following state-space model (Fine and Thoroughman,
The output of this model,
Because our predictions are specifically based on adaptation to movement errors, a third analysis was performed on the experimental data, grouping the adaptation by experimental movement error rather than by trial gain. Trials were sorted into eleven 0.5 cm bins ranging from −3.0 cm through 2.5 cm. Leftward errors greater than 3.0 cm and rightward errors greater than 1.5 cm were excluded from the analysis. The size and range of the bins were selected to ensure an even distribution of trials between the bins, while minimizing the number of trials excluded from the analysis. Specifically, if there were an anomalously small number of trials in a given bin (<50), and any subject did not have a single trial in that bin, the bin was removed from the analysis.
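The binning and bin-removal rule can be sketched as follows. The bin edges follow the eleven 0.5 cm bins starting at −3.0 cm stated above; the function names, the data layout (participant, error) and the literal reading of the removal rule (fewer than 50 trials and at least one participant with no trial in the bin) are assumptions made for illustration.

```python
import math


def bin_index(error_cm, low=-3.0, width=0.5, n_bins=11):
    """Index of the 0.5 cm bin containing `error_cm`, or None if out of range."""
    if error_cm < low or error_cm >= low + width * n_bins:
        return None
    return int(math.floor((error_cm - low) / width))


def usable_bins(trials, min_trials=50, n_bins=11):
    """Bins kept for analysis, given trials as (participant, error_cm) pairs.

    A bin is dropped when it holds fewer than `min_trials` trials and at
    least one participant contributed no trial to it.
    """
    counts = [0] * n_bins
    seen = [set() for _ in range(n_bins)]
    participants = {p for p, _ in trials}
    for participant, error_cm in trials:
        i = bin_index(error_cm)
        if i is not None:
            counts[i] += 1
            seen[i].add(participant)
    return [
        i for i in range(n_bins)
        if not (counts[i] < min_trials and seen[i] != participants)
    ]


# Hypothetical data: two participants, one well-populated bin, one sparse bin.
example_trials = [("a", -2.9)] * 60 + [("b", -2.9)] * 5 + [("a", 0.1)] * 3
kept = usable_bins(example_trials)
```

Using the same bin edges for every participant keeps the comparison between phases aligned on identical error magnitudes.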
To further explore our hypotheses regarding subjective value, a second experiment was conducted in which the experimental setup was reversed. The cliff edge was placed to the left of the center of the screen, and the associated divergent force field pushed the handle to the left if the cursor was placed to the left of the cliff. The curl force field now pushed the handle to the right such that participants were still pushed away from the cliff edge as in the main experiment. Otherwise all experimental parameters remained the same, such as trial number, trial type, force magnitude and distribution. Six naive participants completed this reverse experiment and we compared adaptation between the Stable and Unstable conditions. In this experiment, because the placement of the cliff increased the subjective value of leftward errors relative to rightward errors, the predictions were reversed, although the hypotheses conceptually remained the same. Participants were expected to reduce adaptation to rightward errors (away from the cliff) and/or increase adaptation to leftward errors (toward the cliff).
To explore the possibility that participants might reduce adaptation as a result of repeated exposure to the distribution of gains, a control experiment was conducted. Naive participants performed the same reaching movements as in the main experiment, with an identical set of gains, but without the presence of the unstable cliff. In this experiment, the white line that represented the edge of the cliff never appeared, nor did the divergent forces that were present beyond the cliff edge in the Main and Reverse experiments. The experiment consisted of four phases: Baseline, Early Stable, Late Stable, and Washout. The Baseline and Washout phases were identical to those in the main experiment, while the Early and Late Stable phases in the Control experiment consisted of a total of 600 reaching trials dynamically identical to those in the Stable phase of the main experiment. Since there was no Unstable phase in this experiment, we renamed the phases for clarity. The Early Stable and Late Stable phases correspond, in terms of trial number, to the Stable and Unstable phases of the main experiment: the first 200 force field trials and the last 350 force field trials, respectively.
In this study we altered the subjective value of error between the Stable and Unstable phases. To determine whether subjective value of error influenced adaptation, we used a linear mixed effects regression model to analyze the error and adaptation observed in each experiment. The mixed effects model was selected because of its ability to consider changes between the phases of the experiment while considering intra-participant variability. It is called a “mixed” effects model because it models both random and fixed effects. In this case, the within subject (random) and between subject (fixed) effects are both taken into account in the final output of the model. First, we determined whether gain was a suitable proxy for error by including gain as a factor. We also included phase as a factor to test whether there was a difference in error between the two phases. A phase by gain interaction term was also included to determine whether there were differential effects of phase at individual gains. To confirm that adaptation was influenced by gain we included gain in the model as a factor. We also included phase as a factor to determine whether phase influenced adaptation. Finally, to determine whether adaptation to the larger gains (leftward errors) and/or smaller gains (rightward errors) were differentially affected by phase, we also included a phase by gain interaction term. Planned comparisons were carried out on adaptation between phases to the strongest and weakest gains or to the most leftward and rightward errors (for the error-based adaptation). For the model gain-based adaptation, sensitivity, rather than gain, was included as a factor as well as a sensitivity by phase interaction term. Similarly, for the error-based behavioral adaptation, movement error, rather than gain, was included as a factor as well as an error by phase interaction term. The level for statistical significance was set at α = 0.05.
In order to determine whether the subjective value of an error can modulate adaptation we quantified adaptation to random perturbations during reaching movements in two novel dynamic environments: (1) a stable environment, and (2) an unstable environment in which we altered the subjective value of rightward movement errors compared with leftward errors (Figure
We began our analysis by examining the movement error for each gain. In both phases, rightward errors greater than 2.5 cm were rare. Trial movement error was grouped by gain into bins and separated by phase. The results of the linear mixed effects regression model indicated that there was a main effect of gain (
If adaptation differed between the two phases, this should influence the average movement trajectories and, accordingly, the average movement error. Specifically, if adaptation to leftward errors was reduced and adaptation to rightward error increased in the Unstable phase, then movement errors should be more leftward, away from the cliff. Indeed, the linear mixed effects regression model also indicated there was a main effect of phase. In other words, there was a significant difference in movement error between phases (
Participants moved with similar velocities in both conditions (paired
In the Stable phase, the average adaptation plotted as a function of the gain on the previous trial displays a linear relationship (Figure
Because the relationship between adaptation and gain appears to exhibit a non-linearity to the strongest gain (−40 Ns/m), we also explored the possibility that the reduction in adaptation to the strongest gain was significant enough to alone cause the change in the slopes of these curves. An identical analysis was performed on the data set after removing those data associated with the strongest gain. When we excluded the data corresponding to adaptation at the strongest gain, there was no longer a significant difference in the slopes of the adaptation curves for each phase (
We next turn to the model gain-based analysis of adaptation. While the elements of the sensitivity vector,
Finally, we performed an error-based analysis. Although the experimental design did not explicitly control for error, this analysis would provide confirmation, albeit inherently variable, that adaptation to a given error differed between phases. Despite the increased variability in the results, the analysis supported the findings of the gain-based analysis. Trials were sorted by the magnitude of movement error into 11 bins, each 0.5 cm wide, ranging from −3.0 cm through 1.5 cm. The same bins were used for all participants. Similar to the gain-based analysis, effects of bin and phase were observed, as well as a bin by phase interaction (
In the main experiment, the cliff was located to the right, increasing the subjective value of rightward errors compared with leftward errors. In the Reverse experiment, we reflected the cliff location so that leftward errors had a greater subjective value than rightward errors. As in the main experiment, we compared both movement error and adaptation between the Stable and Unstable phases. As expected, there was a main effect of gain in both the error and adaptation analyses (both
To ensure that the changes in adaptation between the phases were not simply the result of prolonged exposure to the viscous curl field, a control experiment was conducted in which participants made 650 reaching movements without the presence of the unstable cliff region. Similar to the main experiment, we compared movement error and gain-based behavioral adaptation between phases. These data were also analyzed using the linear mixed effects regression model with gain and phase included as factors, and a gain by phase interaction term. As expected, there was a main effect of gain in both the error and adaptation analyses (both
Here we have presented results from three different experiments, using three different analyses demonstrating that subjective value can influence adaptation, and ultimately adapted behavior. When the cliff was on the right, adaptation to leftward errors, away from the cliff, decreased, leading to greater leftward errors (away from the cliff, Figure
In this study, we investigated the influence of the subjective value of movement error on adaptation during a novel reaching task. We found that by introducing a cliff-like region in the workspace, and thereby changing the subjective value of error, we could modulate the degree to which participants would adapt to movement errors of the same magnitude. Weaker adaptation was observed in response to movement errors away from the cliff in the Unstable phase, when such errors had lower subjective value. These results are a demonstration of a risk-sensitive process in movement adaptation, in that adaptation was influenced by the subjective value of error rather than solely the magnitude of error. Our findings indicate that we don't simply learn from our mistakes; we may also learn from how much we value our mistakes.
It is intriguing that participants primarily demonstrated reduced adaptation to leftward errors away from the cliff (the strongest gains). They could have additionally, or alternatively, demonstrated stronger adaptation to the rightward errors toward the cliff (the weakest gains). However, all three analyses consistently demonstrated weaker adaptation to leftward errors. Even in the Reverse experiment, only weaker adaptation to rightward errors (away from the cliff) was observed. Why not use both strategies? First, let us emphasize that both strategies lead to overall under-compensation for the force field, and increasing errors away from the cliff. This is not surprising, as one would like to avoid the cliff as much as possible. However, the target must still be reached, and greater errors may compromise one's ability to reach the target within the time constraints. Participants may have realized that weaker adaptation to the leftward gains, under-compensating for only the strongest gains, sufficiently allowed them to avoid the cliff, yet reduce leftward errors as well. Under-compensation for all the gains would have reduced the possibility of crossing the cliff boundary even more, but at the expense of increased leftward errors. By reducing adaptation and under-compensating only to the largest gains, participants were effectively optimizing a tradeoff between performing the task and avoiding the worst-case scenario.
A critical element of the experimental design is that participants changed how they adapted without regularly experiencing the penalty. Because the forces experienced in the Stable phase caused movement errors away from the cliff region, participants rarely experienced errors large enough in magnitude to result in the cursor entering the unstable region. It was merely the threat of instability that led to changes in adaptation, not the instability itself or the surprise associated with the initial experience with the instability. The instability was actually not present for most of the experiment. These results demonstrate that adaptation can be modulated indirectly without explicitly constraining movement.
We propose our observation of a difference in adaptation is evidence of risk-sensitivity in the learning process, where risk-sensitivity is defined as sensitivity to the variance over outcomes (i.e., error). Although we do not explicitly modulate the error variance in this task, a distortion between the subjective value and actual value of an error will manifest as risk-sensitive behavior. Thus, a strong prediction that emerges from these results is that increasing or decreasing the variance of error in an environment that resembles the cliff-like environment created in the present study, will alter the adaptation process.
Recent studies have shown that subtle changes in the properties of a given movement error can drive distinct changes in properties of the adaptation process (such as savings and generalization to other movement contexts; Kluzik et al.,
The subjective value of an error could be interpreted as the reward associated with an error. Surprisingly, only a few studies have examined the influence of reward, or the role of reinforcement learning, on movement adaptation. One such study demonstrated that participants could learn a movement task using only reward feedback, in the absence of sensory feedback of the error (Izawa and Shadmehr,
While risk-sensitivity has been assessed in single movements (Wu et al.,
In the present study, subjective value was modulated by increasing the penalty associated with a given movement error. In other words, we provided negative rewards, and cannot necessarily extrapolate our findings to conditions where subjective value is altered by modulating positive rewards. A mounting body of evidence over the past few years has indicated that positive and negative reward differentially affect decision making. Results of a recent study seem to indicate that participants relied more heavily on positive feedback (Averbeck et al.,
It may be argued that the risk-sensitive behavior demonstrated in this task emerges only because we have explicitly designed it to, and does not represent natural movement tasks. In other words, humans do not inherently penalize rightward errors more than leftward errors; we explicitly designed a task that did just that. How then is this relevant to motor control? Such a distortion between the actual and subjective value of a movement error is inherent to many activities of daily living. The simple reaching movements frequently studied in laboratories over the past couple of decades do not normally demonstrate this distortion, so this phenomenon has been largely overlooked. But postural movements, which are inherently unstable and constrained within a given base of support, are a natural example of a situation where the magnitude of an error does not correspond to the subjective value of that error. A 2 cm movement error within the base of support is very different than the same 2 cm error that moves the center of mass beyond the base of support. The latter will result in a loss of balance; the former will not. Results from a recent study suggest that postural learning is modulated asymmetrically by stability limits, suggesting that adaptation may be risk-sensitive (Manista and Ahmed,
The finding that adaptation can be modulated by changing subjective value is of great relevance to current rehabilitation programs for patients suffering from neurological impairments such as a stroke or Parkinson's disease. It may be possible to influence the adaptation process, in a manner tailored to each patient, by simply rewarding some movements more than others. Future studies should investigate alternative means of modulating subjective value, via implicit rewards like verbal instruction, encouragement or visual feedback. Alternatively, explicit rewards such as point rewards and penalties and/or monetary compensation could be investigated.
The experimental design prevented us from investigating participants' initial adaptation to the Unstable phase. It would be of interest to know how their adaptation changed during those initial 50 trials, but because of the variability and the small sample size during this period we were unable to perform this analysis. A second limitation is that the removal of the unstable forces after the initial 50 trials in the Unstable phase may have influenced the strength of the overall effect. Participants did occasionally cross the cliff edge after the unstable forces were no longer present. While there was still an audiovisual penalty, the lack of unstable forces may have influenced the participants' avoidance of the cliff edge.
In summary, our results provide evidence for a risk-sensitive process underlying movement adaptation. Adaptation can be altered by modulating an individual's subjective error value function. The implications of these findings are far-reaching and could potentially lead to new and improved rehabilitation therapies that are tailored to each and every patient at an individual level. More generally, we hope they can lead to significant advances in our understanding of the neural mechanisms by which risk influences the learning process in both motor and non-motor tasks.
Michael C. Trent designed and performed research, analyzed data and wrote the paper. Alaa A. Ahmed designed and performed research, analyzed data, and wrote the paper.
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
This work was supported by the Defense Advanced Research Projects Agency Young Faculty Award (DARPA YFA D12AP00253) and National Science Foundation Grants SES 1230933 and CMMI 1200830.