Original Research ARTICLE
Active inference, attention, and motor preparation
- 1 The Wellcome Trust Centre for Neuroimaging, University College London, Queen Square, London, UK
- 2 Sobell Department of Motor Neuroscience and Movement Disorders, University College London Institute of Neurology, Queen Square, London, UK
Perception is the foundation of cognition and is fundamental to our beliefs and consequent action planning. The Editorial (this issue) asks: “what mechanisms, if any, mediate between perceptual and cognitive processes?” It has recently been argued that attention might furnish such a mechanism. In this paper, we pursue the idea that action planning (motor preparation) is an attentional phenomenon directed toward kinesthetic signals. This rests on a view of motor control as active inference, where predictions of proprioceptive signals are fulfilled by peripheral motor reflexes. If valid, active inference suggests that attention should not be limited to the optimal biasing of perceptual signals in the exteroceptive (e.g., visual) domain but should also bias proprioceptive signals during movement. Here, we investigate this idea using a classical attention (Posner) paradigm cast in a motor setting. Specifically, we looked for decreases in reaction times when movements were preceded by valid relative to invalid cues. Furthermore, we addressed the hierarchical level at which putative attentional effects were expressed by independently cueing the nature of the movement and the hand used to execute it. We found a significant interaction between the validity of movement and effector cues on reaction times. This suggests that attentional bias might be mediated at a low level in the motor hierarchy, in an intrinsic frame of reference. This finding is consistent with attentional enabling of top-down predictions of proprioceptive input and may rely upon the same synaptic mechanisms that mediate directed spatial attention in the visual system.
During the preparation and execution of goal-directed movements, processing is biased toward the perceptual attributes of the goal (e.g., Baldauf and Deubel, 2010; Gherri and Eimer, 2010; Humphreys et al., 2010; Perfetti et al., 2010) and preparation or execution of an action improves perceptual processing in relevant sensory domains (Fagioli et al., 2007). This suggests motor planning and attention are inherently linked, such that “perceptual codes and action plans share a common representational medium, which presumably involves the human premotor cortex” (Fagioli et al., 2007). This relates to the concept of motor attention that is specific to the effectors employed (Rushworth et al., 2001) and decision making through attentional selection among motor plans (Goldberg and Segraves, 1987). Moreover, the premotor theory of visual attention (Rizzolatti et al., 1994) proposes that distinct maps are tuned to different effector representations and become active when a movement is prepared. In short, attention has a fundamental role in the selection and control of action; see Allport (1987) for a review.
The link between action and attention was first proposed by James (1890) and Woodworth (1899); however, the cognitive and neural mechanisms responsible for this association remain largely unknown (Dalrymple and Kingstone, 2010). Greenwald (1970) provided evidence that attention to a particular sensory modality speeded movements that are detected in that modality. In the oculomotor system, visual discrimination performance is enhanced at the target location of a prepared saccade (Deubel and Schneider, 1996). Furthermore, stimulation of the superior colliculus can produce both eye movements (Robinson, 1972) and shifts of attention (Müller et al., 2005). Conversely, Craighero et al. (1999) showed that reaction times to visually presented objects are reduced when subjects grasp the objects being presented, illustrating the motor facilitation of sensory processing.
In this paper, we entertain the idea that motor attention uses exactly the same synaptic mechanisms as visual attention. This may sound strange because motor commands are usually considered to be outputs, whereas the visual channels selected by attention are inputs. However, recent theoretical treatments of motor control (active inference) regard action as being driven by proprioceptive prediction errors in exactly the same way that perception is driven by exteroceptive prediction errors (Friston et al., 2010). If true, this means that attentional modulation may operate at low levels in the motor system in the same way that it operates in the early visual system. We sought evidence for this by reproducing a classical visual attention paradigm (Posner et al., 1978; Posner, 1980) in the motor domain. Furthermore, by cueing attention to different attributes of movements we tried to locate the putative attentional modulation within the motor hierarchy. We hoped to show that attentional effects were expressed in low levels (in an intrinsic frame of reference) in much the same way that directed spatial attention operates in the early visual pathways. This paper comprises four sections. The first rehearses the theoretical background that motivated a reaction time study described in the second section. The third section presents our results, which are discussed in relation to theoretical considerations in the final section.
Active Inference and Motor Attention
In this section, we consider motor preparation as attention that is directed toward predicted proprioceptive sensations (Galazky et al., 2009), as opposed to the predicted exteroceptive consequences of action. This idea is motivated by the success of a recent computational model of attention in explaining reaction time benefits in visual detection tasks (Feldman and Friston, 2010). In this model, the effects of orienting cues on reaction times were explained by the Bayes-optimal encoding of precision in a hierarchical message-passing scheme (predictive coding). In this context, precision is the inverse variance or uncertainty associated with particular sensory channels, such that attention can be understood as weighting sensory signals in proportion to their precision (Feldman and Friston, 2010; Friston, 2010). In these predictive coding schemes, precision is encoded by the gain of units reporting bottom-up sensory information that has yet to be explained by top-down predictions. This sensory information is called prediction error and is generally associated with the activity of superficial pyramidal cells: these cells are the source of forward or bottom-up projections in the brain (Rockland and Pandya, 1979; Mumford, 1992; Friston, 2010). In these schemes, attention therefore reduces to the optimization of the postsynaptic gain of superficial pyramidal cells, of the sort associated with gamma-synchronization (e.g., Womelsdorf et al., 2006) and monoaminergic or cholinergic modulation (e.g., Herrero et al., 2008); both of which have been implicated in attention. Here, we pursue the notion that attention is the optimum weighting of prediction error in the context of action preparation (Mars et al., 2007; Bestmann et al., 2008). In short, we consider attention to boost the gain of proprioceptive channels during motor preparation, in the same way that attention selects particular visual channels when subjects prepare for a visual target.
In what follows, we will briefly review predictive coding and active inference, with a special focus on the role of attention.
Predictive Coding and Active Inference
Predictive coding is based on the assumption that the brain makes inferences about the causes of its sensations. These inferences are driven by bottom-up or forward sensory information that is passed to higher brain areas in the form of prediction errors (Rao and Ballard, 1999; Friston and Kiebel, 2009). Top-down or backward connections convey predictions that try to suppress prediction errors until predictions are optimized and prediction error is minimized. This suppression rests on opposing excitatory and inhibitory effects of top-down predictions and bottom-up inputs on prediction-error units (usually considered to be superficial pyramidal cells: Mumford, 1992). Active inference (Friston et al., 2010) generalizes this scheme and proposes that exactly the same recursive message-passing operates in the motor system. The only difference is that prediction errors at the lowest level (in the cranial nerve nuclei and spinal cord) are also suppressed by movement, through classical reflex arcs. In this view, descending (cortico-spinal) signals are not motor commands per se but predictions of proprioceptive signals that the peripheral motor system fulfills (see Friston et al., 2010, 2011 for details). As illustrated in Figure 1, a cued movement is not regarded as a simple stimulus–response mapping but is generated by a high-level (sensorimotor) percept that predicts a particular pattern of proprioceptive and exteroceptive sensory signals. This percept arises to explain prediction errors caused by a cue in the exteroceptive domain, while motor reflexes suppress the ensuing prediction errors in the proprioceptive domain. This framework has been used to explain several features of the motor system and a series of behaviors, from visual tracking (Friston et al., 2010) to action observation (Friston et al., 2011). Active inference formalizes much of what is proposed by the ideomotor theory of action (Lotze, 1852; James, 1890). 
The ideomotor account of motor control posits that moving causes a bidirectional association to be formed between a movement and its perceptual consequences. Learning this association allows the perceptual consequences of a movement to be predicted, and anticipating the sensory consequences of a movement can be used to select an action. At the level of the stretch receptors, the similarity is clear: signaling the predicted sensory consequences of an action (under active inference) causes the action to occur. At higher hierarchical levels, movements can still be initiated in order to change the sensory input in another sensory system; indeed the free-energy principle demands the sampling of predicted information to minimize free energy or, more simply, surprise. See Figure 1 for a schematic illustration.
Figure 1. Active inference and predictive coding: Active inference is a generalization of predictive coding that covers motor behaviors and itself is a special instance of the principle of free-energy minimization. Free energy is a statistical quantity that bounds the surprise (self-information) associated with sensory signals. This surprise is quantified in relation to a generative model of how those signals were caused. Predictive coding uses prediction error as a proxy for free energy (cf, surprise) and rests on a hierarchical model, in which prediction errors are passed up the hierarchy (red arrows) to optimize high-level representations that provide top-down predictions (black arrows). In this schematic, prediction-error units are portrayed in red and units encoding the conditional expectations of the hidden causes of sensory input are shown in blue. During perception, the best explanation for sensory input emerges when the top-down predictions can explain as much of the prediction error (at each hierarchical level) as possible. Active inference takes this one step further and notes that certain sensory modalities can use prediction errors to drive motoneurons to eliminate prediction error directly (through classical motor reflex arcs). This is shown schematically on the lower left, using units in the dorsal and ventral horns of the spinal cord. Under active inference, a movement just fulfills the predictions afforded by percepts that predict both exteroceptive (e.g., visual) and interoceptive (e.g., stretch receptor) consequences. This high-level (sensorimotor) percept is activated by an exteroceptive (sensory) cue and the ensuing top-down predictions propagate to both sensory cortex (to suppress exteroceptive prediction error) and the motor system. However, in the motor system, the predictions engender a proprioceptive prediction error that is eliminated by movement. 
In this schematic, we have assumed that prediction errors are reported by superficial pyramidal cells (Mumford, 1992), while conditional representations are encoded by (top-down) projecting deep pyramidal cells. Darker units highlight those activated by the presentation of a target-stimulus.
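The idea that movement itself suppresses proprioceptive prediction error can be illustrated with a toy simulation. The following is a minimal sketch, not the scheme of Friston et al. (2010): it assumes a single joint, a scalar prediction error, and hypothetical values for the update rate, precision (gain), and tolerance. It shows only that a larger gain on the prediction error brings the sensed angle to the predicted angle in fewer steps.

```python
def reflex_arc(predicted_angle, initial_angle, precision, rate=0.1, steps=200):
    """Toy reflex arc: movement reduces proprioceptive prediction error.

    predicted_angle: descending (top-down) prediction of the joint angle.
    precision: hypothetical scalar gain applied to the prediction error.
    rate, steps: illustrative integration settings (assumptions).
    Returns the trajectory of the sensed joint angle.
    """
    angle = initial_angle
    trajectory = [angle]
    for _ in range(steps):
        prediction_error = predicted_angle - angle
        # The "reflex" moves the joint so as to cancel the weighted error.
        angle += rate * precision * prediction_error
        trajectory.append(angle)
    return trajectory


def steps_to_reach(trajectory, target, tolerance=0.05):
    """Return the first step at which the angle is within tolerance of target."""
    for step, angle in enumerate(trajectory):
        if abs(target - angle) <= tolerance:
            return step
    return None


# Same predicted movement, low versus high precision (hypothetical values).
low_steps = steps_to_reach(reflex_arc(1.0, 0.0, precision=1.0), target=1.0)
high_steps = steps_to_reach(reflex_arc(1.0, 0.0, precision=2.0), target=1.0)
print(low_steps, high_steps)
```

In this sketch the number of steps stands in loosely for response latency; it is not intended as a quantitative model of reaction times.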
Attention and Active Inference
Attention enters this picture through context or state-dependent optimization of the precision of prediction errors. This sort of prediction is about the second-order statistics of sensory signals (i.e., their variability or reliability). In predictive coding, top-down first-order predictions drive (or inhibit) neurons reporting prediction errors; while contextual, second-order predictions optimize their postsynaptic gain. It is this sort of top-down effect that is associated with attention. Neurobiologically, the distinction between first and second-order predictions can be related to the distinction between the driving and modulatory effects mediated by AMPA and NMDA receptors. Optimizing postsynaptic gain ensures that sensory information (prediction error) is weighted in proportion to its precision. This may sound complicated but is exactly the same procedure we use every day in statistics, when weighting a difference in means (prediction error under the null hypothesis) by SE (inverse precision) to form a t-statistic. Precision can thus be regarded as representing the reliability, ambiguity, or uncertainty about sensory signals. In summary, top-down predictions can have a direct (first-order) or a modulatory (second-order) effect on the responses of prediction-error units that make the ensuing predictions as efficient as possible. Reaction time (Goodman and Kelso, 1980), cortico-spinal excitability (Mars et al., 2007; Bestmann et al., 2008), and EEG data (Osman et al., 1995; Mars et al., 2008) all confirm that the motor system is highly sensitive to such second-order effects.
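The t-statistic analogy can be made concrete with a short numerical sketch. The function name and sample values below are illustrative assumptions rather than data from this study. Two samples carry the same mean prediction error, but the more reliable (less variable) sample yields a much larger weighted error.

```python
import math
import statistics


def precision_weighted_error(sample, predicted_mean=0.0):
    """Weight the prediction error of a sample mean by its standard error.

    The prediction error is (sample mean - predicted mean). Dividing by the
    standard error gives a one-sample t-statistic.
    """
    n = len(sample)
    error = statistics.mean(sample) - predicted_mean
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    return error / standard_error


# Hypothetical samples: both have a mean of 1.0, but differ in variability.
reliable = [0.9, 1.0, 1.1, 1.0]
noisy = [-1.0, 3.0, 0.0, 2.0]

t_reliable = precision_weighted_error(reliable)
t_noisy = precision_weighted_error(noisy)
print(t_reliable > t_noisy)
```

The same unweighted error therefore has more influence when it arrives through a precise channel, which is the sense in which attention is described here as precision weighting.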
If ascending sensory signals are prediction errors and descending motor commands are predictions, then optimal predictions (and the resulting movements) should depend on optimizing precision in exactly the same way as in sensory processing. This suggests that, in the motor domain, cueing has a similar effect to that observed in the sensory domain: Rosenbaum (1980) first demonstrated an effect of movement cueing on reaction time in a way that is analogous to the accelerated detection of visual targets when they are preceded by valid cues in the Posner paradigm (Posner, 1980). However the movements cued in Rosenbaum (1980) were button presses, which required either visual or somatosensory attention to guide movement to the target. Thus, these non-proprioceptive aspects of button presses conflate attentional effects in visual, somatosensory, and proprioceptive domains. In other words, in previous work movements were planned in relation to an object in extra-personal space. Here, we used a simpler paradigm in which movements (wrist flexion and extension) could be performed using only proprioceptive information. This ensured that any attentional effects could be attributed to proprioception. Our motor analog of the Posner paradigm therefore allowed us to interpret our results in relation to visual attention as modeled in Feldman and Friston (2010); and to illustrate how active inference provides a framework in which to address questions about the functional anatomy of action preparation and attention.
Cueing in an Extrinsic or Intrinsic Frame of Reference?
A key question in the functional anatomy of motor attention is where biasing effects are located in the cortical hierarchy: see Grafton and Hamilton (2007) for a review of motor hierarchies. In the sensory domain, attention is usually considered to operate at the lower levels of sensory hierarchies to select among competing sensory processing channels. This is seen in both psychological (e.g., the distinction between object and spatial visual attention: Treisman, 1998; Macaluso et al., 2003) and electrophysiological treatments (e.g., biased competition models: Desimone and Duncan, 1995). If the functional anatomy of the motor hierarchy recapitulates that of sensory hierarchies, then one might expect to see attentional modulation in lower levels, which we will associate with representations in an intrinsic frame of reference.
Electrophysiological evidence demonstrates that between the ventral premotor cortex and M1 neurons change their response patterns from signaling movements in a visual (extrinsic) coordinate system that is independent of starting posture to a motor (intrinsic) coordinate system that depends on starting posture (Kakei et al., 1999, 2001, 2003). Thus in ventral premotor cortex, actions are largely encoded allocentrically, while in M1 they are predominantly encoded in terms of the joint angles and proprioceptive input required to reach the target (Soechting and Flanders, 1992). Shipp (2005) suggests that neurons representing movements in an intrinsic frame of reference send descending cortico-spinal predictions from M1. Kakei et al. (2003) provide a detailed discussion of movement representations in terms of the coordinate transformations that begin with an “extrinsic coordinate frame representing the spatial location of a target and end with an intrinsic coordinate frame describing muscle activation patterns.” It should be noted however, that the segregation of intrinsic and extrinsic representations between motor and premotor cortex may not be complete or unique (Wu and Hatsopoulos, 2007).
These observations suggest two possible levels of the motor hierarchy at which attentional cueing effects could operate. Consider movements with two dimensions or attributes that are cued in an extrinsic frame of reference; for example, moving the left or right hand (where) inward or outward (what). If attention operates at high levels of the motor hierarchy, then one might expect cues to move the hand inward will facilitate inward movements, irrespective of which hand is used. This is because the representation of the movement can be primed in extrinsic coordinates, prior to transformation to intrinsic coordinates. Conversely, if attention operates at lower levels, encoding the muscle groups involved in inward movements of the left hand, then attentional priming will only be expressed when the left hand is moved inward. In short, if attention operates on prediction errors in an intrinsic frame of reference, the effect of the what cue will depend upon the where cue.
In summary, if sensorimotor constructs mediate attentional biases in an extrinsic frame of reference, we would expect to see cueing effects on both dimensions independently. Conversely, if these representations instantiate top-down biases at a lower (intrinsic) level of the motor system, then only a particular movement (in an intrinsic frame of reference) will be cued. Figure 2 tries to make the different predictions clear in terms of top-down enabling of postsynaptic gain (indicated with blue arrows). Crucially, the profile of speeded responses (under valid and invalid cueing) is different for extrinsic and intrinsic levels of attentional gain. In the intrinsic (motor cortex) model, there should be an interaction between the validity effects of cues over both movement dimensions. Conversely, under the extrinsic (premotor cortex) model, there should be no interaction but two main effects due to the validity of both what and where aspects of the cue. It was this difference in the profile of validity effects on reaction times that our experiment was designed to reveal.
Figure 2. Different levels of attentional bias: This schematic illustrates the top-down enabling of postsynaptic gain (blue arrows) at different levels in the motor hierarchy. In the left panel, the predictions of an inward (flexion) movement of the left hand selectively bias the intrinsic prediction-error units that elicit inward movements of the left hand. This means that when a valid target-stimulus appears, these prediction errors will produce a more efficient and speeded movement (by eliciting stronger descending predictions). Conversely, if the attentional bias is mediated at the premotor (extrinsic) level, the prediction errors associated with both what and where aspects of the movement will facilitate speeded responses over both movement dimensions; e.g., all left-hand movements and all inward movements. In this figure, darker units highlight prediction-error units with increased gain. The lower graphs show the predicted profile of reaction times (under valid and invalid cueing) for cueing at extrinsic (right) and intrinsic (left) levels. In the intrinsic (motor cortex) model, there should be an interaction between the validity effects of cues over both movement dimensions. In other words, the benefit of using the expected hand will only be seen if the expected movement is required. Conversely, under the extrinsic (ventral premotor cortex) model, there should be no interaction but two main effects due to the validity of what and where aspects of the movement respectively.
Based on the results of Jentzsch et al. (2004) and the retinotopic frame of reference of attentional effects in the Posner paradigm (Woldorff et al., 1997), we hypothesize that attentional cueing operates in an intrinsic frame of reference. We therefore expected to see an interaction between the validity effects of cueing, with speeded responses when, and only when, both what and where dimensions were valid.
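The two predicted profiles can be written down as a pair of toy models over the 2 × 2 validity design. This is a minimal sketch: the baseline reaction time and the size of the cueing benefit are hypothetical numbers chosen only to show the qualitative difference, namely two additive main effects under the extrinsic model and a benefit only for jointly valid cues under the intrinsic model.

```python
def predicted_rt(hand_valid, movement_valid, model, baseline=500.0, benefit=50.0):
    """Predicted reaction time (ms) for one cell of the 2 x 2 validity design.

    baseline and benefit are hypothetical values, not estimates from data.
    """
    if model == "extrinsic":
        # Each valid cue dimension speeds the response independently.
        return baseline - benefit * hand_valid - benefit * movement_valid
    if model == "intrinsic":
        # Only the jointly cued movement of the cued hand is speeded.
        return baseline - benefit * (hand_valid and movement_valid)
    raise ValueError(f"unknown model: {model}")


def interaction_contrast(model):
    """Difference of differences across the four validity cells."""
    rt = {(h, m): predicted_rt(h, m, model) for h in (0, 1) for m in (0, 1)}
    return (rt[(1, 1)] - rt[(1, 0)]) - (rt[(0, 1)] - rt[(0, 0)])


extrinsic_interaction = interaction_contrast("extrinsic")
intrinsic_interaction = interaction_contrast("intrinsic")
print(extrinsic_interaction, intrinsic_interaction)
```

A zero contrast corresponds to the additive (extrinsic) profile, whereas a non-zero contrast corresponds to the interaction expected under the intrinsic model.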
Materials and Methods
Eight healthy right-handed volunteers (two female), aged 19–42, participated in this experiment. All subjects provided written and informed consent and the experiments were conducted in compliance with the standards established by the local ethical committee.
Experimental Procedure and EMG Recordings
Subjects were seated in a comfortable reclining chair. Their wrists were in a semi-supine position with the palms facing each other and supported by a splint that restricted wrist and hand movement to pure flexion and extension. The hand-splints were mounted on vertical spindles, which allowed rotation in the transverse plane. The hands were positioned such that the wrist joints sat directly above the axes of rotation. Additional support of the forearms further ensured that movements were constrained to the wrists, and reduced fatigue. Stimuli were viewed on a screen placed at eye level. Each trial started with a (150 ms) cue stimulus, followed by a blank screen (see Figure 3). Seven hundred milliseconds after the appearance of the cue, a target-stimulus appeared for 400 ms. A 50-ms white-noise mask was presented after the cue and target stimuli to prevent the appearance of visual after-effects. Participants were given 1000 ms after the appearance of the target-stimulus to make a response. No feedback was given. At the appearance of the target-stimulus, participants were required to respond as quickly as possible with the movement indicated. Four movements were possible – flexion or extension at the left or right wrist. The cue and target stimuli had two dimensions – color (blue, red) and spatial frequency (high, low). For four of the participants, the color of the stimulus cued the hand (e.g., blue = left, red = right) and the spatial frequency indicated the movement (e.g., high frequency = flexion, low frequency = extension). For the remaining four, the stimulus–response mapping was reversed, so that color indicated the movement to be made and spatial frequency the hand to be used. The stimuli subtended approximately 35° of visual angle. High-frequency stimuli were 2.5 c/deg, low-frequency stimuli were 0.25 c/deg. The colors had RGB values ([128 0 0] [255 100 100]) and ([0 0 128] [100 100 255]).
Figure 3. Experimental Design: Top panel: Schematic showing the time-line of three experimental trials, which comprised cue stimuli that could be congruent (valid) or incongruent (invalid) over each of their two dimensions (what:extension vs. flexion; where:left vs. right hand). Bottom panel: Example EMG trace acquired from a single muscle, plotted with the transform used for identifying movement onset. The line shows the ad hoc threshold used to derive reaction times automatically.
Participants were required to relax their hands and lower arms until the appearance of the target-stimulus. Our paradigm independently cued which hand (right or left) would implement one of two movements (wrist flexion or extension). Each cue contained two dimensions – one signaling the hand to be moved and one the movement. For each dimension (color, spatial frequency), cue stimuli could be valid (80%) or invalid with regard to the target-stimulus (20%). Since the validity of the cue in each dimension was independent, this gave 64% (0.8 × 0.8) of trials with a completely valid cue, 32% (0.8 × 0.2 × 2) of trials where either the hand or the movement required was invalidly cued and 4% (0.2 × 0.2) of trials where both the hand and movement were cued invalidly. The experiment comprised one training session and 25 experimental sessions. Each session contained 100 trials, which were balanced for the four types of cue and four movements. The large number of trials was needed to acquire sufficient data from trials with invalid cues in both dimensions. The sessions were conducted over three separate days.
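The trial proportions above follow directly from the independence of the two cue dimensions, as the following sketch shows. The expected counts assume that the stated proportions hold exactly over all 25 × 100 experimental trials, which is an illustrative simplification.

```python
from itertools import product


def validity_proportions(p_valid=0.8):
    """Joint probability of each (hand cue valid, movement cue valid) pairing."""
    proportions = {}
    for hand_valid, movement_valid in product((True, False), repeat=2):
        p_hand = p_valid if hand_valid else 1 - p_valid
        p_movement = p_valid if movement_valid else 1 - p_valid
        proportions[(hand_valid, movement_valid)] = p_hand * p_movement
    return proportions


proportions = validity_proportions()
total_trials = 25 * 100  # experimental sessions x trials per session

# Expected trial counts per validity category (assumes exact proportions).
expected_counts = {
    condition: round(p * total_trials) for condition, p in proportions.items()
}
for condition, count in expected_counts.items():
    print(condition, count)
```

Under this assumption only about 100 trials per subject have invalid cues in both dimensions, which is consistent with the need for a large total number of trials.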
Reaction times were evaluated using surface EMG. Ag/AgCl electrodes were placed on the left and right brachioradialis/extensor carpi radialis longus and flexor carpi ulnaris muscles. Muscle activity was monitored throughout the experiment to ensure the effector muscles were relaxed before the appearance of the target-stimulus. Signals were recorded via a CED 1401 laboratory interface (Cambridge Electronic Design Ltd., Cambridge, UK) and stored on a personal computer (for later analysis) at a sample rate of 5 kHz (Signal 2.0, Cambridge Electronic Design Ltd.). Data were bandpass-filtered between 3 Hz and 2.5 kHz.
EMG data were smoothed with a Butterworth low-pass filter with a cutoff frequency of 600 Hz to increase signal-to-noise. After full-wave rectification the data were log-transformed to provide normally distributed time series for further analysis. The mean of 100 consecutive data points was compared with the mean of the preceding 5000 data points, using two-sample t-tests and a sliding window. Reaction times were defined operationally as the first time at which the absolute value of the t-statistic exceeded 50. This ad hoc threshold identified the highest number of correctly performed trials. Incorrect trials, where a muscle other than the agonist for the correct movement showed the shortest reaction time, were excluded.
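The sliding-window onset detection described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis code used in the study: the Butterworth filtering step is omitted, the two-sample t-test is assumed to use a pooled variance, the small `floor` that guards against log(0) is our own addition, and the synthetic trace and the threshold of 10 used in the example are hypothetical (the study used a threshold of 50 on real EMG).

```python
import math
import random


def rectify_and_log(samples, floor=1e-12):
    """Full-wave rectify, then log-transform; floor avoids log(0) (assumption)."""
    return [math.log(max(abs(s), floor)) for s in samples]


def detect_onset(signal, window=100, baseline=5000, threshold=50.0):
    """Return the first index where |t| exceeds threshold, else None.

    At each index i, the mean of signal[i:i + window] is compared with the
    mean of the preceding `baseline` samples using a pooled-variance
    two-sample t-statistic (the pooled form is an assumption).
    """
    # Prefix sums allow means and sums of squares of any slice in O(1).
    prefix, prefix_sq = [0.0], [0.0]
    for value in signal:
        prefix.append(prefix[-1] + value)
        prefix_sq.append(prefix_sq[-1] + value * value)

    def mean_and_ss(start, stop):
        n = stop - start
        mean = (prefix[stop] - prefix[start]) / n
        ss = (prefix_sq[stop] - prefix_sq[start]) - n * mean * mean
        return mean, max(ss, 0.0)

    for i in range(baseline, len(signal) - window + 1):
        base_mean, base_ss = mean_and_ss(i - baseline, i)
        win_mean, win_ss = mean_and_ss(i, i + window)
        pooled_var = (base_ss + win_ss) / (baseline + window - 2)
        if pooled_var == 0.0:
            continue
        se = math.sqrt(pooled_var * (1.0 / baseline + 1.0 / window))
        if abs((win_mean - base_mean) / se) > threshold:
            return i
    return None


# Synthetic trace: resting noise, then a burst of larger amplitude.
random.seed(0)
true_onset = 6000
raw = [random.gauss(0.0, 1.0) for _ in range(true_onset)]
raw += [random.gauss(0.0, 20.0) for _ in range(1000)]

detected = detect_onset(rectify_and_log(raw), threshold=10.0)
print(detected)
```

Because the test window extends forward from the index being tested, the detected index can precede the true onset by up to one window length. At the 5 kHz sample rate reported above, a window of 100 samples corresponds to 20 ms.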
A standard summary statistic method was used for statistical inference, using the log of the mean reaction times (to correct for positive skew) over each of the four conditions, for each subject. Univariate five-way ANOVA was performed in SPSS, with factors HAND CUE VALIDITY (valid vs. invalid), MOVEMENT CUE VALIDITY (valid vs. invalid), HAND (left vs. right), MOVEMENT (flexion vs. extension). Factors SUBJECT and STIMULUS–RESPONSE MAPPING were nested and were implemented in two separate ANOVA models.
Thirteen percent of trials (range over subjects 8–22%) were discarded. Of these trials, in 2% no movement was made or no movement could be identified. In the remaining 11%, an incorrect movement was made (error trials). Error trial frequency varied significantly by cue type (p < 0.001, χ2 > 400, 1 d.f.), with errors less likely on validly cued trials. The most common error (64% of errors) was making the incorrect movement with the correct hand. The least common error (10% of errors) was making the correct movement with the wrong hand. Among invalidly cued trials, performing the movement specified by the cue stimulus rather than the target-stimulus occurred significantly more often (p < 0.05, χ2 > 6.01, 1 d.f.). Since the EMG measured the onset of movement rather than the endpoint, changing the response before the movement was completed resulted in an error trial. This may explain the comparatively high error rate seen here, compared with more traditional button-press paradigms.
The grand average reaction time was 334 ms. There was no significant main effect of HAND, MOVEMENT, or STIMULUS–RESPONSE MAPPING, so the ANOVA model including SUBJECT as a factor was used for further analysis. There were significant main effects of HAND CUE VALIDITY [F(1,7) = 90.54, p < 0.001, partial η2 = 0.928], MOVEMENT CUE VALIDITY [F(1,7) = 171.12, p < 0.001, partial η2 = 0.961, η2 = 0.155], and SUBJECT [F(1,7) = 9.29, p < 0.003, partial η2 = 0.797]. There were two significant two-way interactions – MOVEMENT × MOVEMENT CUE VALIDITY [F(1,7) = 4.98, p = 0.048, partial η2 = 0.449], and, as anticipated, MOVEMENT CUE VALIDITY × HAND CUE VALIDITY [Figure 4; F(1,7) = 233.34, p < 0.001, partial η2 = 0.971]. As expected, the fastest mean reaction time was seen when both cues were valid (see Table 1). Figure 4 highlights the nature of this interaction with reference to the profiles predicted by high (extrinsic) and low (intrinsic) levels of facilitation in the motor hierarchy. It is clear that this profile is consistent with attentional bias at the (motor cortex) level of representation, in an intrinsic frame of reference. Quantitatively, these results suggest that the validity effect is expressed primarily when both cue dimensions were jointly valid.
Figure 4. Reaction time effects for the four combinations of cue validity: the top panels show the results predicted by the theoretical architectures of Figure 2. The green lines correspond to valid movement cues and the blue lines to invalid movement cues. The empirical results are shown in the lower panel using the same colors. The bars correspond to SE over subjects. The form of the interaction observed is very close to that predicted under a model where attention biases prediction errors in an intrinsic frame of reference (Figure 2).
Paired t-tests among the four validity categories confirmed that only one pair failed to show a significant difference (after Bonferroni correction): movement cue valid, hand cue invalid, and movement cue invalid, hand cue invalid (p > 0.2). All other pairwise differences were highly significant (p < 0.001).
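For readers who wish to relate the reported F-statistics to effect sizes, partial η2 can be recovered from an F-value and its degrees of freedom using the standard identity partial η2 = (F × df_effect)/(F × df_effect + df_error). The sketch below applies this identity to the two cue validity main effects and their interaction; the function name is our own.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared implied by an F-statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F-values for the cue validity effects reported above, each with F(1,7).
reported_f = {
    "HAND CUE VALIDITY": 90.54,
    "MOVEMENT CUE VALIDITY": 171.12,
    "MOVEMENT CUE VALIDITY x HAND CUE VALIDITY": 233.34,
}
effect_sizes = {
    name: round(partial_eta_squared(f, 1, 7), 3) for name, f in reported_f.items()
}
for name, value in effect_sizes.items():
    print(name, value)
```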
We have pursued the idea that attention is an integral part of motor control and expresses itself through biasing the precision afforded to the proprioceptive and somatosensory consequences of an anticipated action (Galazky et al., 2009). This places previous proposals that link motor preparation and attention (cf, Allport, 1987; Goldberg and Segraves, 1987; Rizzolatti et al., 1994; Rushworth et al., 2001; Humphreys et al., 2010; see Tipper, 2004 for an overview) in the general framework of active inference and predictive coding. The important perspective provided by active inference is that movements fulfill predictions furnished by percepts with both exteroceptive (e.g., visual) and proprioceptive (e.g., stretch receptor) components.
We have previously demonstrated that the reaction time benefits of cueing can be understood as statistically optimal responses, where the associated optimization of precision can account for both psychophysical and electrophysiological phenomena fairly accurately (Feldman and Friston, 2010). In this paper, we asked whether similar reaction time benefits can be seen empirically in the motor domain. To this end, we adapted the paradigm developed by Rosenbaum (1980), in which two different visual dimensions (color and spatial frequency) cued the impending movement. As in Rosenbaum and Kornblum (1982), we predicted and confirmed that cueing effects would occur only when both cue dimensions were valid. Our predictions were based on the possible outcomes of attentional bias at different levels in the cortical hierarchy; which we associate with representations in extrinsic (higher) and intrinsic (lower) frames of reference: In an extrinsic model, one would predict that cueing effects enact their influence independently and to a comparable degree. As outlined above, the interaction between the two validity factors argues for an intrinsic model, in which hand and movement are selectively enabled in a way that cannot be separated. In the present case, the observed interaction can be accounted for by a model where precision is increased in proprioceptive channels that represent the confluence of top-down predictions about the nature of a movement and where it will be implemented (see Figure 2).
In addition to the interaction above, there was a small reaction time benefit from a valid hand cue, even if the movement cue was invalid. The magnitude of this effect was much smaller than the reaction time benefit seen for two valid cues (66 vs. 237 ms). This, and the lack of any benefit for a valid movement cue if the hand cue is invalid, means that a model in which precision operates at the intrinsic level is still the most likely. The small validity effect of a valid hand cue might be explained in the framework of active inference: because the movements performed in this experiment were self-limited, the same muscles were recruited for both flexion and extension movements, to either initiate or terminate the movement. Thus, if the precision of the stretch receptor channels in one forearm were boosted after cuing that side, a small benefit might accrue for the opposite movement.
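The logic of this argument can be made concrete with a short calculation. The following is a minimal sketch, not the analysis reported in this paper: it assumes that the two benefits quoted above (66 and 237 ms) are measured against the same baseline, and it sets the benefit of a valid movement cue alone to 0 ms as a stand-in for "no benefit." The function names are hypothetical. If the two cues acted independently (the extrinsic, additive model), the benefit for two valid cues should equal the sum of the single-cue benefits; any excess is the interaction.

```python
# Reaction time benefits in ms. The values 66 and 237 are taken from the text;
# the 0 ms entry is an ASSUMPTION standing in for "no benefit" from a valid
# movement cue when the hand cue is invalid. A common baseline is also assumed.
benefit_hand_only = 66.0
benefit_movement_only = 0.0  # assumption, not a reported value
benefit_both_valid = 237.0


def additive_prediction(hand_benefit, movement_benefit):
    """Benefit expected for two valid cues if the cues act independently."""
    return hand_benefit + movement_benefit


def interaction_excess(hand_benefit, movement_benefit, both_benefit):
    """Benefit for two valid cues beyond what the additive model predicts."""
    return both_benefit - additive_prediction(hand_benefit, movement_benefit)


predicted = additive_prediction(benefit_hand_only, benefit_movement_only)
excess = interaction_excess(
    benefit_hand_only, benefit_movement_only, benefit_both_valid
)

print(f"Additive (extrinsic) prediction for two valid cues: {predicted:.0f} ms")
print(f"Benefit quoted for two valid cues: {benefit_both_valid:.0f} ms")
print(f"Excess attributable to the interaction: {excess:.0f} ms")
```

Under these assumptions, most of the benefit for two valid cues is not explained by summing the single-cue effects, which is the pattern expected under the intrinsic model.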
Rushworth et al. (1997) also demonstrated a benefit for valid cuing using a similar paradigm. Spatial cues were used, and the motor preparation time was calculated from the difference between two conditions: a simple cuing task in one movement dimension, and a control task where the movement made did not depend on the validity of the cue. A small reaction time benefit was seen for valid cues.
In Rosenbaum (1980), some aspects of the movement were left unspecified until the appearance of the target-stimulus. Unlike our study, Rosenbaum saw separable effects of cuing just the arm, the direction and the extent of the upcoming movement. However, there is a key difference between our paradigm and that of Rosenbaum (1980) that may account for the difference. The button-press responses used in Rosenbaum (1980) entail visuomotor and somatosensory–motor integration. This means that attentional cueing effects in the visual or somatosensory domains cannot be disambiguated from purely proprioceptive attention. Our paradigm avoided conflating multiple attention processes by cueing movements that could be performed using only proprioceptive channels (simple, self-terminated flexion, and extension movements). This means that one can attribute the cue validity effects to attentional modulation of proprioceptive signals, in accordance with active inference. Furthermore, Rosenbaum’s cues were semantic (letters), whereas ours used low-level visual features which were arbitrarily mapped onto flexion and extension movements. The complexity of the semantic cues meant that most of the reaction time advantages seen in Rosenbaum (1980) could be accounted for by validity effects on processing visual targets and their semantic content and not on the movements per se. In short, the simplicity of our movements and cues suggests a motoric rather than sensory locus for attentional cueing.
A further study (Rosenbaum and Kornblum, 1982), which resembled ours except that only two of the four movements were possible in each trial, did not find that correctly cuing one response attribute benefited reaction time. They found the opposite – violating the hand and movement cues increased reaction times relative to violating just the movement cue. Their explanation for this was that both movements were simultaneously prepared, but choosing between two movements on the same hand takes longer because the movements are more “similar.” The larger number of possible movements in our experiment meant that simultaneously preparing all responses was unlikely (our flexion and extension movements used the same motor plant, while index and middle finger movements were used in Rosenbaum and Kornblum, 1982). By contrast, Miller (1982) found a contradictory effect – advance information of which hand to use gave a reaction time advantage, whereas advance information of which finger (on either hand) did not.
How can these discrepancies be resolved? Cui and Deecke (1999) found that anatomically congruent movements were performed faster than spatially congruent movements, suggesting that anatomically congruent movements are prepared together in the motor hierarchy, or, alternatively, that the mapping from extrinsic to intrinsic coordinates is more efficient. Despite the anatomical distance between [pre]motor cortex in each hemisphere, activity in these areas may be influenced at an early stage during motor preparation. If left and right effectors are competing alternatives for subsequent actions (cf. Cisek and Kalaska, 2010), several (bilateral) representations can in principle occur in an intrinsic frame of reference at the same time. Our results suggest that predictions about impending movements are integrated to boost processing in effector-based (intrinsic) coordinates.
Goodman and Kelso (1980) suggested that stimulus–response mapping time is shorter for cued movements. If this were the case, we would expect cues that are correct in one response dimension to provide some reaction time benefit for the other. The locus of such an effect would likely be before the motor stage; i.e., early in the stimulus–response interval. However, evidence from EEG studies suggests that the effects of cueing occur relatively late, again suggesting an effect in intrinsic coordinates: for example, the lateralized readiness potential (LRP), an EEG potential evoked when one hand is cued, has been suggested to be the halfway point between premotor and motor processing (Osman et al., 1995). This is supported by the finding that it occurs nearer to the movement during trials with informative cues than those without, although the stimulus–LRP latency does not change (Jentzsch et al., 2004). Finally, we note that a locus of the motor attentional effect in intrinsic coordinates provides an interesting parallel with results from the Posner paradigm. The reaction time benefit associated with cues in most visual paradigms seems to occur in retinotopic (intrinsic) rather than allocentric (extrinsic) frames of reference (Posner and Cohen, 1984; Golomb et al., 2008).
We have explored the idea that motor preparation is an attentional phenomenon that is directed toward proprioceptive sensations (i.e., predicted sensory feedback of the anticipated motor response). This perspective suggests that attention should not be limited to perceptual processing in the exteroceptive (e.g., visual) domain but should also bias interoceptive inference during movement. We verified this prediction by adapting a classical attention (Posner) paradigm for a motor setting. Furthermore, we tried to establish the hierarchical level at which this attentional bias operates by cueing the movement and effector independently. Our behavioral results demonstrate an interaction between the validity of movement and effector cues. This suggests that the bias for the selected action is mediated at a low level in the motor hierarchy, in an intrinsic frame of reference. More generally, the ideas outlined above provide a heuristic framework in which to address questions about the link between motor preparation and attention, and their mechanistic underpinnings.
Conflict of Interest Statement
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
This work was funded by the Wellcome Trust and the Biotechnology and Biological Sciences Research Council (BBSRC).
Allport, D. A. (1987). “Selection for action: some behavioral and neurophysiological considerations of attention and action,” in Perspectives on Perception and Action, eds H. Heuer, and A. F. Sanders (Hillsdale, NJ: Erlbaum), 395–419.
Bestmann, S., Harrison, L. M., Blankenburg, F., Mars, R. B., Haggard, P., Friston, K. J., and Rothwell, J. C. (2008). Influence of uncertainty and surprise on human corticospinal excitability during preparation for action. Curr. Biol. 18, 775–780.
Cui, R.-Q., and Deecke, L. (1999). High resolution DC EEG of the Bereitschaftspotential preceding anatomically congruent versus spatially congruent bimanual finger movements. Brain Topogr. 12, 117–127.
Galazky, I., Schütze, H., Noesselt, T., Hopf, J. M., Heinze, H. J., and Schoenfeld, M. A. (2009). Attention to somatosensory events is directly linked to the preparation for action. J. Neurol. Sci. 279, 93–98.
Herrero, J. L., Roberts, M. J., Delicato, L. S., Gieselmann, M. A., Dayan, P., and Thiele, A. (2008). Acetylcholine contributes through muscarinic receptors to attentional modulation in V1. Nature 454, 1110–1114.
Mars, R. B., Bestmann, S., Rothwell, J. C., and Haggard, P. (2007). Effects of motor preparation and spatial attention on corticospinal excitability in a delayed-response paradigm. Exp. Brain Res. 182, 125–129.
Perfetti, B., Moisello, C., Lanzafame, S., Varanese, S., Landsness, E. C., Onofrj, M., Di Rocco, A., Tononi, G., and Ghilardi, M. F. (2010). Attention modulation regulates both motor and non-motor performance: a high-density EEG study in Parkinson’s disease. Arch. Ital. Biol. 148, 279–288.
Posner, M. I., Nissen, M. J., and Ogden, W. C. (1978). “Attended and unattended processing modes: the role of set for spatial location,” in Modes of Perceiving and Processing Information, eds H. L. Pick, and M. J. Saltzman (Hillsdale, NJ: Lawrence Erlbaum Associates), 137–157.
Woldorff, M., Fox, P., Matzke, M., Lancaster, J., Veeraswamy, J., Zamarripa, F., Seabolt, M., Glass, T., Gao, J., Martin, C., and Jerabek, P. (1997). Retinotopic organization of the early visual spatial attention effects as revealed by PET and ERPs. Hum. Brain Mapp. 5, 280–286.
Keywords: priming, motor preparation, action selection, attention, precision, free energy, active inference
Citation: Brown H, Friston K and Bestmann S (2011) Active inference, attention, and motor preparation. Front. Psychology 2:218. doi: 10.3389/fpsyg.2011.00218
Received: 08 June 2011; Accepted: 21 August 2011;
Published online: 21 September 2011.
Edited by: Michela C. Tacca, Heinrich-Heine University Düsseldorf, Germany
Arnon Cahen, Ben Gurion University in the Negev, Israel
Reviewed by: Peter Konig, University of Osnabrück, Germany
Matthew Rushworth, University of Oxford, UK
Copyright: © 2011 Brown, Friston and Bestmann. This is an open-access article subject to a non-exclusive license between the authors and Frontiers Media SA, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and other Frontiers conditions are complied with.
*Correspondence: Harriet Brown, Wellcome Trust Centre for Neuroimaging, Institute of Neurology, Queen Square, London WC1N 3BG, UK. e-mail: email@example.com