Original Research ARTICLE
Humans anticipate the goal of other people’s point-light actions
- 1 Department of Psychology, Uppsala University, Uppsala, Sweden
- 2 Center of Neurodevelopmental Disorders at Karolinska Institutet, Karolinska Institutet, Stockholm, Sweden
This eye tracking study investigated the degree to which biological motion information from manual point-light displays provides sufficient information to elicit anticipatory eye movements. We compared gaze performance of adults observing a biological motion point-light display of a hand reaching for a goal object or a non-biological version of the same event. Participants anticipated the goal of the point-light action in the biological motion condition but not in a non-biological control condition. The present study demonstrates that kinematic information from biological motion can be used to anticipate the goal of other people’s point-light actions and that the presence of biological motion is sufficient for anticipation to occur.
When observing another person performing an action, humans typically fixate the goal of the action before it is completed (Flanagan and Johansson, 2003; Falck-Ytter et al., 2006; Rotman et al., 2006). This anticipatory ability emerges early in infancy and remains an important foundation for social cognitive development throughout life (Kochukhova and Gredebäck, 2010). It allows us to synchronize our actions with collaborators and to stay ahead of competitors, while compensating for the internal processing lag of the perception-action system (Johansson et al., 2001; von Hofsten, 2006).
To date little is known about what type of information observers use to anticipate the goal of others’ actions, specifically with respect to anticipatory eye movements and the ability to shift gaze from a reaching hand to the goal of the reaching action (usually an object) before the hand arrives at this goal. One possibility is that observers detect motion information – low-level kinematic cues – from human actions and use this information to rapidly extrapolate and anticipate the goal of ongoing actions. We refer to this notion as the biological motion hypothesis of action anticipation. According to this hypothesis, kinematic information from biological motion is sufficient for anticipation of action goals to occur. This idea receives indirect support from two paradigms. First, several studies demonstrate that humans anticipate the goal end-state of human manual actions but not the end-state of self-propelled objects that move in similar manners (Flanagan and Johansson, 2003; Falck-Ytter et al., 2006). Thus, these studies suggest that seeing human actions is important for anticipatory gaze shifts to occur. Second, various studies demonstrate that humans are highly sensitive to biological motion information expressed by point-light (PL) displays of human movements (Johansson, 1973; Verfaillie et al., 1994; Falck-Ytter et al., 2010; for a review, see Blakemore and Decety, 2001). In fact, humans are able to detect actions within PL displays and to categorize observed kinematic representations of human actions based on walking patterns, gender, and emotions (Troje, 2002). 
These findings demonstrate that humans are highly sensitive to biological motion and that their tendency to anticipate others’ action goals seems to be linked to seeing humans act on objects (for alternative processes that might impact goal processing without necessarily relating to anticipatory goal-directed gaze shifts see Southgate et al., 2008; Ramsey and Hamilton, 2010; for a review see also van Overwalle and Baetens, 2009).
While these patterns of results are in line with the biological motion hypothesis of action anticipation, direct evidence is still lacking. One reason for this is that it has not been possible to disentangle “pure” biological motion information from other types of information. While observing “full vision” human actions the putative role of kinematic information from biological motion is confounded by information about the agent performing the action and other visual properties normally associated with human actions, such as texture and hand-object interactions. Moreover, while a large number of studies have demonstrated humans’ sensitivity to biological motion, no study has asked whether human observers can anticipate the endpoint of a goal-directed biological motion PL action by looking at the action goal before the human PL display reaches this goal (i.e., a goal-directed gaze shift; see Manera et al., 2011 for a recent demonstration that humans discriminate between social intentions on the basis of information from PL displays of manual actions).
The current eye tracking study aimed to test the biological motion hypothesis of action anticipation. An anticipatory eye movement paradigm was used in which participants were presented with a PL representation of a hand reaching for a goal object (biological motion condition, Figure 1D) or a non-biological version of the same manual PL display (control condition, Figure 1D) while goal-directed eye movements were recorded with near-infrared eye tracking technology.
Figure 1. Stimuli. (A) Photo showing the hand with 18 attached markers to create the manual point-light (PL) action by means of a motion capture system. (B) Lateral camera perspective from which the PL reaching action was recorded and presented to the participants. (C) Areas of interest (AOI) used in the analysis: first AOI covers the PL display plus one visual degree at the beginning and end of the PL hand, the goal AOI covers the goal object plus one visual degree in each direction. (D) Snapshots of the stimuli movie during the movement phase, during the reaching phase (both separate for each condition), and after the onset of the end effects (movement of goal object and sound, equal in both conditions).
Interestingly, while detection of full body motion (featuring all major joints of the human body) is rapid and accurate (Troje, 2002), pilot data from our lab suggest that PL displays of a hand (without additional information about the arm or the body) are accurately recognized by only some observers, even after substantial repetition (10 PL reaching actions, each lasting 12 seconds). Therefore, in the present study, we asked participants after stimulus presentation what they thought the PL representation depicted, in order to investigate how they perceived the PL display and whether action anticipation can be accomplished even when visual information is so sparse that observers fail to report what the PL display represents. Hence, all participants were asked how they would describe the observed PL dots and whether they recognized the PL display as representing a familiar object.
If the biological motion hypothesis of action anticipation is correct, we would expect that (i) gaze will be predictive in the biological motion condition because kinematic information from human movements is sufficient for anticipatory gaze performance and (ii) gaze will be reactive in the control condition which does not include biological motion. Additionally, asking the participants what the PL display represented for them allows us to map whether there is a connection between verbal recognition and anticipation.
Materials and Methods
Thirty-eight adult university students (mean age = 25.3 years, SD = 5.7, 26 females) participated in this eye tracking study. Five additional participants were excluded from the analysis because they produced fewer than five gaze shifts over all 10 trials or failed to fixate for 200 ms within the goal area of interest (AOI) in more than half of all trials (see Data Reduction for a description of the AOIs used in the analysis).
The 3D motion from a manual reaching action was recorded using motion capture (QUALISYS, Gothenburg, Sweden). Five cameras captured the movement of 18 markers (Ø 4 mm) attached on the major joints of the hand (Figure 1A). The reaching action was filmed from a lateral perspective with an elevated camera angle recording from approximately 50° from above (Figure 1B). With the elevated camera position, most of the point-lights were continuously visible except when point-lights in the foreground covered other PL markers on the hand (for instance, due to the 3D perspective, not all point-lights were visible during the waving). A 2D movie of the reaching action (the PL display) was generated in MATLAB (MATHWORKS, Natick, USA) and integrated in a virtual environment using Cinema 4D (MAXON, Friedrichsdorf, Germany). The final stimulus movies were created from this virtual environment (Figure 1D) in which the reaching action was presented from the same lateral perspective. The biological motion condition contained a PL hand (horizontal size: 6.2 visual degrees, vertical size 2.4 visual degrees) performing a reaching action in two phases. In movement phase one, the hand was stationary while individual fingers were tapping on a table and waving towards the observer. Within the second phase, the hand reached for and interacted with a goal object partly covered by a barrier. The PL representation of the hand disappeared behind the barrier and subsequently the goal object moved in conjunction with a sound.
In the biological motion condition the velocity and motion profile of the PL markers are consistent with a real reach, including acceleration and deceleration. In the control condition the spatiotemporal profile of every PL marker was manipulated to create a linear (constant velocity) and therefore non-biological movement throughout the whole manual PL action by removing biological acceleration and deceleration of the motion in each dimension individually (X, Y, Z) for every movement unit (see Figure 2).
Figure 2. (A) Position of one point-light marker during the reaching action for the biological motion condition (blue line) and the non-biological motion condition (red line) plotted individually for every dimension X, Y, and Z. The start time for the reach and the points in time when each point-light marker disappears behind the barrier are the same in both conditions. The circle represents an enlargement to highlight the different trajectories. (B) Position of one point-light marker during the whole action sequence, plotted separately for each dimension (X, Y, Z). Blue Lines show the position over time in the biological motion condition. The linear translations in the non-biological control condition (red lines) were created by removing biological acceleration and deceleration of the motion in each dimension individually in both the movement and the reaching phase. The movement direction in the y-dimension was inverted for 50% of the point-light markers (for selected movement units only) in order to disrupt the hand configuration during the movement phase (stationary hand, moving fingers). However, no movement direction was inverted in the other dimensions in order to keep the distance between the point-light markers and the goal area equal across conditions. Moreover, to retain maximal similarity between the conditions during the reaching phase, no movement direction was inverted in this second movement phase.
Converting the biological motion to a linear movement is not very striking to the naked eye and was not sufficient to prevent the PL display in the control condition from being perceived as biological motion (as assessed by verbal report). For this reason the movement direction in the y-dimension was inverted for 50% of the PL markers (for selected movement units only) during the first movement phase (stationary hand, moving fingers). However, to retain maximal similarity between the two conditions, no movement direction was inverted within the second movement phase (reaching phase). The two PL displays contained the same total number of PL markers.
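The linearization step described above can be sketched in a few lines. This is a minimal illustration and not the authors' MATLAB code: the function name `linearize_units`, the representation of movement units as pairs of sample indices, and the toy trajectory (a smooth fifth-order polynomial standing in for a biological reach) are all assumptions made for the example. Each movement unit is replaced, separately in X, Y, and Z, by a straight line between its start and end position, so that endpoints and timing are preserved while acceleration and deceleration are removed.

```python
import numpy as np

def linearize_units(traj, unit_bounds):
    """Replace each movement unit with constant-velocity motion.

    traj:        array of shape (n_samples, 3) with X, Y, Z of one marker.
    unit_bounds: list of (start, end) sample indices, inclusive. This
                 segmentation into movement units is hypothetical.
    """
    out = np.asarray(traj, dtype=float).copy()
    for start, end in unit_bounds:
        n = end - start + 1
        for dim in range(out.shape[1]):
            # Straight line from the unit's first to its last position.
            out[start:end + 1, dim] = np.linspace(out[start, dim], out[end, dim], n)
    return out

# Toy stand-in for a biological reach: smooth onset and offset
# (bell-shaped velocity), scaled differently in each dimension.
t = np.linspace(0.0, 1.0, 61)
smooth = 10 * t**3 - 15 * t**4 + 6 * t**5
bio = np.column_stack([30.0 * smooth, 5.0 * smooth, -2.0 * smooth])

# Treat the whole toy reach as a single movement unit.
lin = linearize_units(bio, [(0, 60)])
```

In this sketch the two trajectories start and end at the same positions at the same times and differ only in their velocity profile, which is the property the control condition relies on. The direction inversion applied to some markers in the first movement phase is not modelled here.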
Both conditions included a goal object (a toy bear, horizontal size: 7.4 visual degrees, vertical size 12.1 visual degrees) and a barrier (horizontal size: 8.1 visual degrees, vertical size 18.9 visual degrees) that covered the final approach and contact between the PL markers and the goal object (Figure 1D). The barrier was added in order to restrict differences between conditions to the motion profile of individual PL markers prior to contact. A subsequent goal manipulation consisted of a movement of the goal object while a sound was presented simultaneously. The start time for the reach, the duration of the reach, and the points in time when each PL marker disappears behind the barrier and subsequently enters the goal AOI were identical in both conditions. Besides the congruent overall timing of the action in the two conditions, the onset and duration of end effects were identical in both conditions and synchronized with the time when the PL hand made first contact with the goal object (see Figure 1D).
After written consent was obtained, participants were seated approximately 60 cm from a Tobii T120 eye tracker (sampling at 60 Hz, Stockholm, Sweden). Each participant was shown only one condition to avoid possible influences of previously perceived PL displays on subsequent stimulus presentations. Consistent with prior studies investigating anticipation of human reaching actions using a single reaching action repeated many times (see, e.g., Falck-Ytter et al., 2006; Kochukhova and Gredebäck, 2010), one PL reaching action was presented several times in each condition. Nineteen participants per condition were presented with 10 identical stimulus presentations (movies), each lasting 12 seconds, with brief attention-grabbing movies in between. After watching the stimuli, participants were asked to fill out a questionnaire inquiring how they would describe the stimuli, whether they recognized the PL display as a familiar object/event, and, if so, what that object/event was. All participants received a gift coupon with a value of 50 SEK. The study was conducted in accordance with the ethical standards specified in the 1964 Declaration of Helsinki and procedures were approved by the Uppsala Ethical Review Board.
As in previous studies presenting reaching actions where only one goal is present, the goal in this study refers to the goal object (toy bear) the hand is reaching for. Within this paradigm, fixating the goal before the goal is achieved is considered as action anticipation. This is seen as a marker of action understanding (Falck-Ytter et al., 2006; Gredebäck et al., 2010). Two AOIs were used to analyze eye movements (see Figure 1C). In line with prior studies (Falck-Ytter et al., 2006; Gredebäck et al., 2010), one AOI covered the goal object (plus one visual degree in each direction), another equal sized AOI covered the initial position of the PL display (plus one visual degree at the beginning and at the end of the PL hand). Gaze fixations were defined as maintaining gaze in an area of interest for more than 200 ms.
During the second movement phase (the reach), we analyzed gaze shifts from a fixation within the hand AOI (to ensure that the PL display and the movement were perceived) to the goal AOI. Latency was calculated by subtracting the time when the PL hand first contacted the goal object (onset of end effects) from the time when gaze first fixated the goal AOI, so that negative latencies indicate that gaze arrived at the goal before the hand. Latencies of goal-directed gaze shifts were considered anticipatory if gaze fixated the goal AOI before the onset of end effects. The threshold for anticipations was chosen in line with prior studies on action understanding and goal anticipation using a temporal threshold of 0 ms (Johansson et al., 2001; Flanagan and Johansson, 2003; Falck-Ytter et al., 2006; Kochukhova and Gredebäck, 2010; Gredebäck and Melinder, 2011). This conservative criterion of zero ensured that participants actually looked at the goal object before the PL hand had reached this goal. In other areas of research, for instance in studies of object representations, a more liberal criterion incorporating a 200 ms reaction time in anticipations has been used (e.g., Kochukhova and Gredebäck, 2007). We refrain from using the more liberal threshold as this is not in line with most prior action anticipation studies. For a more in-depth discussion on these thresholds see Gredebäck et al. (2010).
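The latency measure can be illustrated with a short sketch. It is not the authors' data reduction script. The function names, the encoding of gaze as one boolean per sample and AOI, and the toy trial are assumptions, as is the choice to treat 12 consecutive samples at 60 Hz as satisfying the 200 ms fixation criterion. Latency is computed as gaze arrival minus contact time, so negative values are anticipatory, matching the sign of the latencies reported in the Results.

```python
import math
from typing import Optional, Sequence

RATE_HZ = 60       # sampling rate stated in the text
MIN_FIX_MS = 200   # fixation criterion stated in the text

def first_fixation_onset(in_aoi: Sequence[bool], start: int = 0) -> Optional[int]:
    """Index of the first sample at or after `start` that begins a run
    inside the AOI long enough to count as a fixation, or None."""
    needed = math.ceil(MIN_FIX_MS * RATE_HZ / 1000)  # samples per fixation
    run_start = None
    for i in range(start, len(in_aoi)):
        if in_aoi[i]:
            if run_start is None:
                run_start = i
            if i - run_start + 1 >= needed:
                return run_start
        else:
            run_start = None
    return None

def gaze_latency_ms(in_hand: Sequence[bool], in_goal: Sequence[bool],
                    contact_ms: float) -> Optional[float]:
    """Latency of the first goal fixation that follows a hand fixation.
    Negative values mean gaze reached the goal before hand contact."""
    hand_onset = first_fixation_onset(in_hand)
    if hand_onset is None:
        return None  # the PL display was never fixated
    goal_onset = first_fixation_onset(in_goal, start=hand_onset)
    if goal_onset is None:
        return None  # no goal-directed gaze shift in this trial
    return goal_onset * 1000.0 / RATE_HZ - contact_ms

# Toy trial: 1 s on the hand, a short transition, then the goal.
in_hand = [True] * 60 + [False] * 36
in_goal = [False] * 66 + [True] * 30
latency = gaze_latency_ms(in_hand, in_goal, contact_ms=1300.0)
```

In the toy trial gaze enters the goal AOI 1100 ms into the recording and the hypothetical contact occurs at 1300 ms, so the latency is negative and the trial would count as anticipatory under the 0 ms threshold.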
Data reduction was performed in MATLAB. Statistical analysis was performed in the following manner: an independent samples t-test and one-sample t-tests assessed the degree to which gaze latencies differed significantly between the two conditions and the extent to which latencies differed from zero within each condition separately. Additionally, gaze performance data in the biological motion condition were analyzed with an independent samples t-test to examine if anticipation differed between participants recognizing the PL hand and those that did not. A χ2-test assessed if the proportion of participants recognizing the PL display differed between conditions. An additional analysis with a different size of the goal AOI (extended AOI including the edge of the barrier nearest the hand) provided the same result as in the presented analysis and demonstrated that a difference in gaze latencies is not restricted to certain AOI sizes. In the following Results section only the smaller AOI (covering the goal object) will be reported.
In line with the predictions, participants anticipated the goal of the PL action in the biological motion condition (M = −124 ms, SE = 28.5), t(18) = 4.36, p < .001, but not the control condition (M = 21.5 ms, SE = 29.4), t(18) = .73, p = .474. Latency of goal-directed gaze shifts differed significantly between conditions, t(36) = 3.56, p = .001, d = 1.16, see Figure 3.
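As a consistency check, the reported test statistics can be approximately recovered from the summary values above. The sketch assumes equal group sizes of 19 (as stated in the procedure), a standard error of the difference built from the two reported SEs, and Cohen's d derived from t; the source does not state how d was computed, so that step is an assumption. Small deviations from the published values are expected because the means and SEs are rounded.

```python
import math

n = 19                          # participants per condition
m_bio, se_bio = -124.0, 28.5    # biological motion condition (ms)
m_ctl, se_ctl = 21.5, 29.4      # control condition (ms)

# One-sample t against zero: mean divided by its standard error.
t_bio = m_bio / se_bio
t_ctl = m_ctl / se_ctl

# Independent-samples t. With equal n, the pooled-variance standard
# error of the difference equals sqrt(SE1^2 + SE2^2).
se_diff = math.sqrt(se_bio**2 + se_ctl**2)
t_between = (m_ctl - m_bio) / se_diff

# Cohen's d from t for two groups of equal size n (assumed formula).
d = t_between * math.sqrt(2.0 / n)

print(round(t_bio, 2), round(t_ctl, 2), round(t_between, 2), round(d, 2))
```

The recovered magnitudes lie within rounding distance of the reported t(18) = 4.36, t(18) = .73, t(36) = 3.56, and d = 1.16. The one-sample statistic for the biological motion condition comes out negative here because the mean latency is negative.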
Figure 3. Results. Average gaze latency in the goal AOI over all 10 trials for each participant, separate for the biological and non-biological condition. The horizontal line differentiates anticipatory from reactive gaze shifts. Short horizontal lines mark the group average for each condition. Filled markers represent participants recognizing the point-light display; empty markers represent participants not recognizing the point-light display as a hand.
Eleven out of 19 participants in the biological motion condition reported that the PL display included a human hand. The remaining participants did not recognize the PL display as a hand. They either made no suggestion or reported that the PL display represented a non-human entity like a swarm of bees or an animal. Within this condition no latency differences were observed between those who recognized the hand and those who did not, t(17) = 1.40, p = .179. In fact, both participants recognizing the PL display as a hand, t(10) = 2.57, p = .028, and participants not recognizing it as a hand, t(7) = 3.81, p = .007, anticipated the goal. Only one participant perceived a hand in the control condition, and this participant did not look at the action in a predictive fashion. More participants recognized the hand in the biological motion condition than in the control condition, χ2(1) = 9.87, p = .002. No learning effects were observed. That is, latencies of goal-directed gaze shifts were not reduced over the 10 stimulus presentations in either the biological motion or the control condition.
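The reported χ2 value can be checked against the recognition counts given above (11 of 19 versus 1 of 19). The sketch below suggests that the value is consistent with a 2 × 2 test using Yates' continuity correction. The source does not say whether a correction was applied, so this is an inference from the numbers, and the helper name `chi2_2x2` is hypothetical.

```python
import math

def chi2_2x2(a, b, c, d, yates=True):
    """Chi-square statistic for the 2 x 2 table [[a, b], [c, d]],
    optionally with Yates' continuity correction."""
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        diff = max(0.0, diff - n / 2.0)
    return n * diff**2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Rows: biological motion, control. Columns: recognized, not recognized.
chi2_corrected = chi2_2x2(11, 8, 1, 18)
chi2_uncorrected = chi2_2x2(11, 8, 1, 18, yates=False)

# For one degree of freedom the upper-tail probability has a closed form.
p_corrected = math.erfc(math.sqrt(chi2_corrected / 2.0))

print(round(chi2_corrected, 2), round(chi2_uncorrected, 2), round(p_corrected, 3))
```

The corrected statistic agrees with the reported χ2(1) = 9.87 and p = .002 after rounding, whereas the uncorrected statistic is noticeably larger.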
This eye tracking study investigated the degree to which adult observers are able to anticipate the goal of a manual reaching action, represented as a PL display, as well as whether conscious recognition of the observed PL action is required for anticipatory eye movements to occur. In line with the biological motion hypothesis of action anticipation, participants anticipated the goal in a biological motion PL display but not in a non-biological control condition. Specifically, gaze arrived at the goal of an observed PL reaching action ahead of time only if the presented manual action contained biological motion information. We also found that recognition of the PL display as a reaching hand was unrelated to the latency of goal-directed gaze shifts. These findings provide a new perspective on the functional role of biological motion detection and suggest that adults are able to anticipate the goal of others’ actions based on kinematic information from biological motion.
Humans detect and process biological motion both rapidly and efficiently (Blakemore and Decety, 2001). The process of detecting and orienting to biological motion has been claimed to aid our ability to infer mental states, such as social intentions (Frith and Frith, 1999; Manera et al., 2011) or mental representations of movements (Freyd, 1983, 1987), to facilitate imitation (Klin et al., 2009), and to allow us to anticipate future goals (Blakemore and Decety, 2001) or future postures (Verfaillie and Daems, 2002) of observed human actions. The current result reveals that adults are able to use biological motion to anticipate the goal of others’ actions, as suggested by Blakemore and Decety (2001). The present study suggests that the fast and effortless processes for detecting biological motion previously documented (for a review see Blake and Shiffrar, 2007) may feed directly into processes for online anticipation of others’ actions in human adults. This is done in the absence of texture, color, and contour information that is intrinsically associated with everyday human reaching actions. Our core findings demonstrate that the brain can make use of kinematic information from human actions to do more than simply detect biological motion. An ultimate function of biological motion processing may be to anticipate the goals of others.
Of course, it is possible that participants first perceived the reaching hand and used this information to anticipate the presented action (Gergely and Csibra, 2003; Csibra and Gergely, 2007). However, because adults systematically anticipated action goals even if they were not able to consciously recognize the action represented by the PL display, we find this alternative interpretation unlikely. Although we cannot exclude that lower-level recognition processes precede action prediction, the result demonstrates that explicit recognition is not required for anticipation to occur. This result aligns with prior findings from Hemeren and Thill (2011) demonstrating that adult perceivers rely on bottom-up kinematic information rather than high-level conceptual knowledge when they segment manual PL displays into small functional units, so-called motor primitives.
In summary, prior findings show that biological motion is quickly detected (Blake and Shiffrar, 2007) and that action segmentation is based on low-level kinematic information (Hemeren and Thill, 2011). The current study demonstrates that biological motion PL displays are anticipated to a higher degree than non-biological motion PL displays and further suggests that verbal recognition is not required for adults to anticipate biological motion. Together, these findings indicate that the presence of biological motion is sufficient to elicit anticipatory gaze shifts.
Although eye tracking data do not provide direct evidence regarding underlying brain processes, the current findings are consistent with results from neuropsychology. For example, Graf et al. (2007) showed that simulation processes occur in real-time during observation of human PL actions and Casile et al. (2010) demonstrated that observation of human movements consistent with normal kinematic laws caused higher activation in the left dorsofrontal and premotor cortex than observation of human movements violating kinematic laws of motion.
In addition, our results seem to support the notion that a mirror process mediates anticipatory gaze shifts during action observation and that the motor system is involved in anticipation tasks during visual action perception of human motion stimuli (Flanagan and Johansson, 2003; Kilner et al., 2004; Aglioti et al., 2008; Manera et al., 2011).
The mirror-neuron system is a distributed network that becomes activated in a similar manner when we perform actions and when we observe others performing similar actions. It incorporates the inferior frontal and premotor cortex as well as the inferior parietal lobule, with input from the superior temporal sulcus (STS; Rizzolatti and Craighero, 2004). The latter area has also been identified as responsible for detection of biological motion, presented both as fully visible human actions and PL displays (Allison et al., 2000). Based on these neurological considerations, previous empirical findings and our data, we suggest that humans anticipate the goal of others’ actions in the following manner: (1) biological motion is encoded by the STS, (2) fed forward to the mirror-neuron system, (3) where kinematic information from perceived human actions is mapped onto the observer’s own motor representations. This direct matching process (Flanagan and Johansson, 2003; Giese and Poggio, 2003; Lange and Lappe, 2006) allows the observer to (4) simulate future action goals based on his/her own motor plans, and through connections with oculomotor control systems (5) to initiate an anticipatory gaze shift to the future goal of others’ ongoing actions. This system is rapid, dependent on the presence of biological motion, and independent of conscious recognition of the observed action.
We suggest that the system in which biological motion information from perceived actions is mapped onto the observer’s own motor representations of a hand is fairly specific. It does not map all forms of biological motion onto the entire motor cortex. Instead, biological motion information that corresponds to a reaching hand activates the motor programs involved in a reaching action. Thus, presenting a single dot moving in a biological manner is not enough; the biological motion information comprises two aspects: the motion profile and the biological configuration of the hand.
Understanding action goals is a complicated process. Some brain areas are activated both during observation of goal-directed human actions and actions of non-human geometrical shapes (Ramsey and Hamilton, 2010). However, our results and prior anticipatory eye movement studies suggest that gaze shifts that reach the goal before the action is completed are closely connected to observing human actions. Yet, more studies are needed to examine the specificity of goal-directed eye movements during action anticipation and to fully understand the role of the observation-execution matching system for online anticipation of goals and actions. One potential avenue for future research is to employ transcranial magnetic stimulation (TMS) in order to directly test the idea that action plans are functionally related to anticipatory eye movements during action observation.
The present study shows that humans anticipate the goal of other people’s point-light actions, a finding in line with the view that gaze anticipations of biological motion are related to the observation-execution matching system. These results add to our understanding of the role of biological motion in action processing.
Conflict of Interest Statement
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
We would like to thank Luciano Fadiga, Laila Craighero and Alessandro D’Ausilio for valuable discussions. We also wish to thank Christine Fawcett for comments on the manuscript. This work was supported by grants from the Swedish Research Council (VR 2011-1528) and by the Marie-Curie ITN RobotDoC (2010-2013).
Manera, V., Becchio, C., Cavallo, A., Sartori, L., and Castiello, U. (2011). Cooperation or competition? Discriminating between social intentions by observing prehensile movements. Exp. Brain Res. 211, 547–556.
Keywords: biological motion, anticipation, eye movements, direct matching, mirror neuron
Citation: Elsner C, Falck-Ytter T and Gredebäck G (2012) Humans anticipate the goal of other people’s point-light actions. Front. Psychology 3:120. doi: 10.3389/fpsyg.2012.00120
Received: 15 February 2012; Accepted: 02 April 2012;
Published online: 26 April 2012.
Edited by: Peter Neri, University of Aberdeen, UK
Reviewed by: James Kilner, University College London, UK
Sarah Tyler, University of California Irvine, USA
Copyright: © 2012 Elsner, Falck-Ytter and Gredebäck. This is an open-access article distributed under the terms of the Creative Commons Attribution Non Commercial License, which permits non-commercial use, distribution, and reproduction in other forums, provided the original authors and source are credited.
*Correspondence: Claudia Elsner, Department of Psychology, Uppsala University, Blåsenhus, Von Kraemers allé 1, Box 1225, 751 42 Uppsala, Sweden. e-mail: firstname.lastname@example.org