Beyond Extrastriate Body Area (EBA) and Fusiform Body Area (FBA): Context Integration in the Meaning of Actions

Our ability to identify and interpret the actions and intentions of other people in a meaningful way is the bedrock of social cognition. Visual perception of the human body is a critical component of this complex task insofar as it provides cues that enable the observer to make the inferences required to accurately extract the meaning of everyday action events.
 
During the last decade, neuroimaging studies have identified two regions of the extrastriate visual cortex that are highly sensitive to the perception of human bodies and body parts. These regions are the extrastriate body area (EBA), located at the posterior inferior temporal sulcus/middle temporal gyrus (Downing et al., 2001), and the fusiform body area (FBA), found ventrally in the fusiform gyrus (Peelen and Downing, 2005; Schwarzlose et al., 2005). Evidence from fMRI studies has shown that both areas become significantly activated in response to body and body-part stimuli presented visually in different formats, such as photos, line drawings, stick figures, and silhouettes, compared to control stimuli such as faces/face parts, tools/tool parts, and scenes (Downing et al., 2001; Peelen and Downing, 2005; Schwarzlose et al., 2005; Spiridon et al., 2006; Weiner and Grill-Spector, 2010). Recently, it has been suggested that EBA and FBA can be functionally dissociated, with a more selective activation for local body parts in EBA relative to more holistic images of the human body in FBA (Taylor et al., 2007).
 
Based on these findings, many authors have claimed that EBA/FBA should be directly involved in complex functions such as perceiving goal-directed actions and other higher-level related processes (Costantini et al., 2005; Saxe et al., 2006; Moro et al., 2008; Marsh et al., 2010; Kuhn et al., 2011). However, as suggested by Downing and Peelen (2011), it might be more accurate to interpret the activity of these regions in terms of populations of neurons that selectively encode and make explicit low-level visual features of human bodies like body shape and posture. According to this hypothesis, the comprehension of meaningful actions could be supported by a more distributed neural network where visual information extracted by EBA/FBA is integrated with the contextual information processed in other parts of the brain. Although this hypothesis seems to be more plausible, the authors do not give further information on how this integration might be accomplished or, more specifically, about which other cortical areas would be actively engaged in this network. 
 
In order to address this issue, we propose a functional neuroanatomic model for the contextual processing of goal-directed actions where the general perceptual processing provided by EBA/FBA is integrated in a larger fronto-insular–temporal network. 
 
When we witness a simple event, our brain integrates the information about people, objects, and the interactions among them into a coherent meaningful representation. For instance, object recognition is thought to be instantiated by cognitive structures that integrate information about the identity of the objects that tend to co-occur in a given context with previously learned information about their possible relationships (Bar, 2004). These structures can be thought of as a set of expectations about what is more or less likely to be seen in a given context, enabling us to make predictions and accurately disambiguate incoming information. We propose that in a fronto-insular–temporal “social context network” (SCN), several frontal areas update and associate ongoing contextual information in relation to episodic memory and target-context associations (Sigala et al., 2008; Bar, 2009; Burgess et al., 2009). The temporal regions [e.g., the parahippocampal cortex (PHC), hippocampus, and amygdala] may index the value learning of target-context associations (Langston and Wood, 2010). Finally, the insular cortex would coordinate internal and external milieus in an inner motivational state (Singer et al., 2009; Ibáñez et al., 2010a). See Figure 1.
 
 
 
Figure 1 
 
Lateral view of the left hemisphere showing the proposed fronto-insular–temporal network (light-blue, violet, and green regions of interest, respectively). In this context network, prefrontal areas (PFC) such as frontopolar and dorsolateral prefrontal ... 
 
 
 
This contextual dimension of action understanding is also supported by ERP studies on the N400-like component, an ongoing negativity elicited when a meaningful action is incongruent (unexpected) with a previous context. N400 seems to be a specific context integration component (Bar, 2004). For example, studies using videos and pictures of everyday-life actions, co-speech gestures, and semantic processing of current motor events (Sitnikova et al., 2003; Aravena et al., 2010; Ibáñez et al., 2010b, 2011; Proverbio et al., 2010) have shown that, as the action-related stimulus becomes more expected/congruent with the context in which it is embedded, the N400 amplitude is reduced compared to incongruent/unexpected conditions. These findings suggest that when the previous context builds up meaning, processing of upcoming stimuli that fit with that context is facilitated. Evidence derived from lesion studies, MEG, and intracranial recordings implicates the left superior/middle temporal gyrus, the anterior-medial temporal lobe, the PHC, and the fusiform gyrus, as well as frontopolar, orbital, and dorsolateral prefrontal regions, as possible sources of this N400 effect (Halgren et al., 2002; Van Petten and Luka, 2006).
 
Finally, in addition to frontal and temporal regions, the insular cortex plays a crucial role in the proposed network. This region has recently been implicated in the contextual integration of interoceptive information (conscious representation of one's bodily physiological state and motivational drives) with external stimuli (the current sensory environment) into a global feeling state (Craig, 2002; Ibáñez et al., 2010a). Moreover, the anterior insular cortex has also been shown to be recruited during motivational decision-making in uncertain contexts, suggesting that this area also mediates risk behavior when the available information is not sufficient to predict an outcome (Singer et al., 2009).
 
Overall, the SCN provides an empirically testable set of hypotheses regarding contextual update, contextual prediction, and target-context association in action meaning paradigms. For instance, we expect to observe the engagement of the SCN during action meaning processing. This prediction is partially confirmed, since frontal, temporal, or insular activations have previously been observed together with EBA and FBA during action paradigms (Kable and Chatterjee, 2006; Lamm et al., 2007; Lamm and Decety, 2008; Hodzic et al., 2009a,b; Cross et al., 2010; Kret et al., 2011). Additionally, a more direct empirical test would be provided by contextual manipulation of action-related stimuli. The use of frames, background information, or multimodal designs (as used in other domains of contextual studies, e.g., Bar, 2004) adapted to action meaning tasks would provide simple experimental shortcuts. An ideal experimental approach would comprise a battery of tasks that vary the degree of context for action/non-action stimuli, in order to test the relative engagement of the SCN in EBA/FBA activation during action and non-action processing. We expect that manipulating the contextual information (e.g., by increasing its influence) would produce stronger activation in frontal, temporal, and insular regions than in EBA/FBA. Furthermore, if the contextual information being processed crucially requires the extraction of specific information regarding the body or body parts (e.g., imagine a task where body posture is important for disentangling the emotional state of a person), we expect that activity in EBA/FBA will be enhanced as much as in the other regions of the SCN.
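The battery of tasks outlined above can be written down as a fully crossed design. The following is only a minimal sketch under stated assumptions: the factor levels (three degrees of context, a binary "body information required" factor), the function names, and the "reference" label for the no-context cells are hypothetical illustrations, not part of the proposal. The sketch encodes only the qualitative predictions stated in the text and no effect sizes.

```python
from itertools import product

# Hypothetical factor levels: the text names the factors, not their levels.
STIMULUS_TYPES = ("action", "non-action")
CONTEXT_LEVELS = ("none", "weak", "strong")
BODY_INFO_REQUIRED = (False, True)


def build_design():
    """Cross all factors into a list of condition dictionaries."""
    return [
        {"stimulus": s, "context": c, "body_info_required": b}
        for s, c, b in product(STIMULUS_TYPES, CONTEXT_LEVELS, BODY_INFO_REQUIRED)
    ]


def predicted_pattern(condition):
    """Return the qualitative prediction for one condition.

    Treating the no-context cells as a reference level is an assumption
    made for this sketch; the text does not define a baseline.
    """
    if condition["context"] == "none":
        return "no contextual modulation (reference)"
    if condition["body_info_required"]:
        # Context that depends on body posture or body parts.
        return "SCN and EBA/FBA both enhanced"
    # Context that does not depend on body information.
    return "SCN enhanced more than EBA/FBA"


if __name__ == "__main__":
    for cond in build_design():
        print(cond, "->", predicted_pattern(cond))
```

Listing the cells in this way makes explicit which contrasts bear on the two predictions: context level within the cells where body information is not required, and the comparison of EBA/FBA with the SCN regions within the cells where it is.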
 
In brief, we suggest that action meaning goes beyond EBA and FBA through the integration of contextual information processed by a distributed fronto-insular–temporal network. Moreover, action meaning is not an amodal, invariant, immutable representation in a brain area, but instead a polymodal, context-sensitive, constructive, and distributed process. Similar to context integration during visual object recognition (Bar, 2004), information about body appearance and posture in EBA/FBA should be integrated within an SCN in order to process action meaning. We propose a multimodal system of action meaning in which expectations (frontal areas) about external information (including body processing in EBA/FBA) interact with their semantic associations (temporal regions) and the current internal motivational states (insula) to yield the specific significance of an event. Thus, a context–facilitation large-scale distributed neural network may process and influence EBA/FBA activity in a top-down manner.


Acknowledgments
This work was supported by CONICET and FINECO grants. The authors thank the Editor and Dr. Carlos Gelormini Lezama for their helpful comments.

Figure 1. Lateral view of the left hemisphere showing the proposed fronto-insular–temporal network (light-blue, violet, and green regions of interest, respectively). In this context network, prefrontal areas (PFC) such as frontopolar and dorsolateral prefrontal cortices would be involved in the generation of focused predictions via the update of associative activation of representations in the specific context. The insular cortex (IC) would provide the convergence point for emotional and cognitive states related to the coordination between external and internal milieus, facilitating the fronto-temporal interaction in social context processing. Finally, target-context associations stored in the temporal regions (TR) would be integrated with feature-based information processed in frontal regions. EBA and FBA (colored in red) would contribute to this larger network by making explicit lower-level perceptual information regarding body posture and body shape. Connected nodes represent the fronto-insular–temporal interactions. Black arrows show the top-down contextual modulation of activations in EBA and FBA and the bottom-up contribution of these latter regions to the SCN.

References
Aravena, P., Hurtado, E., Riveros, R., Cardona, F., Manes, F., and Ibáñez, A. (2010