Original Research ARTICLE
Why is that hammer in my coffee? A multimodal imaging investigation of contextually based tool understanding
- School of Applied Physiology, Georgia Institute of Technology, Atlanta, GA, USA
Appropriate tool–object pairing is a natural part of our lives. When preparing to stir coffee, we know that a hammer is useful for some tasks but that it is not appropriate in this behavioral context. The neural correlates of this context–tool pairing process remain unclear. In the current work, we used event-related electroencephalography (EEG) and functional magnetic resonance imaging (fMRI) to determine neural correlates for differentiating contextually correct and incorrect tool use. Subjects were shown images depicting correct (e.g., spoon used to stir coffee) or incorrect (e.g., hammer used to stir coffee) tool use. We identified distinct regional and temporal activations for identifying incorrect versus correct tool use. The posterior cingulate, insula, and superior temporal gyrus preferentially differentiated incorrect tool–object usage, while occipital, parietal, and frontal areas were active in identifying correct tool use. Source localized EEG analysis confirmed the fMRI data and showed phases of activation, where incorrect tool-use activation (0–200 ms) preceded occipitotemporal activation for correct tool use (300–400 ms). This work extends our previous findings to better identify the neural substrate for contextual evaluation of tool use, and may contribute to our understanding of neurological disorders resulting in tool-use deficits.
Regions of the brain showing activation to tools as a distinct class of objects have been well characterized; viewing tools activates regions surrounding the temporal–parietal–occipital junction, extending along the temporal cortex and into parietal and frontal areas (for review, see Lewis, 2006). This suggests the existence of a network, primarily along the ventral visual stream (Milner and Goodale, 2008), for the identification and conceptual understanding of tools. Further, work in humans and monkeys shows a link between neural activations for tool viewing and information about actions associated with their use (Rizzolatti et al., 1998; Chao and Martin, 2000), and that viewing manipulable objects may induce a simulation of potential actions which may be performed with those objects (Tucker and Ellis, 1998).
Recognition of a particular tool for a specific task, and understanding that a given tool will not work in all tasks, is part of daily life. Comprehending appropriate and inappropriate tool–object associations may be related to knowledge of a specific tool’s function, thus involving both perception and action (Milner and Goodale, 2008). The action-related stream, including the posterior parietal cortex, is active even when no overt motor action is required, such as recognition of graspable versus non-graspable objects (Hattori et al., 2009) and manipulable versus non-manipulable objects (Mecklinger et al., 2002). Part of this action knowledge is related to understanding when and when not to use a particular tool. For example, to eat we recognize the advantage of using a fork instead of a key. This suggests knowledge of usability of a tool (e.g., fork) on specific objects (e.g., food) but not on all other objects (e.g., lock, nail, etc.).
The posterior parietal cortex is related to both action knowledge (e.g., motor) and internal representation of actions (e.g., cognition; Creem-Regehr, 2009). Our own work has identified a left-hemisphere network for differentiation of matching and mismatching tool–object pairings (Mizelle and Wheaton, 2010b). Specifically, activation was seen in superior temporal cortex, insula, posterior cingulate, and precuneus in distinguishing mismatching versus matching tool–object pairs. This network was not active in evaluation of matching versus mismatching environmental image pairings, thus strongly suggesting a discrete network for detecting incorrect tool–object relationships. Others have evaluated the understanding of tool similarity based on action relatedness (comparing tools used in the same way) or functional relatedness (comparing tools used in the same context; Canessa et al., 2008), and highlighted the importance of retrosplenial and inferotemporal cortex in understanding functional properties of tools. Further, processing of information related to object manipulation and function has been shown to engage similar neural structures (Boronat et al., 2005), suggesting the conceptual identification of objects may involve binding the perceptual input (e.g., identity) and output (e.g., usage) representations of those objects. Others using ERP analyses have identified the N400 effect in response to identification of anomalous tool use (Sitnikova et al., 2003, 2008). Similarly, this response has been seen in extracting movement-related semantic information, such as identifying the incorrect conclusion of an action sequence (Reid and Striano, 2008) and in determining uncooperative hand–hand interactions (Shibata et al., 2009).
Despite the literature reviewed above on the relationship of perceptual and functional aspects of tools, little is known of how we determine the conceptual “correctness” of tool–object interactions. As an example of a tool being used in a motorically plausible (e.g., can physically be used) but conceptually incorrect scenario (e.g., not the most appropriate tool), a person could use a hammer to stir coffee. One recent study examined brain activation as subjects evaluated spatiomotor versus functional qualities of tool–object interactions (Bach et al., 2010). However, this type of judgment is distinct from asking if a particular tool is correct or most effective for a particular behavioral goal. The current work specifically focuses on the neural activations for understanding contextually correct and incorrect tool–object interactions.
Our current study was designed to specifically evaluate the neural mechanisms for the conceptual understanding of tool–object interactions. As an advancement from our previous work (Mizelle and Wheaton, 2010b,c), here we used functional magnetic resonance imaging (fMRI) and electroencephalography (EEG) to record neural activations as subjects identified whether tools were used in correct or incorrect contexts. A role is suggested for the ventral stream in providing semantic/contextual information to parietofrontal areas prior to interaction with a tool or object (Creem and Proffitt, 2001b; Valyear and Culham, 2010). In our previous work, a distinct temporal–insula–precuneus–cingulate network was engaged in differentiating matching from mismatching tool–object pairings (Mizelle and Wheaton, 2010b). However, this study used relatively simple line drawings to evaluate the functional congruence of tool–object pairs, and did not specifically identify the contextual nature of using tools. We address this in the current study by using high-resolution static photographs depicting implied action of hand-held tools interacting with objects. These interactions were either correct or incorrect based on the context of the tool–object interaction.
Given the previous work described above, we expect activation of the traditional parietofrontal “tool use” network for identification of contextually correct tool use, but primary activations at temporal areas, insula, cingulate cortex, and cuneus/precuneus for identification of incorrect contextual tool use. As we are using EEG to augment our fMRI analyses, we will also have the ability to determine temporal activation differences. Specific to EEG, we expect to see regional/temporal differentiation in the current work, where ventral areas (identified above) will show earlier activation differences for incorrect over correct contextual tool use and later differences will be seen at dorsal areas for correct over incorrect contextual tool use.
Fifteen right-handed healthy subjects (nine females, 25.6 ± 2.8 years of age) participated in both the fMRI and EEG parts of this study. Handedness was confirmed using the Edinburgh Handedness Inventory (Oldfield, 1971). Informed consent was obtained from all participants according to Georgia Institute of Technology human subjects Institutional Review Board guidelines prior to participation in the study. Each subject completed fMRI and EEG recording sessions while evaluating pictures for the contextual correctness of tool use. For both fMRI and EEG, identical images were used. Pictures were high-resolution grayscale images of a right hand holding various tools in a correct orientation (e.g., hammer held by handle) being used in a correct (e.g., hammer used to drive a nail) or incorrect (e.g., hammer used to stir coffee) context. As control, images of tools alone (e.g., hammer lying on a table) were shown. The details of each experimental session will be described below, and are outlined in Figure 1. As these were static images, no auditory stimulation was delivered to the subjects.
Figure 1. (A) Experimental design for fMRI sessions. Six runs of functional scans were conducted with approximately 1 min of rest between each. Within each run, 24 images were presented (eight correct, eight incorrect, and eight tool-only) with a 2-s duration. Fixation crosses (6 or 8 s duration) were presented between images. (B) Experimental design for EEG sessions. Within each of two blocks of recording, 75 images were presented (25 correct, 25 incorrect, 25 tool-only). Each image was preceded by a fixation circle (4–6 s duration) and alerting cue (500 ms duration).
MRI Methodology and Analysis
All MRI data were acquired on a 3-T Siemens Trio MRI scanner with a 12-channel head coil. T2*-sensitive functional imaging was performed using a gradient-echo echo-planar imaging (EPI) sequence (time to echo [TE] = 30 ms, time to repetition [TR] = 2000 ms, 90° flip angle, field of view [FOV] = 204, 68 × 68 in-plane matrix, 37 axial 3 mm thick slices with 10% slice gap). For B0 unwarping, echo spacing was 0.49 ms and phase encoding was A > P (“y-”). The scanned area covered the entire cortex and most of the cerebellum. To obtain a structural three-dimensional volume, T1-weighted images were acquired using an MP-RAGE sequence (TI = 850 ms, TR = 2250 ms between shots, TE = 3.98 ms, 9° flip angle, FOV = 256 mm × 256 mm, 176 1 mm sagittal slices, 256 × 256 matrix).
Figure 1A depicts the fMRI experimental protocol. Each subject viewed six series of images during the functional scanning session. Each series of images contained eight images of correct tool use, eight images of incorrect tool use, and eight images of tools alone. Images were pseudorandomly ordered within and across runs, and were presented with pseudorandom inter stimulus intervals of 6 or 8 s (mean of 7 s). A black fixation cross was presented to the subjects between images. Each image was presented for 2 s, and subjects were instructed to silently evaluate the image for content (e.g., whether the image showed tool use or a tool alone), and, if a tool-use image, to determine whether the image showed contextually correct or incorrect tool use. See Appendix for a complete list of tool–object combinations.
Preprocessing and statistical analysis of imaging data were performed using the Statistical Parametric Mapping Software package (SPM8; Wellcome Department of Imaging Neuroscience, London, UK) implemented within MATLAB (MathWorks, Sherborn, MA, USA). Functional image volumes were corrected for slice timing skew using temporal sinc interpolation and realigned to the first acquisition using rigid-body transformation. The mean image of the realigned images was spatially normalized to the SPM-standard EPI template. This SPM template is in Montreal Neurological Institute (MNI) space (http://www.bic.mni.mcgill.ca) and approximates the standard stereotaxic space of Talairach and Tournoux (1988). Linear and non-linear deformation parameters estimated during this step were subsequently applied to all realigned EPI volumes of the corresponding time series. Spatially normalized images were then smoothed using a Gaussian kernel (8 mm full-width at half-maximum). Low frequency drifts were removed with a temporal high-pass filter (0.006 Hz).
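The smoothing and filtering parameters above are easier to interpret after two small conversions. The sketch below is not part of the SPM8 pipeline used in the study; it only converts the 8 mm FWHM kernel to a Gaussian standard deviation and the 0.006 Hz high-pass cutoff to its equivalent period. The function names are hypothetical.

```python
import math


def fwhm_to_sigma(fwhm_mm):
    """Convert a Gaussian full-width at half-maximum to its standard deviation."""
    return fwhm_mm / (2.0 * math.sqrt(2.0 * math.log(2.0)))


def cutoff_period_s(cutoff_hz):
    """Return the period (in seconds) corresponding to a filter cutoff frequency."""
    return 1.0 / cutoff_hz


# Values taken from the preprocessing description above.
sigma_mm = fwhm_to_sigma(8.0)
period_s = cutoff_period_s(0.006)
print(f"sigma = {sigma_mm:.2f} mm, high-pass period = {period_s:.1f} s")
```

Under these conversions, the smoothing kernel has a standard deviation of about 3.4 mm, and the high-pass filter removes signal fluctuations slower than roughly 167 s.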
Realigned, spatially normalized and smoothed T2*-weighted EPI images were analyzed using SPM8 in the framework of the general linear model (Friston et al., 1995). Both first- and second-level analyses were performed. In the first-level (fixed-effects) analysis, data were analyzed for each individual subject separately on a voxel-by-voxel basis using the principles of the general linear model extended to allow the analysis of fMRI data as a time series. Subject-specific event-related responses to the onset of the presentation of the object pictures were examined using the general linear model approach. In this procedure, regressors corresponding to correct tool use, incorrect tool use and tools alone were estimated based on the canonical hemodynamic response function, and six rigid body model parameters related to translation and rotation of the head along the three principal axes were included as additional regressors. Contrasts representing the effect of identifying correct tool use, incorrect tool use and tools alone were defined (versus the implicit fixation cross baseline) and contrast images were calculated for each. The calculated contrast images from each subject were used for the second-level analysis.
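To illustrate how an event-related regressor of this kind can be constructed, the following minimal sketch convolves a stick function at image onsets with a double-gamma hemodynamic response sampled at the TR. It is an assumption-laden approximation rather than the SPM8 implementation: the shape parameters (6 and 16) and undershoot ratio (6) are assumed defaults, events are modeled as instantaneous, and the function names and the example onset are hypothetical.

```python
import math


def canonical_hrf(t, peak_shape=6.0, under_shape=16.0, ratio=6.0):
    """Double-gamma response at time t (seconds); parameter values are assumed."""
    if t < 0:
        return 0.0
    peak = t ** (peak_shape - 1) * math.exp(-t) / math.gamma(peak_shape)
    under = t ** (under_shape - 1) * math.exp(-t) / math.gamma(under_shape)
    return peak - under / ratio


def build_regressor(onsets_s, n_scans, tr=2.0, hrf_length_s=32.0):
    """Convolve a stick function at the event onsets with the sampled HRF."""
    # Stick function: one impulse at the scan closest to each onset.
    sticks = [0.0] * n_scans
    for onset in onsets_s:
        idx = int(round(onset / tr))
        if 0 <= idx < n_scans:
            sticks[idx] += 1.0

    # HRF kernel sampled once per TR.
    kernel = [canonical_hrf(i * tr) for i in range(int(hrf_length_s / tr))]

    # Discrete convolution, truncated to the length of the run.
    regressor = [0.0] * n_scans
    for i, s in enumerate(sticks):
        if s == 0.0:
            continue
        for j, h in enumerate(kernel):
            if i + j < n_scans:
                regressor[i + j] += s * h
    return regressor


# Hypothetical example: a single image onset 10 s into a 20-scan run.
regressor = build_regressor([10.0], n_scans=20, tr=2.0)
peak_scan = regressor.index(max(regressor))
print(f"predicted response peaks at scan {peak_scan}")
```

In a full model, one such column would be built per condition (correct tool use, incorrect tool use, tools alone) and combined with the six motion parameters before estimation.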
Second-level (random-effects) analysis took into account two sources of variability: within-subject and between-subject. To make inferences about the population from which the subjects were drawn, individual contrast images from the first-level analysis were entered in the second-level analysis (Penny and Holmes, 2003). Analyses were performed for each image type versus fixation cross. Significance levels were set at P < 0.01, voxel-level corrected for multiple comparisons using the false discovery rate method (FDR; Genovese et al., 2002). Further, to evaluate activations for understanding the context of tool use over control images, we designed contrasts seeking greater activations for correct tool use over tools alone [correct > tool] and incorrect tool use over tools alone [incorrect > tool]. These contrast images were entered into a two-tailed t-test to evaluate ([correct > tool] versus [incorrect > tool]). The significance level for this comparison was set at P < 0.05 (voxel-level FDR corrected).
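The logic of voxel-level FDR correction can be sketched with the Benjamini-Hochberg step-up procedure, which is assumed here to correspond to the method cited above. This is an illustration only, not the SPM8 code; the function name and the example p-values are hypothetical.

```python
def fdr_threshold(p_values, q=0.05):
    """Return the largest p-value declared significant under Benjamini-Hochberg.

    Returns None when no p-value survives correction at level q.
    """
    ranked = sorted(p_values)
    m = len(ranked)
    threshold = None
    for k, p in enumerate(ranked, start=1):
        # Step-up rule: keep the largest rank k with p(k) <= q * k / m.
        if p <= q * k / m:
            threshold = p
    return threshold


# Hypothetical voxel p-values, for illustration only.
example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print(fdr_threshold(example_p, q=0.05))
```

Voxels with p-values at or below the returned threshold would be reported as significant; all others would be rejected, even if they fall below an uncorrected 0.05.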
EEG Methodology and Analysis
Subjects were seated in a chair and fitted with a standard tin 58-channel EEG cap (Electrocap, Eaton, OH, USA) to record neural activity using Synamps 2 (Neuroscan, Charlotte, NC, USA). In addition, eye movements were recorded with electrodes placed above and below the left eye to capture electrooculographic (EOG) activity. Data acquisition was performed using a right ear reference at a sampling rate of 1000 Hz and filtered to DC-100 Hz. The left ear was also recorded and used (offline) to create a linked ear reference. Subjects were seated 1.8 m (6 ft) in front of a 50.8-cm (20 in) widescreen visual display that showed the picture presentations. The display was placed in the middle of their visual field, and its height was matched to the eye of the seated subject.
Images of tools and tool–object pairs were displayed using Stim 2 (Neuroscan, Charlotte, NC, USA). Each trial started with a black circle that turned into a cross, alerting the subjects that the trial was about to begin. After 500 ms, the cross disappeared and a target image was immediately shown (Figure 1B). The target image was displayed in the center of the screen (9.5° visual angle) for 2000 ms. Stimulus presentation software forced a fixed delay of 16 ms prior to the target image, warning cue, and fixation cross, which is reflected in subsequent analysis. Subjects were explicitly told to focus on the images as presented and to actively, but silently, determine whether the images showed contextually correct or incorrect use of the tool. Subjects were instructed to avoid making any sort of verbal or behavioral response, and to try to refrain from blinking during trials. Time between trials in all conditions was pseudorandomly varied from 4 to 6 s, with a mean of 5 s.
Neural activations were recorded over two blocks of trials, each lasting approximately 15 min. Within each block, there were 25 images of correct tool use (e.g., hammer used to drive a nail), 25 images of incorrect tool use (e.g., hammer used to stir coffee), and 25 images of tools alone. To avoid ordering effects, correct, incorrect, and control trials were pseudorandomly presented with no repetitions of any tool–object pair. Correct, incorrect, and tool conditions were counterbalanced in their order across subjects and recorded separately, and each condition lasted approximately 15 min.
Each picture provided a unique marker recorded online on the EEG record. Offline, a low pass filter (30 Hz) was applied. Data were epoched from 1000 ms before the onset of the image to 2000 ms after, which includes the warning cue and the full duration of the image presentation. Time zero (0 ms) corresponds to the beginning of the epoch, and each epoch was baseline corrected over the interval from 0 to 250 ms. Based on the unique marker created in the Synamps 2–Stim 2 interface for each picture presentation, epochs were sorted into the possible trial variants (correct tool use, incorrect tool use, and tools alone). An autoregressive model was used to remove any blinks or other ocular artifact from the data based on the EOG signal (Bai et al., 2001, 2006). Any trials with residual artifact were visually identified and removed from analysis.
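The epoching and baseline correction described above can be sketched for a single channel as follows. This is a minimal illustration assuming a 1000 Hz sampling rate and a plain list of samples; it is not the acquisition software's own routine, and the function name and synthetic signal are hypothetical.

```python
def extract_epoch(signal, onset_sample, fs_hz=1000,
                  pre_ms=1000, post_ms=2000, baseline_ms=250):
    """Cut one epoch around an image onset and subtract the baseline mean.

    The baseline is the first `baseline_ms` of the epoch, i.e. time zero is
    the start of the epoch rather than the image onset.
    """
    pre = pre_ms * fs_hz // 1000
    post = post_ms * fs_hz // 1000
    start, stop = onset_sample - pre, onset_sample + post
    if start < 0 or stop > len(signal):
        raise ValueError("epoch extends beyond the recorded signal")

    epoch = signal[start:stop]
    n_base = baseline_ms * fs_hz // 1000
    baseline = sum(epoch[:n_base]) / n_base
    return [x - baseline for x in epoch]


# Synthetic single-channel signal: flat at 2.0, stepping to 5.0 at image onset.
signal = [2.0] * 1000 + [5.0] * 2000
epoch = extract_epoch(signal, onset_sample=1000)
print(len(epoch), epoch[0], epoch[-1])
```

With a 1000 Hz sampling rate, each epoch holds 3000 samples, and the first 250 of them define the baseline.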
To better understand the event-related effects of evaluating activation for identifying correct and incorrect contextual tool use across the brain, we replicated, in EEG, the fMRI analysis which specifically factored for the influence of identifying tools. We defined analysis time bins based on our previous work, which showed peak neural responses within the first 400 ms following image presentation (Mizelle and Wheaton, 2010b). Accordingly, data were analyzed within four consecutive 100 ms time bins following image onset. Analysis of these time bins using data from the original 58-channel EEG array was performed using standardized low-resolution brain electromagnetic tomography (sLORETA; Pascual-Marqui, 2002) to optimally define neuroanatomical generators. The EEG signal is related to activation of summed post-synaptic processes recorded at a high sampling rate, and has very fine temporal resolution. However, EEG is not recorded directly from the generators of this activity, causing a weaker spatial resolution. Previous attempts to derive precise localization of this activity have been troubled with non-zero error estimates and localization bias. The sLORETA method allows for unbiased localization of cortical and gray matter linear solutions, with zero localization errors (Pascual-Marqui, 2002). Activations are compared for 6239 brain matter voxels at a 5-mm spatial resolution. Analysis was performed using sLORETA in the abovementioned time bins to evaluate localized neural activations specific for identifying correct versus incorrect contextual tool use using the same contrast as in fMRI ([correct > tool] versus [incorrect > tool]). Using log-transformed t-values, significance was set at voxels corresponding to P = 0.05. Clusters of activation were restricted to have greater than five voxels (k > 5).
To verify that subjects were accurate in their assessment of correct versus incorrect tool–object interactions, each subject completed a computerized questionnaire in which they evaluated the images used in the experimental sessions. Using custom MATLAB code, each image presented during the experiment was again shown to the subject. For each image, subjects were instructed to recall their judgment of the image during the recording sessions and to indicate whether the tool–object interaction was contextually correct or incorrect via a graphical user interface (GUI). The GUI specifically asked whether the image showed CORRECT tool use, and presented subjects with YES and NO response buttons. Subject responses, via mouse input, were recorded. Responses were identified as “hits” or “misses” depending on whether subjects correctly or incorrectly identified the images, respectively. We used Fisher’s exact test to determine whether there was any effect of image category (correct and incorrect context) on subject judgment accuracy (hits and misses). Statistical significance was set at P < 0.05.
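For readers unfamiliar with the test, a two-sided Fisher's exact test on a 2 × 2 table of hits and misses by image category can be sketched as below. The function name is hypothetical, and the counts in the example are purely illustrative; they are not the data from this study.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]].

    Rows are image categories and columns are hits and misses. The p-value
    sums the probabilities of all tables with the same margins that are no
    more likely than the observed one.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x hits in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))


# Illustrative counts only (not the study's data).
print(round(fisher_exact_two_sided(3, 1, 1, 3), 4))
```

A large p-value from such a table indicates that the split between hits and misses does not depend on whether the image showed correct or incorrect tool use.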
Overall, subjects were 95% accurate in their assessment of correct versus incorrect contextual tool–object interaction. More specifically, images showing incorrect tool use had an accuracy rate of 98% while those showing correct tool use had an accuracy rate of 92%. The results of the Fisher’s exact test (P = 0.53) suggested that the proportion of hits and misses was independent of the image class (correct or incorrect context). In other words, subjects were not more or less accurate for either image category.
fMRI Activations for Correct and Incorrect Tool Use Versus Tools Alone
Similar regions showed greater activation for correct tool use and incorrect tool use over tools alone (Figure 2). For both [correct > tool] and [incorrect > tool] comparisons, primary activations were generally seen at premotor areas, inferior frontal gyrus, SPL, IPL, posterior temporal cortex, middle and inferior occipital gyri, cuneus, lingual gyrus, insula, fusiform gyrus, and cingulate gyrus.
Figure 2. fMRI activations for [correct > tool] (upper panel) and [incorrect > tool] (lower panel) are shown in red. All activations shown were statistically significant at an FDR corrected P < 0.05.
fMRI Activations for Contextual Tool–Object Understanding
We sought to identify activation specific to the conceptual understanding of correct and incorrect tool use through the analysis of [correct > tool] versus [incorrect > tool]. This analysis showed that bilateral premotor and parieto-occipital areas were active in comprehension of correct tool use (Figure 3; Table 1), while bilateral regions along the insula, superior temporal cortex, mesial prefrontal cortex, and posterior cingulate were active in comprehension of incorrect contextual tool use (Figure 3; Table 2).
Figure 3. fMRI activations for the specific comparison of [correct > tool] versus [incorrect > tool]. Differences for [correct > tool] > [incorrect > tool] are shown in red and those for [incorrect > tool] > [correct > tool] are shown in green. All activations shown were statistically significant at an FDR corrected P < 0.05. Slice images are also presented below surface renderings, and slice orientations are shown to the right of each series of slices.
Table 1. Functional magnetic resonance imaging activations specific to [correct > tool] > [incorrect > tool].
Table 2. Functional magnetic resonance imaging activations specific to [incorrect > tool] > [correct > tool].
EEG Activations for Contextual Tool–Object Understanding
To identify the time course of activations for understanding contextually correct and incorrect tool use, we repeated the [correct > tool] versus [incorrect > tool] factor analysis in EEG in time bins identified in our previous work (Mizelle and Wheaton, 2010b). Generally, waveforms showed similar characteristics to those from our previous work (Mizelle and Wheaton, 2010b; Figure 4). A brief deflection was seen following onset of the cue, and large, sustained deflections were present following onset of the image. As compared to tool-only images, these responses were larger for correct and incorrect tool use at temporal and parietal areas. Waveforms for correct and incorrect tool use diverged at two times following onset of the image (0–200 and 300–400 ms following image onset; Figure 4). This was most noticeable at bilateral temporal and parietal regions, where activation for incorrect use was greater immediately following image onset (0–200 ms) and later at occipital, parietal, and temporal regions (300–400 ms), where activation was greater for correct over incorrect tool use. Because of these factors, these time bins were chosen for statistical analysis in sLORETA. When these waveforms were subjected to analysis (Figure 5; Table 3), sLORETA showed early activation differences (0–100 ms post image presentation) exclusively for identifying incorrect over correct tool use predominantly at insula, superior temporal cortex, and anterior and posterior cingulate.
Figure 4. ERP waveforms recorded at electrodes overlying left and right temporal (A,B) and left and right parietal (C,D) regions of interest. Vertical lines within each sub-plot indicate onset of cue (left vertical line) and image (right vertical line). Conditions are indicated by different color tracings, as specified to the bottom of the figure. Arrows indicate clear divergence of waveforms in correct and incorrect conditions.
Figure 5. Source localized (sLORETA) EEG activation differences. Differences for [correct > tool] > [incorrect > tool] are shown in red and those for [incorrect > tool] > [correct > tool] are shown in green. All activations shown were statistically significant at P < 0.05. Each row corresponds to a unique contrast as indicated.
Table 3. Electroencephalography activation differences specific to [incorrect > tool] > [correct > tool] from 0 to 100 ms following image onset.
From 100–200 ms post image presentation (Figure 5; Table 4), these activation differences shifted posteriorly to cuneus, lingual gyrus, insula, and superior temporal cortex, and were still exclusive to incorrect over correct tool use. The similar spatial localization of these activations in EEG and fMRI is highlighted in Figure 6.
Table 4. Electroencephalography activation differences specific to [incorrect > tool] > [correct > tool] from 100 to 200 ms following image onset.
Figure 6. To highlight congruence between fMRI and EEG activations specific to [incorrect > tool] > [correct > tool], localized activations from both modalities are presented in slice image format at identical Talairach Z planes. fMRI activations (FDR-corrected P < 0.05) are shown along the top row and EEG activations (P < 0.05) are shown along the bottom row of the upper and lower panels. For each image, the nose is to the top of the image, and image left is anatomical left. The upper panel shows activations from 0 to 100 ms post image onset and the lower panel shows activations from 100 to 200 ms post image onset. Z coordinates are indicated below the EEG slices in the upper and lower panels.
No differences were seen at 200–300 ms. However, at 300–400 ms after image presentation (Figure 5; Table 5), activation differences exclusive for identifying correct over incorrect tool use were seen at occipitotemporal areas and cuneus.
Table 5. Electroencephalography activation differences specific to [correct > tool] > [incorrect > tool] from 300 to 400 ms following image onset.
This study was designed to evaluate a specific aspect of tool knowledge above the perception of tools alone: the neural mechanisms of understanding correct and incorrect contextual tool use. With this study, we exploited the spatial resolution of fMRI and the temporal resolution of EEG to identify not only the regional activations associated with conceptual knowledge for understanding tool use in correct and incorrect contexts, but also how this activation develops over time. Event-related fMRI analysis showed distinct activations in bilateral insula, superior temporal cortex, anterior cingulate, and posterior cingulate for tool use in incorrect contexts (Figure 3). Bilateral activations for tool use in correct contexts were seen in posterior temporal areas and occipital cortex extending along the temporal–parietal–occipital junction, superior parietal cortex, premotor areas, lateral prefrontal areas, and anterior cingulate (Figure 3). EEG results largely confirm the fMRI data, while further elaborating the temporal activation features. With analysis of EEG data focused on time bins identified through our previous work (Mizelle and Wheaton, 2010b), we observed early activations (i.e., during the first 200 ms following image onset) exclusively for incorrect over correct tool use in temporal cortex, insula, cuneus, and posterior cingulate (Figure 5). Later time windows (300–400 ms) showed occipital and temporal activity (Figure 5) for identification of correct over incorrect tool use exclusively. As such, here we provide support for our previous model of tool–object associative knowledge (Mizelle and Wheaton, 2010b), and further elaborate on the time course of regional activation in understanding tool use in correct and incorrect contexts. We also discuss a potential mechanism for conceptual apraxia, with respect to the failure of understanding appropriate tool–object interactions.
Correct Versus Incorrect Tool–Object Contextual Usage
Our primary focus in this conceptual tool-use study was in determining neural activation patterns for differentiating correct versus incorrect tool use. This was novel in the direct focus on understanding the context of tool use. When our analysis was specific to this conceptual aspect of tool–object understanding, we observed activations that differentiated correct versus incorrect contextual tool use by brain region as well as by the temporal evolution of activation differences (more below). Because subjects performed equally well in determining whether images presented correct or incorrect tool use, we are confident that activation differences accurately reflect the identification of contextual correctness.
Contextually correct tool use
The results of the correct context largely reflect areas related to tool knowledge (Lewis, 2006). While the temporal cortex is well known to contribute to vision for perception (for review, see Milner and Goodale, 2008), tool-related processing has also been reported in temporal cortex (Martin et al., 1996), and clinical studies have shown impairments of tool-related conceptual knowledge following lesions to left temporal areas (Tranel et al., 1997a, 2003).
Activations were also seen in parietal and lateral frontal areas well known to contribute to tool processing (Lewis, 2006) and which may form part of the human analog of the mirror neuron system (Rizzolatti and Sinigaglia, 2010). Previous fMRI (Beauchamp et al., 2002) and EEG (Proverbio et al., 2009) studies have identified increased cortical activation in movement-related brain areas in viewing static images which imply biologically relevant motion. As images presented in the current work also showed implied hand–tool–object interactions, similar activations were not unexpected. Further activations were seen in regions known to be involved in viewing graspable objects (posterior cingulate and parietal cortices; Grezes and Decety, 2002; Hattori et al., 2009) and images of tool-related objects (cuneus and precuneus; Creem-Regehr and Lee, 2005). Similar regional activations were seen in each condition versus fixation, suggesting that the presence of a tool in each image was a primary source of activation. As such, we are confident that our paradigm was successful in eliciting activations not only in canonical regions of the brain known to contribute to the visual perception of tools but also in brain areas which support the understanding and production of complex tool-related movements.
Contextually incorrect tool use
Some other studies have identified different regions for understanding action errors using fMRI. However, these studies were focused on different aspects of action error as compared to the current study. For example, Buccino et al. (2007) identified right temporo-parietal, left supramarginal gyrus, and mesial prefrontal cortex in observation of unintended over intended actions. The study of Manthey et al. (2003) was focused on premotor areas, and reported that left premotor cortex was preferentially engaged in analysis of objects while the right premotor cortex was engaged in analysis of movements. Here, we specifically sought to identify areas engaged in the understanding of contextual aspects of tool use, which we have previously suggested lie along ventral stream areas, including temporal cortex and insula (Mizelle and Wheaton, 2010a,b).
Unlike the findings of correct over incorrect context, incorrect over correct contextual tool use activated novel areas that lie ventral to the parietofrontal regions, as well as on the mesial brain surface, particularly the insula, superior and middle temporal cortex, posterior cingulate, and cuneus/precuneus. Here we extend our previous model, which identified a role for ventral structures in differentiating matching and mismatching tool–object relationships (Mizelle and Wheaton, 2010b).
The insula is a region with diverse functions (see Shelley and Trimble, 2004 for review) underlying the fronto-orbital, fronto-parietal, and temporal opercula. Extensive connections between the insula and many structures in frontal, parietal, and temporal cortices suggest it is well positioned to serve as a convergence and integration point for different sensory modalities. Specific to the current work, others have found the insula to serve a role in contextual understanding of visual and somatosensory stimuli (Downar et al., 2001, 2003; Paulus et al., 2005). Further, an argument has been made for the insula in deriving “body ownership” of a movement (Hallett, 2007; Karnath and Baier, 2010), and others have argued that bilateral insula is an important structure in deciding whether to act or not (Kuhn and Brass, 2009). Continuing the idea of simulating tool–object contextual usage, this previous work suggests that insula may become engaged in a decision-making fashion to derive an understanding of action in the incorrect context.
In addition to the insula, we also saw activation at superior temporal cortex, posterior and anterior cingulate, and precuneus. The superior temporal cortex is well known to be engaged in processing objects and actions associated with objects (Martin et al., 1995; Grezes and Decety, 2002; Grezes et al., 2003; Soros et al., 2003; Corina et al., 2005; Lewis, 2006; Beauchamp and Martin, 2007; Liljestrom et al., 2009; Peran et al., 2010). Anatomical connections of the superior temporal cortex suggest a role in integrating processing for the ventral and dorsal visual streams (Karnath, 2001). Importantly for the current work, the superior temporal gyrus may have a key role in understanding goal-directed action (Schultz et al., 2004). In addition, damage to superior temporal cortex has been associated with a selective loss of tool-related knowledge (Tranel et al., 1997a). This is supported by clinical studies which show that apraxic patients with lesions focused at superior temporal gyrus benefit from high levels of object affordance in gesture recognition (Barde et al., 2007). In other words, damage to superior temporal areas in the presence of spared dorsal areas specifically impacts the understanding of tool function. This relates to the superior temporal/insula activations seen in the current study during the judgment of tool use in an incorrect context, and further supports high-level visual functions in superior temporal cortex. Further, the superior temporal gyrus has also been implicated in action-related contextual processing, such as integrating previous actions and outcomes into decision-making processes (Paulus et al., 2005) as well as in judging semantic associations (Luo et al., 2003). More recently, superior temporal areas were proposed to engage in determining the contextual plausibility of observed actions (Brass et al., 2007).
Previous reports of posterior cingulate and precuneus activations also support the current findings. The posterior cingulate cortex has been associated with viewing familiar places and objects (Sugiura et al., 2005), visually guided grasping (Culham et al., 2003), viewing graspable objects (Grezes and Decety, 2002), and viewing images of tool-related objects (Creem-Regehr and Lee, 2005). Precuneus activation has been shown in the recall of memory-related visual information (Cavanna and Trimble, 2006). Recently, selective activation was shown in the precuneus for related versus unrelated word pairs, and direct versus indirect lexical associations (Sass et al., 2009), also suggesting an associative role for the precuneus. Furthermore, it has recently been suggested that precuneus activation “…provides a visuo-spatial representation of the functionally appropriate hand-tool interaction that contributes to skilled use…” (Vingerhoets, 2008), which may implicate the precuneus in tool–object contextual evaluations.
Temporal Evolution of Activations for Correct and Incorrect Tool-Use
A strength of the EEG technique is the high temporal resolution at which neural activations can be evaluated. In the current work, a complementary analysis was conducted in EEG to determine the sequence of activations contributing to the observed fMRI responses. Early activations (image onset through 100 ms) were exclusive to incorrect tool use and observed at bilateral insula and temporal areas, anterior cingulate, and posterior cingulate. From 100 to 200 ms following image onset, this activation shifted posteriorly to include cuneus, insula, and posterior cingulate, and remained exclusive to incorrect tool use. This is in keeping with our previous work (Mizelle and Wheaton, 2010b) which showed early activation differences at insula, superior temporal gyrus, posterior cingulate, and precuneus for differentiating matching and mismatching tool–object image pairs, and directly supports the results seen in our event-related fMRI analysis. At 300–400 ms post image presentation, however, activation differences were localized to occipital and temporal areas and were exclusive to correct tool use. The location of activations here is also very similar to the regions identified in fMRI analysis (Figure 6), and directly supports previous work suggesting ventral activation preceding dorsal activation (Creem and Proffitt, 2001a; Valyear and Culham, 2010).
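The sequence of source-localized EEG effects described above can be summarized in a small lookup structure. The following Python sketch is purely illustrative and is not part of the analysis pipeline: the names `Window`, `WINDOWS`, and `window_at` are hypothetical, and treating each window as half-open (start included, end excluded) is an assumption made only so that the boundaries do not overlap. No effect is listed for 200–300 ms because none is described for that interval here.

```python
from typing import NamedTuple, Optional, Tuple


class Window(NamedTuple):
    """One latency window with the condition that showed exclusive activation."""
    start_ms: int
    end_ms: int
    condition: str            # "incorrect" or "correct" tool use
    regions: Tuple[str, ...]  # regions named in the text for this window


# Windows and regions as described in the text (hypothetical encoding).
WINDOWS = (
    Window(0, 100, "incorrect",
           ("bilateral insula", "temporal areas",
            "anterior cingulate", "posterior cingulate")),
    Window(100, 200, "incorrect",
           ("cuneus", "insula", "posterior cingulate")),
    Window(300, 400, "correct",
           ("occipital areas", "temporal areas")),
)


def window_at(latency_ms: float) -> Optional[Window]:
    """Return the window containing latency_ms, or None if none is described."""
    for window in WINDOWS:
        # Assumption: half-open interval [start_ms, end_ms).
        if window.start_ms <= latency_ms < window.end_ms:
            return window
    return None
```

For example, `window_at(150)` returns the 100–200 ms window for incorrect tool use, whereas `window_at(250)` returns `None`.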
Although currently speculative, our temporal and spatial results allow us to suggest that insula and superior/middle temporal cortex may serve as a “gatekeeper,” evaluating the contextual correctness of tool–object interactions. Supporting evidence for this can also be found in behavioral studies which emphasize the influence of goal-related aspects of behavior on our ability to understand tool/object use (Bach et al., 2005; van Elk et al., 2008). Similar results have also been reported in auditory processing, where action-related sounds elicit responses at superior temporal areas prior to premotor cortex (Pizzamiglio et al., 2005).
If the tool–object relationship is determined to be contextually appropriate, no (tool-use specific) error signal arises from insula/superior temporal cortex. In this case, the parietofrontal network would then derive the adequate (task relevant) sensorimotor representation and motor plan for that tool–action goal pair. Alternatively, if the tool–object relationship is determined to be contextually inappropriate, perhaps the insula/superior temporal areas serve to generate an error signal allowing for appropriate perception of tool use error. This is in line with previous work identifying the insula as having a role in deciding whether to act or not act (Kuhn and Brass, 2009), and with the suggestion of contextual information processing within the ventral stream preceding activation of parietofrontal areas (Creem and Proffitt, 2001a; Valyear and Culham, 2010). As described below, ventral stream processing of tool–object contextual correctness may help to explain conceptual apraxia, a condition where patients seemingly have an inability to specifically identify incorrect tool–object interactions.
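The two branches of this proposed (and still speculative) gatekeeper account can be written out as a toy decision flow. The Python sketch below is a schematic of the verbal model only, not a computational model fitted to the data: the function `evaluate_tool_use`, the `Evaluation` record, and the `appropriate_tools` mapping are all hypothetical names introduced for illustration, and contextual appropriateness is reduced to a simple set lookup as a stated simplification.

```python
from dataclasses import dataclass
from typing import Dict, Set


@dataclass(frozen=True)
class Evaluation:
    """Outcome of the hypothetical gatekeeper step."""
    error_signal: bool  # True if a tool-use specific error signal is raised
    next_stage: str     # what the verbal model says happens next


def evaluate_tool_use(tool: str, goal: str,
                      appropriate_tools: Dict[str, Set[str]]) -> Evaluation:
    """Toy version of the insula/superior temporal 'gatekeeper' evaluation.

    Assumption: appropriateness is a lookup of the tool in the set of tools
    listed for the goal; the real evaluation is of course far richer.
    """
    if tool in appropriate_tools.get(goal, set()):
        # Contextually appropriate: no error signal, and the parietofrontal
        # network derives the sensorimotor representation and motor plan.
        return Evaluation(False, "parietofrontal sensorimotor planning")
    # Contextually inappropriate: an error signal supports perception
    # of the tool-use error.
    return Evaluation(True, "perception of tool-use error")


# Example contexts taken from the images described in the abstract.
CONTEXTS = {"stir coffee": {"spoon"}}
```

With these example contexts, `evaluate_tool_use("spoon", "stir coffee", CONTEXTS)` raises no error signal, whereas `evaluate_tool_use("hammer", "stir coffee", CONTEXTS)` does.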
Extension into Tool Behavior Deficits
Apraxia is a deficit commonly arising after stroke that impairs tool-related behavior. This can include performance of tool movements, selection and ordering of tools for a task, and the general understanding of tools. Conceptual apraxia manifests as the inability to select tools adequate for a particular task. Thus, selecting tools in task-driven ways is impaired though it is possible to retain functional knowledge of the tool even when removed from natural settings (i.e., tool pantomime and tool identification are intact). In conceptual apraxia, parietofrontal areas seem to function “normally,” as patients are able to successfully derive a task-specific sensorimotor representation of a tool, although the tool is contextually incorrect. For example, a patient with conceptual apraxia may be unable to identify that a fork is the most appropriate tool with which to eat a meal, yet he is able to complete the task of eating with a toothbrush. Although previous clinical work has not successfully identified a specific cerebral cortical localization for performing multi-step actions involving different tools and technical equipment (Hartmann et al., 2005) or conceptual knowledge (Heilman et al., 1997), lesion studies in patients with tool-related conceptual deficits following stroke strongly implicate a role for temporal cortex and insula in such tool–context errors (Tranel et al., 1997b, 2003, 2008; Damasio et al., 2004).
Conceptual apraxia is most commonly a result of lesion to temporo-occipital areas (Heilman and Gonzalez Rothi, 2003), and thus serves as a unique example of disrupted ventral stream information processing. As Valyear and Culham (2010) reported, “…prior to the actual use of objects, the ventral stream provides important information to specific parietal areas about how to most efficiently engage an object based on semantic knowledge about its identity, function, and how it is to be moved and used.” Here, we directly support this hypothesis, and have identified a system that may serve a crucial role in this process. Activation at these regions (e.g., insula, superior temporal areas, posterior cingulate, and cuneus) was selectively increased to incorrect tool use well in advance of temporal–parietal–occipital activation which was specific for correct tool use. In conceptual apraxia, the network specifically coding for incorrect tool use may be selectively damaged and the contextual information passed to the parietofrontal areas is incorrect. In this case, incorrect tool use becomes possible because no error signal arises in response to incorrect use perception, resulting in contextually inappropriate use of a tool.
Previous work has identified that a canonical set of (largely) left parietofrontal regions is engaged in tool use. Here, however, we were focused on a specific conceptual aspect of tool understanding – judgment of the contextual correctness of tool–object interactions. Our fMRI and EEG analyses have shown distinct regional and temporal patterns of neural activation while subjects identified whether images showed contextually incorrect versus correct use of tools. fMRI showed that primary activations for identifying incorrect tool use were found at temporal cortex and insula, while activations for correct tool use were seen along the canonical parietofrontal tool use network. Source localization analysis of EEG waveforms provided additional information about the temporal evolution of these activations; insula, temporal cortex, and cuneus were exclusively active to incorrect tool use 0–200 ms following image onset, while occipitotemporal areas were exclusively active to correct tool use 300–400 ms after image onset. These results extend previous work which suggests that tool-related contextual processing along areas of the ventral visual stream precedes sensorimotor processing in the parietofrontal tool-use network. Further, we argue that the loss of function in insula/temporal areas may be directly related to tool-use deficits seen in conceptual apraxia.
Conflict of Interest Statement
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
References
Bai, O. U., Nakamura, M., Nagamine, T., Ikeda, A., and Shibasaki, H. (2001). Blink artifact elimination in electroencephalographic records based on discrete cosine transform domain modeling. Front. Med. Biol. Eng. 11, 191–206.
Boronat, C. B., Buxbaum, L. J., Coslett, H. B., Tang, K., Saffran, E. M., Kimberg, D. Y., and Detre, J. A. (2005). Distinctions between manipulation and function knowledge of objects: evidence from functional magnetic resonance imaging. Brain Res. Cogn. Brain Res. 23, 361–373.
Canessa, N., Borgo, F., Cappa, S. F., Perani, D., Falini, A., Buccino, G., Tettamanti, M., and Shallice, T. (2008). The different neural correlates of action and functional knowledge in semantic memory: an fMRI study. Cereb. Cortex 18, 740–751.
Corina, D. P., Gibson, E. K., Martin, R., Poliakov, A., Brinkley, J., and Ojemann, G. A. (2005). Dissociation of action and object naming: evidence from cortical stimulation mapping. Hum. Brain Mapp. 24, 1–10.
Culham, J. C., Danckert, S. L., DeSouza, J. F., Gati, J. S., Menon, R. S., and Goodale, M. A. (2003). Visually guided grasping produces fMRI activation in dorsal but not ventral stream brain areas. Exp. Brain Res. 153, 180–189.
Downar, J., Crawley, A. P., Mikulis, D. J., and Davis, K. D. (2001). The effect of task relevance on the cortical response to changes in visual and auditory stimuli: an event-related fMRI study. Neuroimage 14, 1256–1267.
Hartmann, K., Goldenberg, G., Daumuller, M., and Hermsdorfer, J. (2005). It takes the whole brain to make a cup of coffee: the neuropsychology of naturalistic actions involving technical devices. Neuropsychologia 43, 625–637.
Mecklinger, A., Gruenewald, C., Besson, M., Magnie, M. N., and Von Cramon, D. Y. (2002). Separable neuronal circuitries for manipulable and non-manipulable objects in working memory. Cereb. Cortex 12, 1115–1123.
Mizelle, J. C., and Wheaton, L. A. (2010c). Testing perceptual limits of functional units: are there “automatic” tendencies to associate tools and objects? Neurosci. Lett. doi:10.1016/j.neulet.2010.11.009. [Epub ahead of print].
Paulus, M. P., Feinstein, J. S., Leland, D., and Simmons, A. N. (2005). Superior temporal gyrus and insula provide response and outcome-dependent information during assessment and action selection in a decision-making situation. Neuroimage 25, 607–615.
Penny, W. D., Holmes, A. P., and Friston, K. J. (2003). “Random effects analysis,” in Human Brain Function, 2nd Edn, eds R. S. J. Frackowiak, K. J. Friston, C. Frith, R. Dolan, C. J. Price, S. Zeki, J. Ashburner, and W. D. Penny (New York: Academic Press), 12–22.
Peran, P., Demonet, J. F., Cherubini, A., Carbebat, D., Caltagirone, C., and Sabatini, U. (2010). Mental representations of action: the neural correlates of the verbal and motor components. Brain Res. 1328, 89–103.
Pizzamiglio, L., Aprile, T., Spitoni, G., Pitzalis, S., Bates, E., D’Amico, S., and Di Russo, F. (2005). Separate neural systems for processing action- or non-action-related sounds. Neuroimage 24, 852–861.
Proverbio, A. M., Riva, F., and Zani, A. (2009). Observation of static pictures of dynamic actions enhances the activity of movement-related brain areas. PLoS ONE 4, e5389. doi: 10.1371/journal.pone.0005389.
Schultz, J., Imamizu, H., Kawato, M., and Frith, C. D. (2004). Activation of the human superior temporal gyrus during observation of goal attribution by intentional objects. J. Cogn. Neurosci. 16, 1695–1705.
Sitnikova, T., Holcomb, P. J., Kiyonaga, K. A., and Kuperberg, G. R. (2008). Two neurocognitive mechanisms of semantic integration during the comprehension of visual real-world events. J. Cogn. Neurosci. 20, 2037–2057.
Soros, P., Cornelissen, K., Laine, M., and Salmelin, R. (2003). Naming actions and objects: cortical dynamics in healthy adults and in an anomic patient with a dissociation in action/object naming. Neuroimage 19, 1787–1801.
Sugiura, M., Shah, N. J., Zilles, K., and Fink, G. R. (2005). Cortical representations of personally familiar objects and places: functional organization of the human posterior cingulate cortex. J. Cogn. Neurosci. 17, 183–198.
Tranel, D., Logan, C. G., Frank, R. J., and Damasio, A. R. (1997b). Explaining category-related effects in the retrieval of conceptual and lexical knowledge for concrete entities: operationalization and analysis of factors. Neuropsychologia 35, 1329–1339.
Keywords: conceptual tool use, temporal cortex, insula, EEG, fMRI
Citation: Mizelle JC and Wheaton LA (2010) Why is that hammer in my coffee? A multimodal imaging investigation of contextually based tool understanding. Front. Hum. Neurosci. 4:233. doi: 10.3389/fnhum.2010.00233
Received: 25 August 2010;
Accepted: 13 December 2010;
Published online: 29 December 2010.
Edited by: Maryse Lassonde, Université de Montréal, Canada
Copyright: © 2010 Mizelle and Wheaton. This is an open-access article subject to an exclusive license agreement between the authors and the Frontiers Research Foundation, which permits unrestricted use, distribution, and reproduction in any medium, provided the original authors and source are credited.
*Correspondence: Lewis A. Wheaton, Cognitive Motor Control Laboratory, School of Applied Physiology, Georgia Institute of Technology, 281 Ferst Drive, Atlanta, GA 30332-0356, USA. e-mail: email@example.com