Switching Modalities in a Sentence Verification Task: ERP Evidence for Embodied Language Processing

In an event-related potential (ERP) experiment using written language materials only, we investigated a potential modulation of the N400 by the modality switch effect. The modality switch effect occurs when a first sentence, describing a fact grounded in one modality, is followed by a second sentence describing a fact grounded in a different modality. For example, “A cellar is dark” (visual) was preceded either by another visual property (“Ham is pink”) or by a tactile property (“A mitten is soft”). We also investigated whether the modality switch effect occurs for false sentences (“A cellar is light”). We found that, for true sentences, the ERP at the critical word “dark” showed a significantly greater frontal, early N400-like effect (270–370 ms) when there was a modality mismatch than when there was a modality match. This pattern was not found for the critical word “light” in false sentences. Results similar to the frontal negativity were obtained in a late time window (500–700 ms). The obtained ERP effect is similar to one previously obtained for pictures. We conclude that in this paradigm we obtained fast access to conceptual properties for modality-matched pairs, which leads to embodiment effects similar to those previously obtained with pictorial stimuli.

This is how we understand connected discourse (Zwaan, 2004). For Glenberg (1997; Glenberg and Robertson, 1999, 2000), the key issue is that we use the perceptual symbols to derive affordances, in the sense of Gibson (1986), for the specific situation. Understanding a sentence is the result of meshing these affordances, which is guided by the syntax of the sentence.
Evidence for the modal grounding of conceptual and linguistic representations has been found using a variety of techniques and tasks. Only a few key findings relevant to the current experiment will be reviewed here. Goldberg et al. (2006) measured fMRI BOLD responses while participants performed a blocked property verification task. Participants had to press a button for each word that had the property "green" (visual), "soft" (tactile), "loud" (auditory), or "sweet" (gustatory). The results for visual and tactile decisions showed increased activation in visual and somatosensory cortex compared to control, which supports the notion of modal grounding.
Using a behavioral measure and the same paradigm, Pecher et al. (2003) established that there is a cost to switching modalities. They presented participants with short sentences that consisted of a concept followed by a modal property (they used audition, vision, taste, smell, touch, and action). For example, after reading "blender can be loud," participants were asked to decide whether "loud" is a typical property of "blender." Crucially, half of the experimental trials were preceded by a trial of the same modality (matched modality; "leaves can be rustling" - "blender can be loud") while the other half were preceded by a trial of a different (mismatched) modality (e.g., "cranberries can be tart" - "blender can be loud"). Participants were able to verify the properties of the concepts faster and more accurately in matched modality trials than in mismatched modality trials. Similar modality switch effects have been found in other studies across both conceptual and perceptual processing tasks (e.g., Spence et al., 2001; Marques, 2006; Vermeulen et al., 2007; Van Dantzig et al., 2008).

Introduction
The idea that our conceptual system is grounded in modality-specific or embodied simulations has received support from many different areas of research, including psychology, neuroscience, cognitive modeling, and philosophy (Lakoff and Johnson, 1999; Gibbs, 2005; Pecher and Zwaan, 2005, for reviews). The suggestion that modality-specific simulation also affects language processing has been put forward by a number of authors (Glenberg, 1997; Barsalou, 1999; Glenberg and Robertson, 1999, 2000; Zwaan, 2004; Zwaan and Madden, 2005). For example, Barsalou's (1999) theory of perceptual symbol systems suggests that modality-specific simulations arise from perceptual states and that these (simulated) states underlie the representation of concepts. Hence, all conceptual symbols are grounded in modality-specific states. Linguistic symbols develop alongside the perceptual symbols they are linked to, so that when we use or encounter words, we simulate the perceptual states that are linked to the linguistic information. Such a source of perceptual state simulations is called a simulator by Barsalou (1999). Zwaan and Madden (2005) similarly assume that language is grounded in perception and action via something akin to Barsalou's (1999) perceptual symbols. However, they focus specifically on how language guides the simulators. They assume that what we simulate is based on attentional frames (Langacker, 2001). In particular, within one attentional frame we construct a "construal": a simulation that includes time, spatial information, perspective, and a focal and background entity (for details see Zwaan and Madden, 2005). Furthermore, during construal, information from previous construals forms the context with which we integrate the information from the current construal. 
From this set of concepts with salient modality features, we created true statements such as "A cellar is dark" (visual). As in Pecher et al. (2003), sentences were presented one by one and appeared, to the participants, to be unrelated. However, the critical manipulation was that sentences that followed each other were either matched in the salient modality (e.g., visual-visual, "Ham is pink" - "A cellar is dark") or mismatched in the salient modality (e.g., tactile-visual, "A mitten is soft" - "A cellar is dark"). We crossed modality with veracity by making half of the experimental target sentences false, while maintaining the same modality information ("A cellar is light").
We will first review the links between property verification and sentence verification, and then discuss previous findings on veracity. As mentioned above, Pecher et al. (2003) asked participants to perform a conceptual property verification task for statements such as "blender/can be/loud" (slashes indicate line breaks on the computer monitor). Participants were asked to verify that the property (always shown on the third line) is "usually true" of the concept (always shown on the first line) and had to respond with a true or false response. In the current study, we changed the task to sentence verification. Sentence verification is a similar but more general task and has been used extensively in the early sentence processing literature (for a review see Carpenter and Just, 1975) and in event-related potential (ERP) experiments (e.g., Fischler et al., 1983). In this task, sentences are presented and participants respond with a true or false judgment at the end of the sentence. Some items are almost identical between tasks ("A blender can be loud"); others can only be used in sentence verification ("A baby drinks milk"). In our version of sentence verification, the words are presented one by one in the middle of the screen, which leads to a relatively natural reading experience while avoiding eye movements. Using the sentence verification task, the typical finding is that false sentences take longer to verify than true sentences (e.g., Fischler et al., 1983).
The majority of the response time literature on veracity investigates the time to decide whether a sentence is a true or false description of a corresponding picture ("The dots are red" with a picture showing either red or blue dots). In this situation, true sentences have consistently been shown to be verified faster than false sentences (for example, Trabasso et al., 1971; Clark and Chase, 1972; Wason, 1980). The primary explanation for this is that readers match the color red to the color of the dots. When these are congruent, readers are facilitated; when the colors are incongruent, there is a slowdown (Carpenter and Just, 1975; see also Fischler et al., 1983).
In this paper, we will try to obtain further empirical evidence for an embodied approach and we will discuss how an embodied language comprehension system can explain the current and past findings through the process of simulation. In an embodied view, determining the veracity of a statement depends on the outcome of a simulation and the comprehension process should be modulated by direct or indirect effects of simulation.
The reaction time effects of modality switching are quite subtle, so we decided to use a more sensitive technique for this study. To explore the processing dynamics of modality switching, veracity, and their interaction, we will look at the presence and significance of modulations in the ERP. If embodied simulation is an automatic process that occurs when we understand language, evidence of modality switching, veracity effects, and their interaction should be evident in ERPs. Predictions regarding modality switching are discussed below, followed by veracity predictions.
If the mental simulations that are required for understanding involve the premotor areas, keeping these areas otherwise involved should interfere with language comprehension. This has indeed been demonstrated, for example by Zwaan and Taylor (2006), who found that reading about an action which involves clockwise turning (e.g., increasing the volume on a radio) interfered with the action of turning a knob counterclockwise. More abstractly, Glenberg and Kaschak (2002) showed that reading a sentence which involves transferring an object or information away from the participant ("You told Liz the story") interfered with that participant pressing a response button which was located toward their body as compared to a button which was located away from their body.
Evidence from tasks not involving large physical movements comes from sentence-picture verification tasks. For example, a sentence such as "John pounded the nail into the floor" was followed by a picture of a nail (Stanfield and Zwaan, 2001). Response times were faster when the picture matched the orientation implied in the sentence (vertical) compared to when there was a mismatch in orientation (see also Zwaan et al., 2002).
What is striking about the behavioral studies described here is the number of innovative tasks and procedures that were created in order to show that concepts and language are grounded in bodily states. While these and other sets of studies form a convincing body of literature, one might ask how effects related to embodied cognition would show up in tasks that are more standardly used within language comprehension research. If simulating linguistic or conceptual material in terms of our bodily states is the norm, we should see evidence of it in any standard task, if properly designed and analyzed. This is important because it could be argued that the use of tasks involving movement and pictures encouraged participants to use an imagery-based strategy (e.g., Glucksberg et al., 1973), which would make the embodiment results specific to the tasks used in this field. Of course, exceptions already exist: Results from neuroimaging studies in which participants read either single words or sentences referring to bodily actions support the embodied view by showing increased activation in premotor and sometimes primary motor areas of the cortex (for example, Hauk et al., 2004; Boulenger et al., 2009). Recent findings using the sentence-picture verification task also suggest that the results are not due to the use of imagery as a strategy (Pecher et al., 2009), and the study by Pecher et al. (2003) used solely linguistic stimuli, albeit in a slightly unnatural task. If embodied simulation is part of everyday language comprehension, we should be able to find evidence for it using standard language comprehension techniques that do not involve pictures or movements. In the current study we therefore use the well-studied sentence verification task (Meyer, 1970). Before discussing results related to this task, we will briefly outline our experiment to frame the discussion below.
The materials of the current study were adapted from the design used by Pecher et al. (2003) to a sentence verification task. We drew our materials from items that had previously been rated as having either a salient visual or a salient tactile property (Pecher et al., 2003; Van Dantzig et al., 2008; Lynott and Connell, 2009; Van Dantzig and Pecher, submitted; see Materials and Methods for details).

Materials and Methods

Participants
Sixteen native speakers of English, recruited from Canterbury Christ Church University and the University of Kent, participated in this experiment; 10 were included in the final analysis (eight females; aged 18-22, mean = 19.7). They were paid a small fee for their participation. All participants had normal or corrected-to-normal vision and normal hearing, and all were right-handed. None of the participants had any neurological impairment and none had participated in the pretests (see below). The six participants (37.5%) excluded from the final analysis were rejected for the following reasons: excessive artifacts (eye movements or excessive noise from muscle tension; two participants, see EEG Recording and Analysis below for details), technical problems with recording (one participant), reaction time error rates over 25% (two participants), and being a non-native English speaker (one participant). Ethical approval for the ERP study and the pretests was obtained from the Canterbury Christ Church University Faculty Research Ethics Committee, which follows the British Psychological Society guidelines for ethics in human participant testing. All participants signed a consent form prior to participating in the ERP experiment and the pretests.

Stimulus Material and Design
The experimental materials comprised 160 pairs of sentences. Each pair consisted of a first sentence, which we will call the modality context sentence, followed by a second sentence, the target sentence. The modality context sentences were always semantically correct, true statements, which described either a salient tactile property (tactile context) or a salient visual property (visual context) of an object. We selected a subset of the items that had previously been rated as having one modality that was clearly dominant in people's perception of that item (ratings from Pecher et al., 2003; Van Dantzig et al., 2008; Lynott and Connell, 2009; Van Dantzig and Pecher, submitted). The target sentence either matched or mismatched the modality of the modality context sentence. Additionally, the target sentence could be either true or false. False versions of the target sentences were created by using a word that was rated in a pretest as the opposite of the salient feature of the object. For example, for "A cellar is dark," the word "light" was independently rated as the opposite of "dark" and was used to create the false version. The false target sentences always contained a property in the same modality as the true target sentences. By using opposites, we kept the format of the true and false sentences nearly identical.
One possible prediction for the effect of modality switching is a modulation of the N400 effect. Although often incorrectly thought of as an increased negativity that occurs only in response to semantic anomalies (e.g., "He spread the warm bread with socks/butter"; Kutas and Hillyard, 1980), a large body of research suggests that semantic anomalies are neither necessary nor sufficient to elicit an N400 effect (see Kutas et al., 2006, for a review). Instead, results show that a (small) N400 occurs as a response to each meaningful word as part of normal processing (Van Petten, 1995). The amplitude of the N400 is sensitive to many different semantic and linguistic factors [for example, cloze probability (Taylor, 1953), word frequency, word class, and discourse context]. Furthermore, with respect to veracity, a consistently larger-amplitude N400 is seen for words that change the veracity of a single sentence (at the critical word, here shown in bold for "a ham is blue" versus "a ham is pink"; Fischler et al., 1983; Hagoort et al., 2004).
Given this range of meaning effects modulating the N400 and the behavioral findings that switching modalities leads to a processing cost (Pecher et al., 2003), a reasonable expectation is that the N400 effect could be modulated by modality switching. We think it is a priori unlikely that a modality switch would trigger a sizeable N400 by itself, as "a cellar is dark" is usually a true, semantically coherent statement, even after a tactile context like "a mitten is soft." However, the N400 is sensitive to the integration of incoming semantic information into the ongoing representation: Assuming that the ongoing representation is indeed embodied, a switch in the modality may lead to an earlier effect than the N400 (because modality switching should occur before integration), and the modality switch may modulate (enhance or suppress) the N400 itself. The effect of modality switch on the N400 may not be linear, as is known to be the case for word frequency and context (Van Petten and Kutas, 1990). Specifically, one may predict that a match in modality may lead to easier simulation and therefore a reduction or absence of the N400 for integration. Alternatively, there could be an ERP effect that occurs earlier than the N400, which is specifically indicative of the simulation itself.
A second question addressed in our study is what happens when the target sentence is false or commonly false ("a cellar is light"). We know that the veracity of a sentence can modulate the N400 (Fischler et al., 1983, with a sentence verification task; Hagoort et al., 2004, with no task given). Similar N400 modulations were found using a task in which participants were required to determine whether a probe word was conceptually related to the previous sentence (for example, "flute" following "Mozart was a musical child prodigy"; Nieuwland and Kuperberg, 2008). To better understand the effects of modality switching, we investigate whether there is an interaction between effects of the veracity of the sentence and effects of the modality switch. Barsalou (1999) suggests that when a false sentence is read the simulation fails, which means that the meaning of the sentence cannot be successfully mapped onto reality. After a simulation fails, presumably a new simulation is carried out that is grounded in the failed simulation (Barsalou, 1999, p. 601). However, as in the response time literature on veracity discussed above, Barsalou (1999) discusses false sentences in a context where one compares the sentence to a situation (or picture) immediately in front of people, not what would happen when something is false based on background knowledge. Nonetheless, if false sentences lead to a failure of simulation, this may lead to a different ERP modulation depending on the point at which the simulation fails. Given that false sentences take longer to verify than true sentences, one might expect the ERP modulation related to modality switching to occur later in the time course of processing.

Pretest
In order to create false versions of the target sentences that maintained the same modality information, we decided to replace the adjectives with their opposites. For example, if the target sentence used the critical word "dark," we had people rate on a 7-point scale (7 = strongly agree): "The opposite of dark is light." These opposites do not form anomalous sentences of the kind used in the first N400 experiments (Kutas and Hillyard, 1980; Kutas et al., 2006). However, previous research has shown that people show N400 effects to sentences that are at odds with their basic world knowledge (Hagoort et al., 2004; Hald et al., 2007). By using opposites, we were able to construct an experiment without anomalous sentences and to use properties from the tactile and visual modalities for all experimental items.
We tested a total of 52 different candidate opposites to arrive at the final set of 40. In addition to looking at possible opposites of all critical words, we also included fillers of two different types in this pretest to make sure participants were using the full scale in their ratings. Twenty fillers had properties that are difficult to assign an opposite to, for example "The opposite of checkered is striped." The other 20 fillers were based on words that are related but clearly not opposites, "The opposite of clean is polished." For the visual modality, target words were often color terms (37.5% of the time). Although in one technical sense colors do have opposites (complementary colors), these opposites may not be conceived as such by ordinary language users in the same way as terms such as "dark" and "light." For that reason we tested all color word opposites (such as "Black is the opposite of white") separately, in a list with fillers that were also all color words. We encouraged the participants to use the full scale by including fillers that were related but clearly not opposites ("Magenta is the opposite of violet") and fillers that were difficult to judge ("Black is the opposite of fuchsia"). For the non-color pretest 27 native English speakers (eight males; mean age = 31), and for the color terms 37 participants (11 males; mean age = 31), completed the ratings online using SurveyMonkey. We selected the words that were rated most highly as opposites as the adjectives for the false condition. The mean rating for the non-color list was 5.75 (SD = 0.52) and for the color list 4.61 (SD = 0.74). Although the color words were rated lower (less opposite) than the non-color words, the key issue for the sentence verification task is that using these words makes the sentences false. Thus, we had a set of clearly false statements that retained the same modality as the true statements and had very similar content.
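The selection step described above can be sketched as follows. This is an illustrative reconstruction, not the authors' materials pipeline: for each critical word, pick the candidate opposite with the highest mean rating on the 7-point pretest scale. All ratings and candidate words below are invented for illustration.

```python
# Illustrative sketch (invented data): pick, for each critical word, the
# candidate opposite with the highest mean pretest rating (7-point scale).
from statistics import mean

# candidate_ratings[critical_word] -> {candidate_opposite: ratings}
candidate_ratings = {
    "dark": {"light": [7, 6, 7, 6], "bright": [5, 5, 6, 4]},
    "soft": {"hard": [7, 7, 6, 7], "rough": [4, 3, 5, 4]},
}

def best_opposite(candidates):
    """Return (opposite, mean_rating) for the highest-rated candidate."""
    scored = {opp: mean(ratings) for opp, ratings in candidates.items()}
    winner = max(scored, key=scored.get)
    return winner, scored[winner]

opposites = {word: best_opposite(c) for word, c in candidate_ratings.items()}
# e.g. "light" wins for "dark" with a mean rating of 6.5
```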

Procedure for the ERP Study
Participants were asked to fill out a questionnaire about their language and basic health background. Additionally, participants filled out a handedness questionnaire (Oldfield, 1971) and signed a consent form. Participants were tested individually in a quiet room, seated in a comfortable chair approximately 70 cm away from a computer monitor. Participants were asked to read the sentences for comprehension and decide whether each sentence was true or false. They were also asked to try not to move or blink during the presentation of the sentences on the computer screen. No other tasks were imposed.
One may wonder, however, what the effect of an opposite is. Furthermore, many of the studies looking at veracity have used opposites to create false sentences. For example, Nieuwland and Kuperberg (2008) looked at true and false sentences where all false sentences were created by using opposites (for example, "With proper equipment, scuba-diving is very safe/dangerous…") and found a typical N400 effect for false compared to true sentences (see also Hald et al., 2005, for a similar use of opposites to create false sentences).

The conditions modality match and veracity of the target sentence were fully crossed, with 40 pairs in each of the four cells. Half of these 40 target sentences were visual, the other half tactile (see Table 1 for example materials). Eighty false-false filler pairs were added to balance the number of true and false targets. The filler pairs also contained strongly modality-related properties in half of the sentences, using the tactile, visual, auditory, and gustatory modalities. The other half of the fillers were not based on modality-specific information but instead contained highly related words while conveying false information (e.g., "A ball is refereed"; see Pecher et al., 2003, for similar use of semantically related filler items).
The 240 pairs of sentences were presented in a pseudorandomized order specific to each participant (created using the program Mix; Van Casteren and Davis, 2006) in a fully within-participants design. The within-participants manipulation kept the design similar to that of Pecher et al. (2003), where matched versus mismatched modality was manipulated within participants. Furthermore, previous ERP sentence verification experiments have also used within-participants designs (Fischler et al., 1983; see also Hald et al., 2005, for a direct comparison of within- and between-participants designs using a sentence verification task).

Artifact-free epochs were retained for further analysis, with 88.25% of the epochs being included. Below, we will carry out region-specific analyses of the predicted effects, comparing anterior regions (frontal and fronto-central electrodes, also including midline electrodes Fpz and Fz and temporal electrodes FT7 and FT8) versus posterior regions (centro-parietal, parietal, and occipital electrodes, also including midline electrodes CPz, Pz, and Oz). Electrodes TP7, TP8, and POz were not included in the region analyses to balance the number of electrodes in each region.
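The region grouping above can be made concrete as follows. Only the named midline and temporal inclusions (Fpz, Fz, FT7, FT8, CPz, Pz, Oz) and the exclusions (TP7, TP8, POz) come from the text; the other electrode labels are a plausible 10-10 subset assumed purely for illustration, since the full 64-channel montage is not listed in this section.

```python
# Sketch of the anterior/posterior electrode grouping described above.
# Named inclusions/exclusions come from the text; the remaining labels
# are an assumed 10-10 subset for illustration only.
frontal = ["F3", "F4", "F7", "F8"]
fronto_central = ["FC3", "FC4", "FC5", "FC6"]
anterior = frontal + fronto_central + ["Fpz", "Fz", "FT7", "FT8"]

centro_parietal = ["CP3", "CP4", "CP5", "CP6"]
parietal = ["P3", "P4", "P7", "P8"]
occipital = ["O1"]
posterior = centro_parietal + parietal + occipital + ["CPz", "Pz", "Oz"]

excluded = ["TP7", "TP8", "POz"]  # dropped to keep the two regions balanced

assert len(anterior) == len(posterior)            # equal electrode counts
assert not set(excluded) & set(anterior + posterior)
```

Dropping TP7, TP8, and POz, as the authors describe, is what keeps the two region factors balanced for the Region analyses.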

Results
An overview of nine representative electrodes (out of 64 total electrodes) is shown in Figures 1 and 2. It is apparent that, for the true sentences (Figure 1), the Modality-Match conditions (abbreviated ModMatch; levels match and mismatch) clearly differ from each other, whereas they are visually almost identical for the false sentences (Figure 2). For the true sentences, there are clear differences between the magnitude and direction of the effects across the scalp, which led us to include an additional factor Region (levels Anterior, Posterior) in the analyses. Based on established effects from the literature and visual inspection of the peaks of the ERP waveforms, we divided the analysis into four time windows: first, a very early window (160–215 ms) to capture the N1–P2 complex; second, an early window (270–370 ms), positioned just before the classic N400 window; third, a standard N400 window (350–550 ms); and fourth, a late window (500–700 ms), which should capture any late positive shift effects.
A three-way analysis of Modality-Match, Veracity, and Region (anterior, posterior) was carried out for all time windows. This analysis was followed by additional analyses split by Veracity, exploring the existence of a ModMatch effect and/or a Region effect for the subsets of true and false sentences.

First Time Window: N1–P2 Complex, 160–215 ms
An N1–P2 complex is seen, which is typical for visual word presentation at this rate. We explored whether there was a difference between conditions in this very early time window. In the 2 × 2 × 2 analysis, a significant effect of ModMatch was found, as well as a significant interaction between Veracity, ModMatch, and Region (F values and significance levels are reported in Table 2 for easy reference; full details are in Table A1 in Appendix). We explored this interaction by computing simple-effects analyses for both levels of the Veracity condition: In the first follow-up analysis (for true sentences only), the factor ModMatch was again significant in this very early window (True-Match mean = 0.145 μV, True-Mismatch mean = 0.063 μV, difference = 0.082 μV; see Table 2 for significance levels). No significant effects were found in the second analysis (for false sentences).
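The simple-effects logic above (overall factorial analysis, then per-level follow-ups) can be sketched as follows. This is an illustrative reconstruction with invented amplitudes, not the authors' analysis code, and it substitutes a hand-rolled paired t test for one contrast in place of the repeated-measures ANOVA reported in the paper.

```python
# Illustrative simple-effects step for true sentences only: compare each
# participant's mean amplitude between modality-matched and mismatched
# trials with a paired t test. All amplitudes are invented.
from statistics import mean, stdev
from math import sqrt

# hypothetical per-participant mean amplitudes (microvolts) at frontal
# sites in the 270-370 ms window, one value per participant (n = 10)
true_match = [0.9, 1.1, 0.8, 1.3, 1.0, 0.7, 1.2, 0.9, 1.1, 1.0]
true_mismatch = [0.2, 0.5, 0.1, 0.6, 0.4, 0.0, 0.5, 0.3, 0.4, 0.3]

def paired_t(a, b):
    """t statistic and degrees of freedom for a paired-samples t test."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / sqrt(n)), n - 1

t_stat, df = paired_t(true_match, true_mismatch)
# a positive match-minus-mismatch difference at frontal sites corresponds
# to the larger anterior negativity for mismatched pairs reported above
```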

Second Time Window: Early N400-like Effects, 270–370 ms
This time window was chosen after visual inspection of the ERP waveforms to capture the majority of differences that occur over the scalp, in all conditions. Given the theoretical and observed difference between our true and false sentences (see Figure 3), separate windows for true and false sentences could have been justified but we felt this would unnecessarily complicate the analysis (we carried out post hoc analyses on a number of other time windows but these analyses did not result in a different pattern of significance).
The experimental stimuli were presented using E-Prime 2.0 (Schneider et al., 2002). The experimental session began with a practice block of 10 sentences, which were similar in nature to the experimental items. At the end of the practice block the participant had a chance to ask any questions they had about the task. The remaining sentences were split into six blocks lasting approximately 12 min each. A short break followed each block. Each block began with two filler items, which were similar in nature to the experimental items. These filler items were included to minimize loss of data due to artifacts after beginning a new block.
Each trial began with a fixation ("+++") displayed for 1 s in the middle of the computer screen. The participants were told they could blink their eyes during the fixation display, but to be prepared for the next sentence. After a variable time delay (randomly varying across trials from 300 to 450 ms), the sentence was presented word by word in white lowercase letters (Courier New, 18-point font) against a black background. The first word and any proper noun were capitalized and the final word of each sentence was followed by a period. Words were presented for 200 ms with a stimulus-onset asynchrony of 500 ms. Following the final word, the screen remained blank for 1 s, after which three question marks appeared, along with the text "1:true" and "5:false." Participants needed to press either "1" or "5" on the number keypad of a keyboard to indicate whether the sentence was true or false (half of the time, the numbers were reversed). If they responded incorrectly, "Wrong Answer" was displayed and if they took more than 3000 ms, "Too slow" was shown. Exactly the same presentation was used for context and target sentences, so that participants were not aware that sentences were presented in pairs.
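The trial timeline above can be sketched as a simple event schedule. The constants come from the text; one assumption is flagged in the comments, namely that the 1-s blank starts at the final word's offset, which the text leaves slightly ambiguous.

```python
# Sketch of the trial timeline described above (all times in ms).
FIXATION_MS = 1000   # "+++" fixation display
WORD_ON_MS = 200     # each word is visible for 200 ms
SOA_MS = 500         # stimulus-onset asynchrony between words
BLANK_MS = 1000      # blank screen before the response prompt

def trial_schedule(words, jitter_ms):
    """Return (event, onset_ms) pairs for one word-by-word trial.

    jitter_ms is the variable delay (300-450 ms across trials) between
    fixation offset and the first word.
    """
    events = [("fixation", 0)]
    t = FIXATION_MS + jitter_ms
    for w in words:
        events.append((w, t))
        t += SOA_MS
    # assumption: the 1-s blank runs from the final word's offset
    last_word_offset = events[-1][1] + WORD_ON_MS
    events.append(("response_prompt", last_word_offset + BLANK_MS))
    return events

sched = trial_schedule(["A", "cellar", "is", "dark."], jitter_ms=300)
# successive word onsets are 500 ms apart; with a 300-ms jitter the first
# word appears 1300 ms into the trial
```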
Following the experiment, the participants were debriefed and a short questionnaire was given to determine if they were at all aware of the purpose of the experiment.
In the initial 2 × 2 × 2 analysis, we obtained a significant three-way interaction between Veracity, ModMatch, and Region. The main effect of Veracity was borderline significant (F = 4.939, p = 0.053); no other effects reached significance. We explored the three-way interaction by computing simple-effects analyses for both levels of the Veracity condition. In the first follow-up analysis (for true sentences only), we found no significant effects; in the second follow-up analysis (for false sentences only), we also found no significant effects.
So although a significant three-way interaction (Veracity, ModMatch, and Region) was found, no significant effects of ModMatch and Region are found when the data are split by Veracity. However, one can also split the data by Modality-Match and look for effects of Veracity and Region. This analysis corresponds to looking for a veracity N400; the results are reported in Table 3 and Figure 4. For modality-matched sentences, no effect of Veracity or Region was found (all p > 0.25). However, for modality-mismatched sentences, a significant effect of Veracity and a significant interaction of Veracity and Region were found (see Table 3).

In the overall 2 × 2 × 2 analysis, we found a significant difference between anterior and posterior electrodes and a significant three-way interaction between Veracity, ModMatch, and Region. We explored this interaction by computing simple-effects analyses for both levels of the Veracity condition. In the first follow-up analysis (for true sentences only), the main effect of ModMatch was not significant, but the main effect of Region and the interaction ModMatch × Region were significant (see Table 2).
We further explored this two-way interaction for true sentences in a second follow-up and found that, for true sentences, a significant ModMatch effect was present both on anterior electrodes and on posterior electrodes (see Table 3 and Table A2 in Appendix). Because the ModMatch effect for anterior electrodes has a different polarity than the effect for posterior electrodes, the effects cancel out in the first follow-up analysis, but they are significant in the second follow-up. In the third follow-up analysis (for false sentences only), the factors Region and ModMatch were included but only Region was significant; this effect is not of substantive interest.

This late time window was chosen to analyze the late negativity that is apparent for true sentences on the anterior electrodes (see Figure 1). In line with the analyses above, the 2 × 2 × 2 analysis was followed by two separate statistical analyses for true and false sentences.

Discussion
True and false sentences are processed differently (Fischler et al., 1983). Our results indicate very different effects of modality for true and false sentences; see, for example, the scalp distributions in Figure 3. For true target sentences, we found a large early frontal N400-like effect for true, modality-mismatched pairs ("A mitten is soft" - "A cellar is dark") compared to true, modality-matched pairs ("Ham is pink" - "A cellar is dark") in time windows 1 (160–215 ms) and 2 (270–370 ms). In time window 1 (160–215 ms) the anterior negativity effect did not significantly interact with region. However, in time window 2 (270–370 ms) this effect interacted with region such that true, mismatched sentences elicited a larger anterior negativity than true, matched sentences. True, mismatched sentences also elicited a larger positivity at posterior sites compared to true, matched sentences. The effects of modality on the true statements were replicated in a late time window (500–700 ms). 
As with the early time windows, more negativity is seen for the true, mismatched condition as compared to the true, matched condition across the frontal electrodes. Across the posterior electrodes, more positivity is seen for the true, mismatched condition compared to the true, matched condition. For false target sentences ("A cellar is light"), no significant effects of modality were seen at any time window. This is unlikely to be due to a lack of sensitivity, as the pattern for false sentences was numerically reversed compared to the true sentences (false, matched pairs eliciting a non-significant but larger anterior negativity in the waveforms than false, mismatched pairs). We obtained one additional finding: False, mismatched sentences elicited a classical Veracity N400 in the 350-500 ms window when compared to true, mismatched sentences. This negativity interacted with region, such that it was strongest centro-posteriorly. No effect of veracity was found for the modality-matched sentences.
We will first discuss the relatively early time course of the effect and its distribution. The effect of modality-match on the frontal ERP sites begins in the first time window, as early as 160 ms, and is clearest in the second time window, around 300 ms. The presence of a modality-match effect in our earliest window (165-215 ms) indicates that modality switching is a precursor to and likely to be necessary for meaning integration. The modality-match effect develops further and becomes easily discernible across the scalp in the second time window (270-370 ms). This is the main effect of interest as the polarity of the effect reverses across the scalp, with mismatched pairs eliciting a larger anterior negativity and a larger posterior positivity. Both effects are much earlier than a standard N400 effect which typically begins around 250 ms and peaks around 400 ms (Kutas et al., 2006). In addition, the N400 typically has the strongest negativity on occipital and posterior sites. The distribution of the negativity in window 1 is mostly anterior. The distribution of the negativity in window 2 is also mostly anterior, but in window 2 we see an additional posterior positive distribution, which does not resemble the standard N400 at all.
We are not the first to find an anterior N400-like effect in an embodied context. For example, Van Elk et al. (2010) found an anterior N400 for the preparation of meaningful actions compared to meaningless actions, in a task that required participants to grasp objects. Interestingly, their N400 was largest for the preparation of meaningful actions. Holcomb et al. (1999) found that concrete words elicit a stronger anterior N400 than abstract words, an effect This corroborates the findings in the 270-370 ms time window. Also similar to those earlier findings, no substantive significant effects were obtained for false sentences.
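The window-based measures reported above can be illustrated with a short sketch. Everything below is hypothetical (the sampling rate, epoch limits, and signal are invented for illustration; the authors' actual processing pipeline is not described at this level of detail). It shows how a mean amplitude in each of the four analysis windows might be computed from a single epoch.

```python
# Illustrative sketch (hypothetical data): mean amplitude per analysis
# window, computed from one epoch of samples time locked to the critical
# word. SRATE and EPOCH_START are assumptions, not values from the paper.

SRATE = 500          # samples per second (assumed)
EPOCH_START = -100   # epoch begins 100 ms before critical-word onset (assumed)

WINDOWS = {          # analysis windows in ms relative to word onset
    "w1": (160, 215),
    "w2": (270, 370),
    "w3": (350, 500),  # standard N400 window
    "w4": (500, 700),
}

def ms_to_sample(ms):
    """Convert a latency in ms to a sample index within the epoch."""
    return round((ms - EPOCH_START) * SRATE / 1000)

def window_mean(epoch, window):
    """Mean amplitude of one epoch (a list of microvolt samples) in a window."""
    lo, hi = (ms_to_sample(t) for t in window)
    segment = epoch[lo:hi]
    return sum(segment) / len(segment)

# Hypothetical epoch: a flat 2.0 microvolt signal from -100 to 800 ms.
epoch = [2.0] * ms_to_sample(800)
print(window_mean(epoch, WINDOWS["w2"]))  # 2.0
```

In a full analysis, such window means would be computed per trial and electrode, then averaged within the anterior and posterior electrode groups before entering the ANOVA.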

Reaction time data
Participants made a true/false judgment after each sentence was presented. Although there are enough participants for the EEG analysis, the analysis of reaction times may lack the power to detect all differences. Note that Pecher et al. (2003) included 32 participants per between-subjects experimental condition, three times the number of participants in this study. The average reaction times and standard deviations are given in Table 4. One should keep in mind that, to keep the task as natural as possible, participants were not required to give a speeded response, and this generally leads to large standard deviations. Additionally, to avoid movement artifacts we used a delayed response (see Materials and Methods), which may also contribute to more variation. The means for the four conditions are very close to each other and do not differ significantly: in a ModMatch × Veracity ANOVA, we found no significant effects [Veracity F(1,9) < 1; ModMatch F(1,9) < 1; Veracity × ModMatch F(1,9) < 1]. For this analysis, we included all correct responses to target sentences and removed responses faster than 200 ms and slower than 2500 ms. Similarly, accuracy for the four conditions was very high and did not differ significantly: True-Match 94.25%; True-Mismatch 95.25%; False-Match 90%; and False-Mismatch 90%.
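The response screening described above (correct responses only, between 200 and 2500 ms) can be sketched as follows. The trial records and values are hypothetical; this is a minimal illustration of the trimming rule and the condition means, not the authors' analysis code.

```python
# Sketch of the reaction-time preprocessing described above (hypothetical
# trial records). Only correct responses with 200 <= RT <= 2500 ms enter
# the per-condition means.

def trim_and_average(trials):
    """trials: list of dicts with keys 'rt' (ms), 'correct' (bool),
    'veracity' ('true'/'false'), and 'modmatch' ('match'/'mismatch')."""
    kept = [t for t in trials if t["correct"] and 200 <= t["rt"] <= 2500]
    cells = {}
    for t in kept:
        cells.setdefault((t["veracity"], t["modmatch"]), []).append(t["rt"])
    return {cell: sum(rts) / len(rts) for cell, rts in cells.items()}

trials = [
    {"rt": 950,  "correct": True,  "veracity": "true",  "modmatch": "match"},
    {"rt": 150,  "correct": True,  "veracity": "true",  "modmatch": "match"},     # too fast: dropped
    {"rt": 2600, "correct": True,  "veracity": "false", "modmatch": "mismatch"},  # too slow: dropped
    {"rt": 1100, "correct": False, "veracity": "true",  "modmatch": "match"},     # error: dropped
    {"rt": 1050, "correct": True,  "veracity": "true",  "modmatch": "match"},
]
print(trim_and_average(trials))  # {('true', 'match'): 1000.0}
```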

Discussion
We conducted an ERP study in which participants were exposed to written sentence pairs that either matched or mismatched in modality. We looked for an effect of modality-match in true and in false sentences. Previous research suggests that true and false sentences are processed differently (Fischler et al., 1983).
The effects in windows 1, 2, and 4 are quite similar to the ERP modulation that has been found for pictures and combined sentence-picture stimuli (Barrett and Rugg, 1990; Ganis et al., 1996). In the Ganis et al. (1996) study, the relevant experimental stimuli were sentence fragments that were followed by a picture. The picture was either semantically congruent or incongruent with the sentence semantics up to that point. It was found that, on the frontal electrodes only, incongruent pictures elicited a large negative deflection between 150 and 275 ms compared to the congruent pictures. Barrett and Rugg (1990) found a similar effect, which they called the N300. This effect is similar in time course and distribution to the window 2 effect we report. Ganis et al. (1996) also found that there was a larger anterior N400-like effect for pictures than for control words, and that this effect was reversed on the posterior sites. We found that our window 2 early anterior negativity also reverses on the posterior sites. Lastly, Ganis et al. (1996) report a late congruency effect from 575 to 800 ms whereby the incongruent pictures elicit a negativity at anterior sites and a positivity at posterior sites, which is similar to the findings in our fourth window (500-700 ms). Ganis et al. (1996) suggest that their findings are specific to pictorial stimuli (see also Barrett and Rugg, 1990). However, we found a very similar effect using only language stimuli. We argue that our specific design, in which all the experimental stimuli refer to a highly salient modal (physical) aspect of an object, induces effects that are comparable in distribution and time course to those that have been obtained with pictures.
This explanation is somewhat consistent with the explanation of the so-called N700 effect proposed by West and Holcomb (2000). The N700 is very similar in time course and scalp distribution to the late effect that was obtained here in window 4. However, the N700 is sensitive to abstractness and shows a stronger anterior negativity and a stronger posterior positivity for concrete words than for abstract words. We do not want to argue that our mismatched stimuli were somehow more concrete, but the interesting parallel is that the N700 is stronger in a mental imagery task than in two other tasks (lexical decision, letter spotting). This led West and Holcomb (2000) to propose that the N700 reflects some image-based type of processing for purely linguistic stimuli.
The similarity between our results using sentences and those from previous work with pictures can best be explained in an embodied view of conceptual representation that uses simulation to arrive at semantic interpretation (Barsalou, 1999). It has been shown that reading action verbs can activate motor cortex (Hauk et al., 2004; Boulenger et al., 2009), presumably because participants were simulating the action. If our participants generated a mental simulation of the properties of the object ("A cellar is dark"), this could have produced activation that is very similar to actually seeing the object. Hence, we found effects that are very similar to those that so far have been found exclusively with picture presentation.
An embodied view of concepts would predict that there are no fundamental differences between representations derived from words and those derived from pictures, because each type of stimulus connects to underlying concepts that are grounded in modality-specific representations (in contrast to a dual coding view, such as Paivio's, 1986). Normally, access to concepts happens earlier for pictures (a long-held assumption, e.g., Caramazza et al., 1990; Schriefers et al., 1990), and effects of embodiment and modality are therefore commonly obtained with pictorial stimuli (Stanfield and Zwaan, 2001). In the paradigm used here, modality is primed and access to concepts and modality information is very fast (see also below), which leads to ERP effects that are comparable to those obtained with pictures.
The above is indirect evidence for an embodied view. We also found direct evidence for such a view in the clear ERP differences between true sentences with matched and mismatched modalities. This is also consistent with results from Collins et al. (2011): using a concept property verification task, they found that modality switching led to an increased-amplitude N400 for visual property verifications and a larger late positive complex for auditory verifications. These embodiment effects would not be predicted by models that assume that an abstract propositional representation is necessary for language comprehension in general and for the sentence verification task specifically (e.g., the Constituent Comparison model; Carpenter and Just, 1975). We note that we did not use abstract words in the current study, so the concreteness N400 cannot explain our results.
(Figure: ERPs for sentences in a mismatched modality context, time locked to the onset of the critical word at 0 ms, with negative plotted up; the standard N400 window, 350-550 ms, which is also our analysis window 3, is indicated.)
We obtained no differences in windows 1 or 2 between the modality-matched and mismatched conditions for false sentences because "light" is never an expected word: similar to the explanation above, a visual context would raise an expectation for "dark" and a tactile context for "moist." Therefore "light" is equally unexpected for both modalities, and no difference between modality-match conditions is found. How does this activation explanation fit with embodied theories of language? Barsalou (1999) discusses falsity only with regard to comparing a sentence to a given situation. For that case, Barsalou (1999) essentially suggests that a simulation of the sentence is made and compared to the scene at hand. If there is a mismatch, then the simulation fails. In our experiment, participants presumably compare the information from the simulation of the false sentence to background knowledge, as there is no scene to compare to. Following Barsalou's (1999) line of reasoning, we would conclude that simulation of the sentence fails.
However, this is an incomplete explanation of falsity, since it seems that making the simulation of the false sentence should still show a benefit of modality-match. We can explain our results more completely if it is assumed that simulation is based on our prior recent experiences (e.g., Glenberg et al., 2009) and that it never fails, but simply takes longer to complete. When trying to simulate "a cellar is light" out of context, we are unable to immediately activate the relevant perceptual/action/emotion information because we have limited experience with this. This is not to say we cannot simulate things we have no experience with, but this account would predict that such simulations take longer out of context. The modality switch effect is a small and subtle effect that cannot be observed in this case. In our experiment, that would mean that a false sentence cannot benefit significantly from the preceding modality-match in time windows 1 and 2, but we believe participants still arrive at a simulation of "the cellar is light." After all, this is what is required to understand larger discourse.
The inclusion of the false condition, and the findings we obtained for it, rule out a semantic relatedness explanation for our true sentence pairs. Under a semantic relatedness explanation, the results we find for true sentences are due not to embodiment but to simple semantic field priming. If, for example, a visual context used a color term and the following target sentence also used a color term, facilitation could be expected. There are independent arguments against this explanation: Pecher et al. (2003) provide empirical evidence against it, and semantic priming does not usually last long enough to produce such an effect (see, for example, McQueen and Cutler, 1998). However, we can also rule out this explanation from our data: the same semantic priming should have occurred in our false sentences, as word priming is not sensitive to veracity, but we found no effect for false sentences.

Veracity findings: modality-mismatched sentences
Overall, no effect of veracity was found. However, when splitting the data by modality-match versus modality-mismatch, an effect of veracity (a greater-amplitude N400 for false sentences) was seen in the modality-mismatched conditions. As already suggested, at the onset of the critical word in a false, mismatched sentence, the participant has simulated the concept cellar in the tactile modality, and the most highly activated candidate is "moist." When the critical word "light" comes in, the modality of the simulation changes, and this causes a delay, as outlined above.
The proposed similarity with pictorial stimuli makes it likely (though not certain) that the modality mismatch effects are stronger for the visual than for the tactile dimension. The idea that different modalities may lead to different modality switch effects in the ERP is supported by Collins et al. (2011), whose results indicate different ERP effects for visual and auditory verifications. Qualitative inspection of the frontal waveforms broadly supports this view, but, unlike the Collins et al. (2011) study, the current design does not have the statistical power to investigate this matter quantitatively, as there are only 20 items per cell.

ModalIty-Match fIndIngs on true sentences
We offer the following, tentative, explanation for the findings on true sentences. Although the full range of mechanisms underlying the generation of an N400 is still not fully understood, one possibility is that its amplitude reflects integration processes (Brown et al., 2000): increasing the difficulty of integration will produce a greater (more negative) modulation of the N400. Additionally or alternatively, the amplitude may serve as an indicator of the ease or difficulty of retrieving stored conceptual knowledge related to a word. The modulation may depend on the stored conceptual representation as well as on the preceding contextual information (Kutas et al., 2006). One way to integrate a word with the current discourse is to have a set of possible continuations at hand, which requires some type of prediction. In highly constraining contexts, strong predictive N400 effects have indeed been demonstrated (Van Berkum et al., 2005; see also DeLong et al., 2005). The experiment by Van Berkum et al. (2005) was conducted in Dutch, where adjectives must agree grammatically with nouns. The results showed an N400 effect to adjectives that did not agree with a strongly predicted noun.
In the current experiment, all experimental sentences refer to the visual or tactile modality, and half of the experimental sentences are in the same modality as the preceding sentence. Hence, when a visual context is followed by the target sentence "the cellar is…," participants are likely to have "dark" as the most highly activated candidate in the set of possible continuations. This prediction is derived from being in a visual context and simulating the visual experience of "cellar." When, in the true, matched condition, the word "dark" is read, it is immediately integrated in the simulation.
At the onset of the critical word in a true, mismatched sentence, the most highly activated candidate is "moist," because the participant's simulation of the concept cellar is in the tactile modality. When the critical word "dark" comes in, the modality of the simulation has to be changed, which leads to a modality switch effect and the observed anterior negativity and posterior positivity in windows 1 and 2 (160-215 and 270-370 ms). This switch takes time, as evidenced by the behavioral results of Pecher et al. (2003).
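The candidate-activation account in the last two paragraphs can be made concrete with a toy sketch. All names and numbers below are invented for illustration (the paper proposes no computational model): the context modality boosts property candidates in its own modality, so after a modality switch the critical word arrives with a lower pre-activation.

```python
# Toy illustration (all values hypothetical, in arbitrary units):
# candidates for "the cellar is..." carry a base activation, plus a boost
# when the context sentence was in the same modality as the property.

CANDIDATES = {        # property word -> modality of that property
    "dark": "visual",
    "moist": "tactile",
}
BASE, BOOST = 50, 40  # arbitrary activation units (assumed)

def activation(word, context_modality):
    """Pre-activation of a candidate word given the context modality."""
    return BASE + (BOOST if CANDIDATES[word] == context_modality else 0)

# Visual context ("Ham is pink") then critical word "dark": no switch.
print(activation("dark", "visual"))   # 90
# Tactile context ("A mitten is soft") then "dark": a modality switch;
# "moist" was the better-activated candidate, so "dark" starts lower.
print(activation("dark", "tactile"))  # 50
```

On this toy account, the lower pre-activation after a switch corresponds to the extra integration cost reflected in the anterior negativity.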

Modality-match findings on false sentences
As is clear from the scalp distribution shown in Figure 3, a very different pattern of activation was obtained for false sentences than for true sentences. In the false conditions, the target sentence is "the cellar is light," preceded by either a tactile or a visual context sentence.
By the time of the N400 window (350-500 ms), the modality of the simulation may have switched to visual, but the simulation of "light" is minimal (assuming that simulation is based on our prior recent experiences); therefore a standard veracity N400 is observed. We tentatively conclude that a delayed, minimal simulation leads to the difficulty in integrating "light" in the N400 time window.

Veracity findings: modality-matched sentences
The situation in the false, matched sentences is slightly more complex. At the onset of the critical word, the participant has simulated the concept cellar in the visual modality, and the most highly activated candidate is "dark." When the critical word "light" comes in, the modality of the simulation does not need to be changed, and a wider simulation can be done, which will arrive at "light" as a possible property of cellars; hence, no veracity N400 is observed. In other words, although simulation is delayed due to falseness, some benefit accrues from the modality-match. This benefit occurs too late to show an effect in time windows 1 and 2, but by the N400 time window the simulation is rich enough to support the processing of the critical word "light," making it less difficult to integrate. This means the modality context modulates the N400 observed for veracity. We have previously provided evidence showing that the veracity N400 can be modulated: in Hald et al. (2007), a three-sentence context introducing new (supposed) facts about the world significantly reduced the N400 effect to objectively false sentences ("Venice has many roundabouts").

Conclusion
Our results fit well with the ideas of Zwaan and Madden (2005) and Glenberg and Robertson (1999) in that both sets of authors assume that, during comprehension, we build upon simulations constructed from the previous part of the discourse to integrate the ongoing information with the current simulation (Zwaan calls this process construal). This idea applies most naturally to the comprehension of coherent discourse, but it should also apply to pairs of sentences such as our stimuli. It appears that the construction of a simulation in one modality for the context sentence can aid the simulation of the target sentence if it is in the same modality. A modality switch cost is incurred if the target sentence is of another modality, which leads to larger early anterior ERP effects.
Because the modality of previous sentences helps guide prediction, "the cellar is…" preceded by a tactile context leads to a weaker activation of "dark" than when the preceding context is visual. Guided by the tactile context, the system is looking for a tactile property of "cellar," and this will lead to a modality switch negativity in our analysis windows 1 and 2 (160-370 ms) for true sentences. We think that the mismatch effect is not observed for false sentences because the comprehension system is engaged in efforts to integrate the false information (see above). Our findings suggest that the simulation process, which is central to embodied language processing, can be predictive (in line with Barsalou, 2009) and that this process makes stronger predictions when there is no modality switch.

Authors' contribution

Appendix: full statistical reports on the ANOVAs