Stormy Seas and Cloudy Skies: Conceptual Processing is (Still) Linguistic and Perceptual

Pecher et al. (2010) extended current findings on conceptual processing of words denoting references in the sky or in the ocean (e.g., falcon, dolphin) presented in different positions on the screen (top, bottom). This experimental paradigm has been shown to facilitate processing of word meaning in the corresponding spatial location (e.g., falcon presented at the top of the screen). Pecher et al. (2010) compared two explanations for the interaction between concept and position, one related to perceptual simulation, the other to a response selection process, in a response time experiment. To test the perceptual simulation explanation, participants received an ocean or sky judgment task (“can this item be found in the sky?”, “can this item be found in the ocean?”) and used a polarity response (yes answer left, yes answer right). Polarity correspondence did not explain response times, and the authors concluded that the interaction between reference-word and position therefore needs to be explained by a perceptual simulation account. Lakens (2011) commented on the conclusions drawn by Pecher et al. (2010) and argued that it is too early to dismiss the polarity correspondence principle, a claim Van Dantzig and Pecher (2011) responded to.

I would like to take the opportunity to respond to another issue that Lakens (2011) commented on and one that also played a central role in the reply by Van Dantzig and Pecher (2011). Lakens pointed out that Pecher et al. (2010) explained their findings in terms of perceptual simulations, and seemed to dismiss linguistic processes. Lakens argued that a unified approach including perceptual and linguistic processes as an explanation for behavioral data in a cognitive task would be more fruitful. Van Dantzig and Pecher (2011) aptly pointed out that the debate on linguistic and perceptual representations was not the central point in the article. At the same time, I agree with Lakens (2011) that providing an "insight into the underlying mental representations of meaning" (Pecher et al., 2010, p. 2) and the conclusion that "people perform a mental simulation of the task-congruent location, which directs spatial attention and facilitates processing of targets in that location" (p. 11) do broaden their discussion to the importance of perceptual simulations in conceptual processing. In fact, Van Dantzig and Pecher (2011) agreed with Lakens that "it is important to investigate how linguistic processing and mental simulation contribute to language comprehension" (p. 2), but "[t]here is, however, no evidence that meaning is extracted from linguistic information" (p. 2). The question discussed in Lakens (2011) and Van Dantzig and Pecher (2011), and one that can be inferred from Pecher et al. (2010), is whether conceptual processing can be explained by perceptual simulation and/or by linguistic processes.
The issue whether Pecher et al.'s (2010) response times can be explained linguistically and/or perceptually is not the pivotal research question of that study. However, I agree with Lakens (2011) that it is an important one. I have to disagree with Pecher et al. (2010) stating: "[t]he ANOVA on RTs showed a theoretically uninteresting … interaction between category and task, F(1,98) = 73.02, p < 0.001, η² = 0.43" (p. 5). In fact, the category × task interaction is theoretically very interesting given the sheer number of studies that investigate different explanations for conceptual processing (see Louwerse, 2011 for an overview). Moreover, the effect size of this interaction is by far the largest among all effect sizes found in the paper, making the interaction also statistically interesting.
So what does this interaction between category and task involve? If participants were told to make an ocean decision, they responded faster to ocean words than to sky words, with the opposite result for the sky decision. Pecher et al. (2010) interpreted these findings as evidence for the claim that spatial attention is directed by mental simulation of the task-relevant conceptual dimension. They give two explanations for the interaction between word meaning (e.g., falcon and dolphin) and spatial position (high or low) that is reported in several studies (Zwaan and Yaxley, 2003; Meier and Robinson, 2004; Šetić and Domijan, 2007). The first explanation is that readers understand concepts by perceptual simulation. We "hear" the dolphin and see it swimming in the ocean. The second explanation for the processing advantage when meaning (ocean animal) and position (high or low) are congruent is that congruency effects lie in a response selection process rather than in meaning representation (Proctor and Cho, 2006). Pecher et al. (2010) rule out the response selection process as an explanation for the congruency effects (but see Lakens, 2011). This leaves only one of the two explanations for the congruency between word meaning and spatial position: perceptual simulations. Since Pecher et al. (2010) give no alternative explanations, we can infer perceptual simulations to be the best explanation for conceptual processing. And that is where Lakens' (2011) criticism becomes relevant: what about linguistic processes?
Pecher et al. (2010) attribute the interaction between word meaning and spatial position solely to perceptual simulations. One study that also investigated this interaction, and a study Pecher et al. (2010) refer to, is Zwaan and Yaxley (2003). They used word pairs such as attic-basement, and presented these words in a vertical configuration, attic above basement or basement above attic. Response times were faster when the configuration of the word pair matched that of its referents than when it did not. When the word pairs were presented in a horizontal configuration the effect disappeared. Like Pecher et al. (2010), Zwaan and Yaxley (2003) concluded that participants must therefore have perceptually simulated the meaning of the words. Louwerse (2008) replicated Zwaan and Yaxley's (2003) findings. In that study two predictions were made. First, perceptual relations are encoded in language. Second, these encoded linguistic cues are used by comprehenders. Evidence for the first prediction was obtained from the statistical linguistic frequencies of the experimental word pairs in a 5-gram window. The frequency of higher objects preceding lower objects proved to be significantly higher than that of the reverse order. That is, attic more frequently precedes basement in language than basement precedes attic. This is perhaps no surprise, considering that we typically say "up and down," "high and low," "top and bottom," "head to toe," etc., rather than using the reverse word order. Evidence for the second prediction was obtained when the effects of statistical linguistic frequencies and participants' iconicity ratings ("how likely is it that item x is above item y in the real world") on response times were compared. Both statistical frequencies and iconicity ratings explained response times, but statistical linguistic frequencies outperformed iconicity ratings. Moreover, in a horizontal configuration of the word pairs, statistical linguistic frequencies still explained response times (but iconicity ratings did not). These findings showed that perceptual relations are encoded in language, and that response times are affected by these linguistic cues.
Louwerse and Jeuniaux (2010) extended the Louwerse (2008) findings, and again showed that linguistic and perceptual factors both explained conceptual processing. However, their relative importance was modified by the instructional task and the stimuli. If participants were asked to make a semantic judgment on word pairs, the linguistic factor reigned supreme. If participants were asked to make an iconicity judgment on pictures, the perceptual factor reigned supreme, even though both the linguistic and the perceptual factors played some role for both linguistic and pictorial stimuli, in both semantic judgment and iconicity judgment tasks. Louwerse and Jeuniaux (2010) concluded that in shallow processing tasks (such as semantic judgments on linguistic stimuli) the linguistic system dominates, whereas in a deeper processing task (such as iconicity judgments on pictorial stimuli) the perceptual system dominates. Louwerse and Connell (2011) confirmed this conclusion, showing that the linguistic system picks up on good-enough representations and best explains faster response times, whereas the perceptual system picks up on more detailed representations and best explains slower response times. Taken together, the findings from Louwerse (2008), Louwerse and Jeuniaux (2010), and Louwerse and Connell (2011) showed that perceptual relations are encoded in language, that response times are affected by these linguistic cues, and that the role of linguistic and perceptual processes is dependent on the conceptual task and the stimuli.
Let us consider the findings reported in Pecher et al. (2010). The strongest results were obtained for the interaction between the word meaning (sky words and ocean words) and instruction (sky decision and ocean decision). The question should therefore be raised whether linguistic frequencies could have explained at least some of the variance in the response times. For instance, because participants were verbally instructed to make a judgment on ocean and dolphin or ocean and falcon, it is possible they could have relied on statistical linguistic frequencies, if ocean-dolphin is more frequent than ocean-falcon.
To test whether the statistical linguistic frequencies matched the perceptual simulation explanation (perhaps not a surprise given that perceptual relations are encoded in language), the log frequency of word pairs such as sky-dolphin and ocean-dolphin was computed. The order frequency of all of the 160 (2 × 80) word pairs within 3-5 word grams was obtained using the large Web 1T 5-gram corpus (Brants and Franz, 2006). The interaction between the words sky and ocean and the sky-words and ocean-words was significant, F(1, 154) = 39.18, p < 0.001, η² = 0.20. The frequencies are presented in Figure 1 (left), next to the RT results Pecher et al. (2010) obtained in the significant interaction, F(1, 98) = 73.02, p < 0.001, η² = 0.43 (Figure 1, right). With higher word frequencies generally yielding lower response times, the interaction pattern obtained for linguistic frequencies maps onto the interaction obtained for response times. At the very least what these results show is that a linguistic factor should not be dismissed or, in the case of Pecher et al. (2010), ignored. That is, it might be the case that by being instructed to judge whether an item could be found in the ocean (or in the sky) a semantic judgment between ocean-dolphin and sky-dolphin was made, yielding faster RTs for ocean-dolphin than for sky-dolphin. Such a linguistic explanation is supported by experimental evidence: semantic judgments of linguistic stimuli are better explained by statistical linguistic frequencies than by perceptual simulation ratings (Louwerse, 2008; Louwerse and Jeuniaux, 2010).
The statistical linguistic frequencies reported here do not demonstrate that a statistical linguistic frequency explanation replaces Pecher et al.'s (2010) conclusion that their findings indicate perceptual simulations. I do not have the response time evidence to support such a claim (even though other studies have shown that linguistic frequency often explains response times better than perceptual factors do; Louwerse, 2008). But these frequency patterns do suggest that an alternative explanation complementary to perceptual simulation should not be dismissed or overlooked. That is, statistical linguistic frequency does not address the central polarity alignment question in Pecher et al. (2010), but it could explain the largest effect size found in their data, putting the conclusion that conceptual processing is perceptual in nature in a different light.
The potential effects of statistical linguistic frequencies on response times are relevant both from an experimental and a theoretical perspective. First, from an experimental perspective, it is important to rule out the possibility that response times can also be explained by linguistic variables. From this perspective, whether or not linguistic processes generated by such variables are activated is of less importance. At the very least such "confounding" linguistic variables should be ruled out before the conclusion can be drawn that findings can be explained by perceptual simulation. Knowing that (a) perceptual information is encoded in language and (b) patterns predicted from a computational linguistic perspective match the patterns predicted from a perceptual simulation perspective makes a careful analysis including linguistic variables desirable.
Second, from a theoretical perspective it is important to determine how linguistic processing and perceptual simulations are involved in cognition. From this perspective it is relevant whether or not linguistic processes complement perceptual processes in language comprehension. If they do not, a number of questions need an answer. For instance, why has language evolved to encode perceptual relations, even though comprehenders do not rely on such cues (Christiansen and Chater, 2008)? Why does the effect of a linguistic factor on processing change as a function of the conceptual task, if the linguistic variable itself remains constant (Louwerse and Jeuniaux, 2010)? And how can the results of numerous experiments be explained if verbal processes are not considered (Paivio, 1986)?
Finally, the question should be raised how linguistic processing can lead to comprehension (Van Dantzig and Pecher, 2011). After all, the sentence "whales swim in the ocean" makes sense because we perceptually simulate the words by 'seeing', 'hearing' and 'feeling' whales and oceans. My argument is not that linguistic processing can succeed without any grounding of the words of a sentence to perceptual information. Instead, the argument is that comprehension could emerge without continuous grounding of every word in the sentence. In other words, grounding of a minimal number of words allows comprehenders to bootstrap meaning through the statistical linguistic patterns. To make this more concrete: if a comprehender has heard the sentence "the Ambulocetus natans is like a whale", the grounding of whale distributes meaning about whales to Ambulocetus natans. With some probability a comprehender might now place Ambulocetus natans in the ocean, even though Ambulocetus natans might well be flying in the sky. Louwerse and Connell (2011) showed that this distributed activation through statistical linguistic frequencies is a good explanation for early conceptual processing. Even though it is not entirely clear to the comprehender what an Ambulocetus natans is, the information that it has similarities with a whale can help generate good-enough representations. Perceptual information about Ambulocetus natans, obtained by grounding the word to its referent, then gives more detailed information (for instance, ruling out the possibility that it can fly). The idea that language encodes perceptual information and that language users utilize these linguistic cues in their comprehension processes is central to the Symbol Interdependency Theory (Louwerse, 2011). The evidence supporting this theory replaces the question whether conceptual processing is perceptual with the more relevant question whether conceptual processing is dominated more by perceptual simulation or by statistical linguistic frequencies, because conceptual processing is both linguistic and perceptual.
In conclusion, Pecher et al. (2010) investigated the interaction between words denoting references in the sky or in the ocean (e.g., falcon, dolphin), their positions on the screen (top, bottom), judgment task ("can this item be found in the sky?", "can this item be found in the ocean?") and polarity response (yes answer left, yes answer right). Of all the main effects and interactions found, the interaction between words denoting references in the sky or in the ocean (e.g., falcon, dolphin) and the judgment task ("can this item be found in the sky?", "can this item be found in the ocean?") had by far the largest effect size. A statistical linguistic frequency analysis showed patterns similar to those predicted by a perceptual simulation account, providing support for the conclusion that conceptual processing is (still) linguistic and perceptual.
I agree with Lakens (2011) that the time is ripe to consider explanations that go beyond a strict perceptual simulation account. Even though this was not the a priori research question Pecher et al. (2010) aimed to answer, their findings suggest that both linguistic and perceptual factors should be considered in conceptual processing tasks. Indeed, paraphrasing Lakens (2011), a fruitful approach to cognition examines when meaning emerges from linguistic and perceptual processes.
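As an illustration only, the word-pair frequency analysis described above (log frequency of pairs such as sky-dolphin and ocean-dolphin within 3-5 word grams) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the procedure actually used in the study: the function names, the in-memory n-gram format, the log(1 + count) smoothing, the reading of "order frequency" as the first word preceding the second, and the toy counts are all hypothetical and are not taken from the Web 1T 5-gram corpus.

```python
import math
from collections import defaultdict


def pair_log_frequencies(ngram_counts, pairs, min_n=3, max_n=5):
    """Return log(1 + summed count) for each ordered (first, second) pair.

    Assumption: a pair is counted when `first` occurs before `second`
    inside an n-gram whose length lies between min_n and max_n.
    """
    wanted = set(pairs)
    totals = defaultdict(int)
    for tokens, count in ngram_counts:
        if not (min_n <= len(tokens) <= max_n):
            continue  # skip n-grams outside the 3-5 word window
        seen = set()  # count each ordered pair at most once per n-gram
        for i, first in enumerate(tokens):
            for second in tokens[i + 1:]:
                if (first, second) in wanted:
                    seen.add((first, second))
        for pair in seen:
            totals[pair] += count
    return {pair: math.log(1 + totals[pair]) for pair in pairs}


def interaction_contrast(log_freq, sky_items, ocean_items):
    """Congruent minus incongruent mean log frequency (2 x 2 contrast)."""
    def mean(context, items):
        return sum(log_freq[(context, w)] for w in items) / len(items)

    congruent = mean("sky", sky_items) + mean("ocean", ocean_items)
    incongruent = mean("sky", ocean_items) + mean("ocean", sky_items)
    return congruent - incongruent


# Toy counts, invented purely for illustration (not corpus data).
toy_ngrams = [
    (("sky", "above", "the", "falcon"), 40),
    (("ocean", "where", "the", "dolphin", "swims"), 90),
    (("sky", "and", "a", "dolphin"), 3),
    (("ocean", "below", "the", "falcon"), 2),
    (("the", "falcon"), 500),  # bigram: ignored, outside the window
]

pairs = [(c, w) for c in ("sky", "ocean") for w in ("falcon", "dolphin")]
log_freq = pair_log_frequencies(toy_ngrams, pairs)
contrast = interaction_contrast(log_freq, ["falcon"], ["dolphin"])
```

A positive contrast in this sketch means that congruent pairs (sky-falcon, ocean-dolphin) are more frequent than incongruent ones, which is the direction of the interaction discussed in the text. A real analysis would stream the corpus files and test the interaction with an ANOVA over all 160 pairs rather than a single contrast.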