On the need for Embodied and Dis-Embodied Cognition

Dove, Guy

doi:10.3389/fpsyg.2010.00242

HYPOTHESIS AND THEORY article

Front. Psychol., 25 January 2011

Sec. Cognition

Volume 1 - 2010 | https://doi.org/10.3389/fpsyg.2010.00242

This article is part of the Research Topic: Embodied and Grounded Cognition

Guy Dove*

Department of Philosophy, University of Louisville, Louisville, KY, USA

This essay proposes and defends a pluralistic theory of conceptual embodiment. Our concepts are represented in at least two ways: (i) through sensorimotor simulations of our interactions with objects and events and (ii) through sensorimotor simulations of natural language processing. Linguistic representations are “dis-embodied” in the sense that they are dynamic and multimodal but, in contrast to other forms of embodied cognition, do not inherit semantic content from this embodiment. The capacity to store information in the associations and inferential relationships among linguistic representations extends our cognitive reach and provides an explanation of our ability to abstract and generalize. This theory is supported by a number of empirical considerations, including the large body of evidence from cognitive neuroscience and neuropsychology supporting a multiple semantic code explanation of imageability effects.

Introduction

In this essay, I propose and defend a new take on a familiar idea. The familiar idea is that our concepts are encoded in at least two general types of semantic representations: one type that is perception and motor based and another that is language based (Paivio, 1971, 1986). Although most concepts employ both types of representations, abstract concepts tend to depend more on linguistic representations than concrete concepts do. What separates my version of this idea from most previous ones is that I develop it within an embodied approach to cognition (although see Barsalou et al., 2008; Louwerse and Jeuniaux, 2008 for related yet distinct views).

My defense of this new take has three parts. The first part outlines and motivates an embodied approach to concepts based on simulation. The second part examines a challenge that faces any form of embodied cognition: the problem of abstraction. After making the observation that the symbolic structure of language is well suited to solving this problem, I propose that language should be seen as a form of what I refer to as “dis-embodied” cognition. What I mean by this is that linguistic representations are embodied in the neurophysiological sense that they rely on sensorimotor simulation but, unlike other embodied forms of cognition, they do not inherit semantic content from this fact. They do, however, accrue semantic content through their associations and inferential relationships with other linguistic representations. The third part surveys empirical evidence that supports the existence of separate semantic codes.

Embodied Concepts

Historically, cognitive scientists have presumed that higher cognitive processes are carried out by computations involving amodal mental representations (i.e., representations that are not located within a sensorimotor modality). The precise nature of these representations was a matter of some debate. For instance, a great deal of controversy has surrounded the issue of how language-like they might be (Fodor, 1975). The presumption of amodality, however, went largely unquestioned. The strength of this presumption was clearly demonstrated by the heated nature of the debate concerning the possibility that analog perceptual representations might be employed in mental imagery tasks (Pylyshyn, 1973, 1981; Kosslyn and Shwartz, 1977). Now, there is general agreement that behavioral and neural evidence suggests that mental imagery (Kosslyn, 1994) and motor imagery (Jeannerod, 1995; Grèzes and Decety, 2001) depend on sensory and motor representations respectively.

Within the last two decades, a growing number of researchers and philosophers have argued that cognitive science needs to reorient itself with respect to its fundamental assumptions about the nature of mind and cognition. These researchers and philosophers contend that cognitive processes need to be viewed as fundamentally based in our bodily interactions with the world. Clark (1998, p. 506) expresses this view clearly in his economical assertion that, “Biological brains are first and foremost the control systems for biological bodies.” The idea is that we cannot hope to understand the functioning of the brain without appreciating the central role it plays in guiding perception and action. This view has led to a robust and diverse research program in which investigators examine the possible ways in which thinking, remembering, and understanding language are shaped by the fact that we dynamically interact with our complex physical and social environment by means of perceptual and motor capacities (Wilson, 2002). Embodied theories of cognition often suggest that concepts are understood via sensorimotor simulations. Neural systems that are involved in understanding real objects, actions, and events in the world are used to internally simulate those objects, actions, and events at later points in time.

The Theoretical Promise of Embodied Concepts

Within cognitive science, the orthodox approach to concepts views them as containing amodal representations. This approach posits mental symbols that are manipulated solely based on their syntactic properties. By assumption, there is no intrinsic connection between these symbols and what they represent. This approach faces a well-known challenge: the symbol grounding problem. Harnad (1990, p. 335) summarizes this problem with the question, “How can the meanings of the meaningless symbol tokens, manipulated solely on the basis of their (arbitrary) shapes, be grounded in anything but other meaningless symbols?” Perhaps the easiest way to think of this problem is to imagine trying to learn a foreign language from a dictionary in that language. Each word would be defined in terms of its connections to other words. In order to avoid this problem, the meaning of at least some mental symbols must be grounded in something other than their syntactic properties.

A key impetus for the hypothesis that concepts are couched in sensorimotor representations is the belief that this will help with the symbol grounding problem. In order to see just how it might help, we need to have a clear conception of what an embodied account of concepts might look like. For that reason, I am going to briefly sketch what I take to be the strongest and most developed embodied account of concepts: perceptual symbol systems theory (Barsalou, 1999). I should emphasize, though, that many of the points made in this essay extend beyond this particular theory and do not depend on its ultimate success. A core tenet of perceptual symbol systems theory is that sensorimotor simulations of experience are of central importance to our concepts. Intuitively, the idea is that our conceptualization of a category consists of simulating the experience of perceiving and/or acting on exemplars of that category. Such simulations are the result of a kind of neurophysiological re-enactment: information concerning the neural activation patterns associated with perception or action, which has been captured and stored by conjunctive neurons in neighboring association areas or convergence zones (Damasio and Damasio, 1994), is used later in the absence of relevant input to generate a partial reactivation of the sensorimotor representations.

Perceptual symbols have a number of properties that make them well suited to serve as conceptual representations (Barsalou, 1999, 2003). First, simulations need not be conscious – that is, they may contain unconscious perceptual representations (for evidence to this effect see Pecher et al., 2009). This property removes some of the traditional objections to imagistic theories of cognition that turn on the unreliability or vagueness of introspection. Second, simulations will often be schematic in the sense that they contain only some of the sensorimotor representations involved in the experience being simulated. For instance, a simulation in the visual modality of the concept DOG might involve shape representations but not color representations. Third, they will typically be multi-modal in the sense that they involve the reactivation of perceptual representations in several modalities. Fourth, perceptual symbols provide a novel means of drawing the type/token distinction (Barsalou, 1999, 2003). This is achieved through distinguishing simulators and simulations. A simulator is a distributed system spanning association and sensorimotor areas. To possess a concept, such as DOG, is to have a skill or ability to generate appropriate perceptual representations of dogs in a given situation. An innovative aspect of Barsalou’s account is that it holds that these simulations are context-sensitive: simulations for a given concept vary depending on the context and the speaker’s goals. For example, they might represent objects from a particular perspective. Typically, simulations will involve only a small subset of the information stored in memory.

Although I believe that simulation-based accounts of embodiment have the most empirical promise, I should acknowledge that there are other theoretical conceptions of embodiment. Borghi (2005) identifies two distinct strains of embodied cognition – one that focuses on affordances and situated action and the other that focuses on simulation – and argues that both are true depending on the context. I am going to limit myself to the simulation framework here for a couple of reasons. The first is that I believe that this framework is more flexible than critics assume. An unfortunate consequence of Barsalou’s use of the term perceptual symbol is that it gives the false impression that simulations are based in perception and not in action mechanisms. However, nothing in the theory prevents purely motor-based simulations. Perceptual symbols are thus compatible with, for example, action schemas (Glenberg and Robertson, 1999). The second reason is that I am committed to a representational approach to concepts. One of the issues that separate different views of embodiment is the status of representations. Many proponents of affordances and situated action embrace non-representational accounts of cognition – often appealing to the promise of dynamical systems theory. Based largely on this issue, Clark (1997) distinguishes between embodied cognitive science and radical embodied cognitive science. Clark (1997, 2008) defends the former while theorists such as Chemero (2009) defend the latter. Siding with Clark, I assume that the notion of representation is too useful to give up and, furthermore, that an empirically successful theory of concepts will involve an appeal to representations (Markman and Dietrich, 2000).

A perceptual symbol consists of a neurophysiological re-enactment of a collection of sensorimotor representations. It can be thought of as having perceptual content because there are certain states of affairs in the world that would be likely to elicit these representations under normal conditions. Barsalou argues that this perceptual content can provide a leg up with regard to intentional content. He writes (Barsalou, 1999, p. 597; emphasis in the original):

Where perceptual symbols do have an advantage [over amodal symbols] is in the ability of their content to play a heuristic role in establishing reference. Although perceptual content is rarely definitive for intentionality, it may provide a major source of constraint and assistance in determining what a symbol is about.

The general idea is that perceptual symbols help us refer to objects and events because they are already causally connected with those objects and events. This causal connection does not fully determine the conceptual content of a perceptual symbol but it can help secure that content.

Although embodied cognition has promise with respect to helping with the symbol grounding problem, it seems too early to declare victory for two reasons. The first is that it is not clear that the problem has been fully solved (Taddeo and Floridi, 2005). The second is that other approaches may have the conceptual resources to address the problem. Instead of proclaiming that embodiment solves this longstanding problem, I am going to make a weaker and hopefully less controversial claim: the heuristic role identified by Barsalou is an attractive design feature of perceptual symbols. A conceptual system containing perceptual symbols can benefit from the role that sensorimotor representations play in guiding action and perception. To be more precise, I am going to claim that this design feature is more beneficial with some concepts than it is with others.

Empirical Evidence

There is little question that embodied cognition has been a productive research program. New research seems to emerge daily. Due to this abundance, I am only going to offer a selective review. My purpose is not to be comprehensive but, instead, to provide general motivation for an embodied approach to concepts.

A number of behavioral experiments support the notion that perceptual representations are central to some cognitive tasks. For instance, Pecher et al. (2003) found a modality-switching cost in a linguistic task. Participants verified verbally expressed facts involving one modality, such as the fact that leaves rustle, more rapidly after verifying a fact involving the same modality, such as the fact that blenders make noise, than after verifying a fact involving a different modality, such as the fact that cranberries are tart. More recently, van Dantzig et al. (2008) found a similar modality-switching cost between a perceptual detection task and a property verification task. Stanfield and Zwaan (2001) asked participants to affirm whether or not pictures depicted the actions described in previously presented sentences. The actions had either a vertical or horizontal orientation (such as driving a nail into a floor or into a wall). Participants responded more quickly to the pictures that had the same orientation as the action described. Stanfield and Zwaan (2001) suggest that the subjects generated a perceptual image of the action described in the sentence and then used this image to carry out the affirmation task.

Other behavioral studies demonstrate the degree to which cognitive tasks can be interwoven with action. For instance, Borghi et al. (2004) found a compatibility effect associated with language processing and action. Participants were instructed to decide whether or not a word that followed a sentence named a part of the object mentioned in the sentence. Half of the selected parts were found in the upper portion of the object and half were found in the lower portion. The experimenters found that responses were faster when the direction of the key press movement (upward or downward) matched the part location (upper or lower). Further studies indicate that the motor representations elicited by the cognitive tasks can exhibit somatotopic specificity. For instance, Scorolli and Borghi (2007) asked their participants to judge whether simple sentences containing a verb and a noun were sensible and to respond either by pressing a pedal or speaking into a microphone. The verbs in the sentences referred to actions that were typically performed with the mouth, the hands, or the feet. Response times with the microphone were fastest with “mouth-sentences” and response times with the pedal were fastest with “foot-sentences” (see also Scorolli et al., 2009).

Researchers have produced evidence using transcranial magnetic stimulation (TMS) that provides compelling support for the behavioral findings. Pulvermüller et al. (2005) carried out a TMS study in which they found that stimulation over motor areas affects action word processing. They weakly stimulated different parts of the motor system while participants performed a lexical decision task on arm- and leg-related action words. Weak stimulation of left hemisphere areas associated with arm-movement led to an increased response time with arm-related words in comparison with leg-related words, and the reverse pattern occurred with weak stimulation of motor areas associated with leg-movement. Response times were not modulated in a control condition with a sham stimulation. Using a different experimental paradigm, Buccino et al. (2005) found that listening to action-related sentences modulated activity in the motor system. Motor evoked potentials (MEPs) recorded from hand and foot muscles were specifically modulated by hand-related and foot-related action sentences respectively.

More support is provided by the fact that lesions can lead to the loss of multiple categories that share perceptual properties (Simmons and Barsalou, 2003). For instance, Adolphs et al. (2000) found that damage to the somatosensory cortex was correlated with deficits in the visual recognition of facial expressions. The authors propose that simulation of producing facial expressions is involved in the recognition of facial expressions in others. A selective deficit in action word processing has been found in patients with motor neuron disease (Bak et al., 2001). A word of caution is needed, though, because modality-specific damage does not explain the category-specific deficits of all patients (Caramazza and Mahon, 2006).

A body of brain imaging data supports an embodied approach to concepts. Martin et al. (1996), for example, found increased activation in visual areas with categories that appear to rely heavily on visual information for identification. Simmons et al. (2007) found evidence of a common neural substrate for color perception and verification of object-associated color (e.g., taxi-yellow). Using a visual naming task, Chao and Martin (2000) found increased activity in motor areas with highly manipulable objects when compared to less manipulable objects. Hauk et al. (2004) had participants read individual words that referred to actions involving leg, arm, and head movements, such as kick, pick, and lick. They found that reading each type of action word produced increased activation in the regions of M1 associated with performing the relevant movements. In a task where participants listened to action-related sentences, Tettamanti et al. (2005) found increased activation in effector-specific premotor and motor areas.

In sum, a number of studies using different experimental paradigms and techniques implicate sensorimotor representations in various cognitive tasks. Positing perceptual symbols provides an economical and robust explanation for a diverse set of observed phenomena, including reaction times, the functional character of some neuropathologies, and neural activation patterns in response to certain cognitive tasks.

Challenges to the Evidence

Aside from the problem of abstraction, which will be discussed in the next section, the inference to embodied cognition from the available evidence faces two major challenges. The first concerns how the debate is framed. Machery (2007, 2010) argues that amodal theories are not monolithic, and there are conceivable amodal systems that would fit with the available evidence. In a similar vein, Mahon and Caramazza (2008) contend that the activity in sensorimotor areas observed in many experiments could be the result of spreading activation from amodal representations. The ability to offer amodal explanations for the available evidence undermines some of the hyperbolic rhetoric used by supporters of embodied cognition. Too often, such supporters claim that the empirical predictions of embodied and amodal approaches sharply diverge. What Machery and Mahon and Caramazza demonstrate is that the empirical decision between the embodied and amodal approaches may be more difficult than some have advertised. This point seems well taken; the issue will ultimately be decided by which approach is best supported by the evidence. The defeasible position of this paper is that the available evidence favors an embodied approach.

The second challenge is that the neuroimaging evidence does not exclude the presence of amodal representations. Indeed, many of the cited imaging experiments find modulation of activity in multiple brain areas. Several commentators (e.g., Weiskopf, 2007; Chatterjee, 2010; Machery, 2010) point out that a number of the neuroimaging studies cited in support of embodied cognition actually find modulated activity in brain areas that are near – but not identical to – areas used for perceptual and motor processing. This is a serious challenge to a philosophical position known as neo-empiricism (Prinz, 2002). A core tenet of this position is that all conceptual representations are modality-specific (Machery, 2010). Against this universal claim, evidence suggesting that some conceptual representations are located outside the areas used for perceptual and motor processing is damning. It is not clear, though, that such evidence undermines a simulation-based embodied approach.

On some level, the distributed activation patterns found in the literature fit with the theory of perceptual symbols. Barsalou (2003) proposes that long-term memory integration processes underlie the ability to create appropriate simulations. Such processes are needed to explain our ability to generalize and abstract away from particular exemplars and generate the right simulations on a given occasion. This move offloads significant aspects of conceptualization into non-perceptual association areas or convergence zones (Damasio and Damasio, 1994). It also raises the question of whether or not these areas contain amodal symbols. Barsalou et al. (2003, p. 87) concede that “…conjunctive neurons in convergence zones constitute a somewhat amodal mechanism for capturing and re-enacting modality-specific states” but then go on to point out that alternative explanations of the activity of these neurons are available that do not require amodal symbols. They then suggest that we should pragmatically assume that convergence zones do not contain amodal symbols until evidence suggests otherwise.

This is not a satisfying solution to the challenge posed by activation in convergence zones because it is provisional and ad hoc. Fortunately, there is a better way to meet this challenge: we can adopt a more liberal definition of an embodied concept. The fundamental intuition behind the embodied approach is that cognition is fundamentally integrated with perceptual and motor systems. Such integration does not in and of itself exclude supramodal or even amodal representations as long as the function of these representations is to engage appropriate simulations and not to act as independent conceptual representations. I would even go further and suggest that the very modal/amodal distinction fits poorly with an integrated embodied perspective because it presupposes a clean distinction between cognition and perception. From an embodied perspective, no such clean distinction exists. If I am right, then evidence of relevant neural activity in areas near to, but not directly associated with, a particular sensorimotor modality is not unequivocally incompatible with an embodied approach.

The Problem of Abstraction

A well-known limitation of the evidence for embodied concepts is that it primarily involves concrete or highly imageable concepts (Pezzulo and Castelfranchi, 2007; Louwerse and Jeuniaux, 2008; Dove, 2009). This is problematic because, although it is not difficult to imagine how embodiment might help us acquire concrete concepts, it is difficult to see how it can be anything but a hindrance with abstract concepts such as DEMOCRACY, ELECTRON, ENTROPY, JUSTICE, NUMBER, PATIENCE, and TRUTH. Representations grounded in sensorimotor systems do not seem to be well suited to representing abstract intentional contents. For this reason, abstract concepts remain a critical issue for embodied cognition. More is at stake than simply the reach of this approach. For instance, Mahon and Caramazza (2008, p. 60) use the challenge posed by abstract concepts to mount a parsimony argument for an amodal approach to concepts:

Given that an embodied theory of cognition would have to admit ‘disembodied’ cognitive processes in order to account for the representation of abstract concepts, why have a special theory just for concepts of concrete objects and actions?

While I am not convinced that such parsimony arguments have much force (the history of psychology is rich with highly economical failed theories), the core premise of this argument – i.e., that abstract concepts require disembodied cognition – needs to be examined.

Three Embodied Approaches to Abstract Concepts

Supporters of embodied concepts have begun to address the problem of abstraction. Three main approaches exist in the literature (for a review see Glenberg et al., 2008). Although each approach has some empirical support, there are reasons to believe that these approaches do not provide a full solution to the problem of abstraction.

The first and best-established approach involves metaphoric extension. This approach originally emerged from work in cognitive linguistics (Lakoff and Johnson, 1980; Lakoff, 1987). The core idea is that we often understand one conceptual domain metaphorically in terms of another. Often, these metaphors are shaped by image schemas formed from our bodily interactions, linguistic experience, and historical context. For instance, the concept of ARGUMENT may be understood in terms of the concept of WAR. The primary evidence for this approach is our use of linguistic metaphors. Some recent behavioral studies, though, provide evidence of the metaphorical use of space to represent abstract concepts. For instance, Boroditsky and colleagues (Boroditsky and Ramscar, 2002; Casasanto and Boroditsky, 2008) provide evidence that some temporal judgments rely on spatial representations. Richardson et al. (2003) attempted to ascertain whether or not comprehending abstract verbs, such as argue and respect, automatically activates spatial image schemas with a specific orientation (horizontal for argue and vertical for respect). Participants listened to short sentences while engaged in either a visual discrimination task or a picture memory task. Reaction times suggest that there was an interaction between the horizontal/vertical orientation of the image schema and the horizontal/vertical orientation of the visual stimuli.

The second approach is similar in spirit to the first but focuses on the importance of action schemas (Glenberg and Robertson, 1999). The core idea of this approach is that some abstract language is grounded in motor processes. A primary source of evidence is the action–sentence compatibility effect or ACE (Glenberg and Kaschak, 2002). Glenberg and Kaschak found that reaction times decreased when response direction (a button press either away/toward the body) and the implied direction of either concrete action sentences (e.g., Andy gave you the pizza/You gave Andy the pizza) or abstract transfer sentences (e.g., Liz told you a story/You told Liz a story) matched. They suggest that the ACE is the result of competition for resources by the motor planning associated with the action and the language processing associated with the sentence. Adding to the behavioral research, Glenberg et al. (2008) recently provided neurophysiological evidence that comprehension of both object-transfer and abstract-transfer sentences modulates motor system activity.

The third approach proposes that, contrary to our intuitions, some abstract concepts involve situated simulations (Barsalou, 1999). This approach is supported by evidence from feature generation experiments. In a preliminary study, Barsalou and Wiemer-Hastings (2005) asked participants to generate typical properties for three abstract concepts (TRUTH, FREEDOM, and INVENTION), three concrete concepts (BIRD, CAR, and SOFA) and three intermediate concepts (COOKING, FARMING, and CARPETING). The authors report two core findings: that participants generated situational properties with both concrete and abstract concepts and that participants tended to generate more event and introspective properties with abstract concepts. They propose that abstract and concrete concepts are generally associated with different aspects of situations: abstract concepts tend to focus on social aspects while concrete concepts tend to focus on physical entities and actions. In a more fully realized experiment employing similar methodology, Wiemer-Hastings and Xu (2005) found that participants tended to produce fewer entity properties, more introspective properties, and more relational properties with abstract concepts than with concrete concepts.

How promising are these approaches? Let us consider each in turn. There are a number of reasons to be skeptical of the metaphorical projection solution to the problem of abstraction. First, there are reasons to question the force of the linguistic evidence supporting this approach. It is just not clear that such linguistic patterns directly reflect conceptual structure. Indeed, alternative explanations of metaphors that do not require positing metaphoric representations are available (Murphy, 1997). Another problem is that this proposal seems developmentally implausible (Murphy, 1996). For example, it seems unlikely that an understanding of the complexities of war is required for the acquisition of the concept of an argument. Furthermore, evidence suggests that children’s understanding of metaphor remains quite poor before the ages of 8–10 (Winner et al., 1976). Finally, there is an inherent difficulty faced by the attempt to capture conceptual content in terms of metaphor: while a metaphor enables us to highlight the similarities between two concepts, it cannot capture the important differences. Arguments, after all, are not really wars. Recognizing the appropriate connections between a perceptual experience and what it is being metaphorically extended to cover seems to require a prior understanding of the concept. Without such an understanding, it is difficult to see how one can arrive at a correct interpretation of a metaphor. The very ubiquity of spatial metaphor undermines its potential for representing a specific abstract concept such as RESPECT. This ubiquity raises the question of whether a non-metaphoric understanding of the target concept is needed to anchor these metaphoric uses.

Although the action schema approach is similar in spirit to the metaphorical projection approach, it enjoys some advantages. First, the evidence offered in support of this approach seems more substantial and less equivocal. Second, the developmental picture behind this approach seems more plausible. It fits with the developmental evidence suggesting that concrete or highly imageable event words are easier for young children to acquire than abstract ones (Maguire and Dove, 2008). Despite these advantages, the action schema approach faces some of the same challenges as the metaphorical projection approach. For instance, the apparent representational flexibility of action schemas raises the question of how it is possible to acquire the relevant abstract concepts. If the same action schema underlies various concepts, how are the differences between these concepts represented? Another problem is that it is difficult to imagine how action schemas can account for all abstract concepts. For instance, it is not clear how they might handle concepts such as ELECTRON, NUMBER, and TRUTH.

Finally, consider the situated simulation approach. The body of evidence cited in support of it is admittedly quite thin. More importantly, this evidence may not resolve the issue of the embodiment of conceptual representations. A supporter of amodal symbols could well argue that disembodied symbols are needed to account for our ability to represent the social and relational aspects of situations. In the end, the most serious problem facing the situated simulations proposal is that a particular abstract concept such as DEMOCRACY is not likely to be associated with a simple set of sensorimotor experiences (Dove, 2009).

In sum, current attempts to offer an embodied solution to the problem of abstraction appear to suffer from two weaknesses: insufficiency and incompleteness. The approaches appear to be insufficient because they do not provide a full explanation of the concepts to which they apply. They appear incomplete because they do not seem to capture all abstract concepts. This is not to say that these proposals have no merit. Instead, I suggest that each has some promise and empirical support, but, ultimately, more is needed to explain our ability to abstract and generalize.

Dis-Embodiment

Supporters of an embodied approach to concepts tend to treat the problem of abstraction as a collection of exceptions. The task then becomes to explain a subset of these exceptions using the theoretical techniques and experimental designs of the research program of embodied cognition. This effort ignores the fact that abstraction represents a general problem for embodied concepts. What we need to explain is our ability to go beyond embodied experience. Earlier we emphasized how grounding our concepts in action and perception systems may help us acquire conceptual content. Now, we need to acknowledge that such grounding has potential costs associated with it. In particular, sensorimotor simulations seem ill-suited for representing conceptual content that is not closely tied to particular experiences. The problem is that some concepts appear to require what we might call ungrounded representations.

The orthodox position within cognitive science, clearly expressed in the quote from Mahon and Caramazza given above, is that such “disembodied” concepts require amodal representations. If we look at the general features of the proposed embodied solutions to the problem of abstraction – particularly the metaphorical projection and action schema approaches – a different theoretical possibility emerges. Each of these approaches proposes ways in which embodied representations associated with a certain experiential/cognitive domain can be used to refer to objects and events outside of that domain. To capture this idea, I am going to coin a new term: dis-embodiment. A mental symbol is dis-embodied if (1) it is embodied but (2) this embodiment is arbitrarily related to its semantic content. In other words, a mental symbol is dis-embodied if it involves sensorimotor simulations of experiences that are not associated with its semantic content. The dash in the middle of this term is intended to distinguish this notion from the more general notion of disembodiment to which Mahon and Caramazza appeal. What I want to suggest is that the proposals outlined above are on the right track, but they fail to provide a general solution to the problem of abstraction. Below, I argue that natural language itself serves as a form of dis-embodied cognition and plays an extensive role in enabling us to acquire and use abstract concepts.

Language as a Form of Dis-Embodied Cognition

One way to approach the problem of abstraction is to scrutinize the abstract/concrete distinction (Scorolli, 2009). A number of researchers suggest that there are qualitative differences between abstract and concrete concepts. For example, Barr and Caplan (1987) propose that a meaningful distinction can be drawn between categories that are primarily represented by “extrinsic” features (those associated with relations between two or more entities) and those that are represented by “intrinsic” features (those associated with individual entities). Based on property generation studies, Wiemer-Hastings and Xu (2005) propose a two-factor account in which abstract concepts are both less contextually specific and predominately associated with social aspects of situations. Crutch and Warrington (2005) propose a qualitative distinction in which concrete concepts are organized primarily around similarity and abstract concepts are organized around semantic association. A recent eye-tracking experiment suggests that these representational differences emerge during on-line word-recognition (Duñabeitia et al., 2009). Participants were presented with visual displays that included a target picture of an item that was a semantic associate of an abstract or concrete word. Their eye-movements were recorded as they listened to the relevant words. They tended to fixate more (and earlier) on depicted objects that were associates of abstract words than associates of concrete words. Overall, evidence of a qualitative distinction between abstract and concrete concepts is growing. What is the source of this distinction? I propose that it arises from an asymmetry between the types of representations employed by abstract and concrete concepts. While concrete concepts generally depend on both linguistic and non-linguistic perceptual symbols, abstract concepts tend to rely primarily on linguistic perceptual symbols.

Natural language has a number of design features commonly associated with amodal symbol systems that make it well suited to representing abstract concepts. Indeed, natural language is often held up as a paradigmatic example of an amodal symbol system. Three design features are particularly important. The first is the inherent representational arbitrariness of words and morphemes. There is, for example, no intrinsic similarity or other extralinguistic connection of the English word cat to the category of cats. Indeed, other languages associate phonetically and graphemically different words with the same category. Furthermore, the phonemic similarity of cat to cap carries no weight with respect to the contents of these words. The second is its stimulus-independence (Chomsky, 1966). Competent speakers are able to produce linguistic utterances in a self-generated fashion that is not an immediate response to proximal environmental stimulation. The third is its systematicity (Fodor, 1975; Fodor and Pylyshyn, 1988; Pinker, 1994). The ability to produce a sentence such as Joni loves Chachi seems to come hand in hand with the ability to form other sentences such as Chachi loves Joni and Jenny loves Chachi, etc. A common explanation of these design features is that natural language amounts to a syntactically recombinable symbol system. While there are disagreements concerning the cognitive architecture that underlies our linguistic competence, a large body of linguistic research suggests that the morphosyntactic structure of language is at least characterizable in terms of a productive grammar.

Now the mere fact that natural language is stimulus-independent and systematic does not sufficiently distinguish it from garden variety perceptual symbols. One of the achievements of perceptual symbol theory is that it demonstrates how a simulation-based symbol system might have these properties (Barsalou and Prinz, 1997). Stimulus-independence and systematicity alone cannot establish an advantage of verbal over non-verbal representations with respect to abstract contents. Natural language must bring something else to the table. In a philosophical exploration of possible conceptions of animal and human cognition, Camp (2009) suggests that we should view stimulus-independence and recombinability as degree properties. She then argues that natural language enhances these features in at least four ways. First, natural language is likely to increase the range of thoughts that any one individual may entertain because it enables one to hear the thoughts of others. Second, natural language makes it easier to reproduce the same thought in different situations because of its lack of context-sensitivity. Third, the manifest syntactic structure of natural language highlights the potential recombinability of thoughts and thus encourages us to entertain a wider range of thoughts. Finally, natural language provides a sufficiently rich expressive medium to allow one to represent truth-values and inferential relations among thoughts. These enhancements mean that a creature with language is likely to enjoy a general cognitive advantage over a creature that does not.

A primary benefit afforded by a natural language is that it provides a representational system that can play the integrative role traditionally associated with amodal symbols. Consider the following argument for the necessity of amodal representations. After recognizing the existence of independent sensorimotor codes, Jackendoff (1992, p. 3) contends that amodal representations are necessary because “…none of these forms of input and output information suffices to explain the way that we understand the world in terms of objects, their motions, our actions on them, and so forth.” The general idea is that amodal representations are needed to capture generalizations about entities and events that go beyond the information contained within specific modalities. Amodal representations provide a means of gathering and integrating information from different modalities as well as transferring information between distinct sensorimotor codes. Because linguistic representations have the design features outlined above, they can also carry out this function (Carruthers, 2002).

I propose that when an individual acquires a natural language, she acquires a representational system that is different in some important respects from the multimodal, context-sensitive embodied symbol systems that exist independently of language. The acquisition of natural language, in other words, enhances and extends her representational abilities by giving her access to a context-free and arbitrary symbol system. This symbol system is independent of, and yet interacts with, other embodied symbols.

This proposal requires a revisionist conception of linguistic competence. Standard theories of linguistic competence are thoroughly amodal. Linguists have identified structural regularities at several levels of analysis, including phonology, morphology, syntax, and to some degree logical form or semantics. Knowledge relating to these levels is thought to be contained within language-specific functional modules (Fodor, 1983) and is generally thought to be couched in amodal codes. Comprehension involves translating perceptual information into these codes and production involves translating information in these codes into motor representations. The revisionist approach taken in this essay is that the process of achieving competence in a specific natural language involves acquiring the ability to generate appropriate simulations of linguistic experience. To be successful, these simulations must comport with the structural regularities at the different levels of analysis. They will not, however, depend on knowledge contained within an amodal symbol system. Three points about this revisionist proposal are especially important. The first is that it is neutral with respect to the issue of the degree to which linguistic competence is innate or learned. This proposal has to do with the format of the representations associated with this competence and not how it is acquired. The second is that, despite superficial appearances, this is not an inner speech view. The claim is that linguistic competence is contained within a system for generating perceptual symbols. These symbols consist of neurophysiological simulations that can be partial, selective, and unconscious. The third important point is that there is no independent lexical semantic code. The core thesis of this paper is that concepts are couched in two types of simulation-based representations: those associated with non-linguistic experience of the world and those associated with experience of language. Because simulations are detailed and often complex, linguistic perceptual symbols may exhibit structure at the various levels of analysis (phonology, morphology, syntax, etc.).

Thinking in Words

Despite the clear differences between embodied and orthodox approaches to cognition, both adopt a similar view of the relationship between language and thought. Both see language as a medium of communication rather than a medium of thought. According to both, language expresses underlying thoughts that are encoded in some other semantic code. Within traditional cognitive science, this code is typically taken to be a language-like amodal symbol system (Fodor, 1975). Within embodied cognition, this code is thought to consist of embodied representations grounded in action and perception mechanisms. Glenberg et al. (2008, p. 4) offer the following summary of what researchers mean when they say that language is embodied:

Linguistic symbols are embodied to the extent that: (a) the meaning of the symbol (the interpretant) to the agent depends on activity in systems also used for perception, action, and emotion, and (b) reasoning about meaning, including combinatorial processes of sentence understanding requires use of those systems.

The idea is that linguistic symbols have meaning because they dynamically activate sensorimotor representations associated with interacting with the world. On this account, linguistic symbols are intermediaries that do not directly have meaning or participate in reasoning about meaning.

I suggest that language plays two roles in our cognitive lives. One role is to engage sensorimotor simulations of interacting with the world. In this role, language serves primarily as a medium of communication. A second role is to elicit and engage symbolically mediated associations and inferences. Our concepts are not merely couched in sensorimotor representations but also in linguistic representations (words, phrases, sentences). Conceptual content is captured in part by the relationships of linguistic representations with other linguistic representations. These relationships may be merely associative or they may be inferential. On this view, a concept such as DOG will not only be represented on a given occasion by multimodal simulations associated with interacting with dogs, but will also be represented in terms of related words, phrases, or sentences. This idea has a clear affinity with inferential role or conceptual role semantics (Harman, 1982; Block, 1986). This philosophical theory of mental content holds that the meaning of a concept is determined by its functional role within the cognitive life of an individual. My proposal is distinct from this theory because it adds the further requirement that the associative and inferential relationships be couched in language-based simulations.

One source of evidence for the view that internalized natural language can itself serve as a symbolic form of cognition is the effectiveness of statistical models that derive the meaning of words through statistical computations applied to large corpora of text (Louwerse and Jeuniaux, 2008). A prominent example of this type of model is Latent Semantic Analysis or LSA (Landauer and Dumais, 1997). The idea behind LSA is that the aggregate of all the linguistic contexts in which a given word does and does not appear constrains semantic-relatedness. LSA has shown some effectiveness with respect to modeling a variety of linguistic tasks (Landauer et al., 1998). For example, the performance of an LSA model on the vocabulary portion of the Test of English as a Foreign Language was comparable to that of a large sample of students from non-English speaking countries applying for college entrance in the United States (Landauer and Dumais, 1997). Even if we grant that this particular model is psychologically implausible, it demonstrates the potential of a language-based representational system.
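To make the idea behind LSA more concrete, the following is a minimal sketch in Python (using NumPy) of the general technique: a word-by-context count matrix is reduced by a truncated singular value decomposition, and words are compared by the cosine of their reduced vectors. It is not the model of Landauer and Dumais (1997). The six-sentence toy corpus, the choice of two dimensions, and the names `docs`, `term_vectors`, and `similarity` are all assumptions made for illustration, and the sketch uses raw counts where full-scale models are trained on large corpora and usually reweight the counts first.

```python
import numpy as np

# Hypothetical toy corpus: each string stands in for one linguistic context.
docs = [
    "dog barks pet",
    "cat purrs pet",
    "dog cat pet animal",
    "vote election democracy",
    "election democracy law",
    "vote law democracy",
]

# Build a word-by-context matrix of raw counts.
vocab = sorted({word for doc in docs for word in doc.split()})
index = {word: i for i, word in enumerate(vocab)}
counts = np.zeros((len(vocab), len(docs)))
for j, doc in enumerate(docs):
    for word in doc.split():
        counts[index[word], j] += 1

# Truncated SVD: keep only the k largest singular values (k = 2 is an
# arbitrary choice suited to this tiny corpus).
U, S, Vt = np.linalg.svd(counts, full_matrices=False)
k = 2
term_vectors = U[:, :k] * S[:k]  # one k-dimensional vector per word


def similarity(a, b):
    """Cosine similarity between two words in the reduced space."""
    va, vb = term_vectors[index[a]], term_vectors[index[b]]
    return float(va @ vb / (np.linalg.norm(va) * np.linalg.norm(vb)))


for pair in [("dog", "cat"), ("barks", "purrs"), ("dog", "vote")]:
    print(pair, round(similarity(*pair), 3))
```

In this toy corpus, "barks" and "purrs" never occur in the same context, yet they end up close together in the reduced space because they occur with related words. This is the sense in which the aggregate of a word's contexts, rather than direct co-occurrence alone, constrains semantic relatedness.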

Theoretical Influences

I propose that our concepts are encoded in at least two types of semantic representations: one type employing embodied sensorimotor representations associated with our experience of the world and the other type employing dis-embodied sensorimotor representations associated with our experience of language. Other types may exist. Gesture, for instance, might form an independent semantic representational system (Goldin-Meadow, 2003). This pluralistic embodied proposal has clear similarities with some previous theories. Highlighting the similarities – and the differences – between it and these theories should help clarify its central claims.

This proposal overlaps somewhat with another recent attempt to offer an embodied solution to the problem of abstraction. Borghi and Cimatti (2009) argue that supporters of embodied cognition have paid too little attention to the embodied social experience associated with language. They propose that there is a qualitative distinction to be made, not between two different mental processes, but rather between two different cognitive sources of grounding: one that depends crucially on direct sensorimotor experience and another that depends crucially on linguistic experience. Both of these sources can be useful in the acquisition of any concept but the acquisition of concrete concepts is likely to depend more on direct sensorimotor experience and the acquisition of abstract concepts is more likely to depend on linguistic experience. This distinction seems important and necessary. I suggest that it falls short, however, because it does not appropriately emphasize the importance of the computational properties of natural language. While I agree that linguistic experience is an important source of socially derived information about the world, I maintain that the structural properties of natural language contribute to its effectiveness in representing abstract concepts. My account differs from Borghi and Cimatti’s because it holds that the acquisition of language creates a new dis-embodied semantic system, one that has many of the properties usually associated with the amodal symbol systems favored by traditional cognitive science. In other words, natural language on my view is not merely another source of information about the world but is also another way of thinking about the world.

My core thesis is that language is an internalized amodal symbol system that is built on an embodied substrate. As such, it extends our cognitive reach and helps us overcome the problem of abstraction. This idea is inspired in part by Andy Clark’s view of language as a kind of cognitive scaffolding that provides cognitive benefits that would not otherwise be available to us. Clark (2008, p. 47) summarizes these benefits in the following passage:

The computational value of a public system of essentially context-free, arbitrary symbols, lies… in the way such a system can push, pull, tweak, cajole, and eventually cooperate with various non-arbitrary, modality-rich, context-sensitive forms of biologically basic encoding.

Clark’s claim is that natural language augments the cognitive abilities of an embodied mind. The core idea is that natural language is a cognitively useful symbol system, not because it mirrors the structure of our underlying thoughts, but because it does not. Clark makes much of the arbitrariness of linguistic symbols. Although the arbitrariness of the relationship between words and their semantic contents is well known, one might think that “forms of biologically basic encoding” are equally arbitrary. However, as we saw above in the context of the symbol grounding problem, there is a sense in which perceptual symbols are not arbitrary because they contain sensorimotor representations that enjoy a non-cognitive causal relationship with objects and events. Clark (2008) argues that language helps extend our cognitive abilities in at least three distinct but related ways: first, the very act of labeling objects and events provides a means of discovering increasingly abstract patterns in nature; second, the ability to recall and react to structured sentences enables us to acquire new skills and capacities; and third, our language abilities partially underwrite our ability to reflect on and influence the contents of our own thinking. Because he is primarily interested in simply establishing that language can in fact extend our cognitive abilities, Clark focuses on a collection of empirically based examples that seem to demonstrate cognitive extension. One of the most established of these is the apparent way in which verbal counting helps children acquire an understanding of positive integers (Dehaene, 1999; Carey and Sarnecka, 2006).

Where my account diverges from Clark’s is with respect to scope. I contend that the sort of scaffolding he discusses is not limited to specific concepts or cognitive domains. Instead, acquiring a natural language extends our abilities to acquire concepts across the board. This is not simply because it offers a means of accessing socially derived information but also because it offers new representational powers. I suggest that most concepts depend to some significant degree on information represented in internalized natural language.

Clark may or may not be sympathetic with this general point, but there is no indication that he connects this scaffolding effect to the qualitative distinction between abstract and concrete concepts.

This brings us to perhaps the single greatest influence on the theory outlined in this essay: Dual Coding Theory or DCT (Paivio, 1986). This theory posits two independent cognitive subsystems, one employing symbolic verbal representations and the other employing analog non-verbal representations. Sadoski and Paivio (2004, p. 1340) write:

A basic premise of DCT is that all mental representations retain some of the concrete qualities of the external experiences from which they derive. These experiences can be linguistic or non-linguistic. Their different characteristics develop into two separate mental systems, or codes, one specialized for representing and processing language (the verbal code) and one for processing non-linguistic objects and events (the non-verbal code).

The focus in DCT on the dynamic relationship between experience and mental representations seems to be in keeping with the basic tenets of embodied cognition. One might even reasonably see DCT as a precursor to the embodied cognition movement. However, an important aspect of DCT, i.e., its emphasis on language as an independent symbol system, has not generally been taken up by embodied cognition. To a certain degree, my account can be seen as an attempt to recapture an important insight from DCT within an embodied framework. It is important, however, to recognize that the result of this effort is not simply a recapitulation of DCT. There are some important differences between the account developed here and DCT. First, DCT claims that mental images are the basic constituents of the verbal and non-verbal systems. My account views perceptual symbols as the basic units. This is significant because perceptual symbol system theory represents an explicit attempt to avoid the weaknesses associated with image-based theories of concepts. Perceptual symbols differ from mental images in a number of important ways: for instance, they need not be conscious, they can be schematic, and they are often multimodal. Second, DCT and my theory differ with respect to the nature of the mental representations associated with language. According to DCT, they are a special class of mental images that are made up from different basic elements (logogens) than the basic elements of non-verbal representations (imagens). On my account, all conceptual representations consist of perceptual symbols. Linguistic representations are distinguished from non-linguistic ones by the fact that they are an internalization of an external symbol system.

In the end, the view advocated in this essay brings together ideas from a number of different theories and combines them in a novel way. While it clearly owes a debt to these previous views, it stands or falls on its own.

Empirical Evidence

We began the last section with the acknowledgment of the seriousness of the problem of abstraction. We now have a theoretical picture of how language might help explain our ability to abstract: language might extend our cognitive abilities in a way that gives us some of the benefits of an amodal symbol system. This theoretical picture rests on two independent hypotheses: (1) that language processing involves sensorimotor simulation and (2) that linguistic representations play an important role in our ability to abstract and generalize.

Language Processing Involves Perceptual Symbols

Given the dynamic nature of linguistic communication, the idea that language processing involves perceptual symbols seems attractive. After all, most linguistic communication is time-constrained and would seem to require the integration of action, perception, and cognition. Below, I survey some of the evidence favoring this hypothesis.

The first reason to think language processing might involve sensorimotor simulations is a negative one: the project to locate self-contained language areas of the brain has not succeeded. Ever since the work of Broca and Wernicke in the late nineteenth century (Finger, 1994), the classical localizationist position has been that subcomponents of language are represented and processed in bounded and specialized cortical regions (Geschwind, 1970). One of the primary sources of evidence for this perspective has been the study of aphasic syndromes resulting from focal brain injuries (for a review see Saffran, 2000). Researchers, however, have begun to move away from strict localization and toward the view that language requires the activity of a number of spatially distinct brain regions. This shift has occurred in response to several forms of evidence. For one, neuroimaging studies indicate that widely distributed brain areas are active in language processing (Posner and Raichle, 1994). Another reason for this shift is the fact that the association of grammatical processing with Broca’s area has broken down to a large degree (Grodzinsky, 2000). For instance, there is evidence of some retained grammatical knowledge in Broca’s aphasics (Bates and Wulfeck, 1989; Bates et al., 1991). In addition, grammatical deficits have been found in Wernicke’s aphasics and other clinical populations (Dick et al., 2001). It also appears that grammatical deficits are associated with damage throughout the left perisylvian cortex (Caplan et al., 1996). Finally, recent evidence suggests that Broca’s area itself might have multiple functions. For example, a number of studies have implicated it in action-related tasks (Thoenissen et al., 2002; Nishitani et al., 2005). In sum, evidence from cognitive neuroscience and neuropsychology suggests that language processing is widely distributed in the brain and involves a number of sensorimotor areas. Although this distribution is not logically incompatible with an amodal approach, it fits well with the idea that language processing involves sensorimotor simulations.

A second, more direct reason to think that language processing might involve perceptual symbols is that there is evidence of functional links connecting motor and perception circuits with the left perisylvian cortex (Pulvermüller, 2005). For example, there is evidence that listening to speech modulates tongue muscle responses (Fadiga et al., 2002). This sort of evidence is often seen as supporting the motor theory of speech perception (Liberman and Whalen, 2000) or the direct realist theory (Fowler et al., 2003). Critics of these theories argue that auditory areas alone might be sufficient for perceiving speech (e.g., Toni et al., 2008). If true, this would rule out a strongly action-based account of speech perception in which speech perception necessarily involves motor processing. However, it does not rule out a weaker view that speech recognition generally involves multimodal perceptual symbols.

A third reason to suppose that language processing involves perceptual symbols is that several studies implicate active integration of multimodal information in on-line language processing. It is well established that visual input can influence phonemic speech-processing (McGurk and MacDonald, 1976). A large body of eye-tracking experiments shows the manifold ways in which visual information can be continuously integrated with auditory information during the processing of speech (Spivey and Richardson, 2009). Visual information has been shown to influence language comprehension at various levels of linguistic analysis, including word-recognition (Allopenna et al., 1998), syntactic processing (Tanenhaus et al., 1995), and thematic role assignment (Altmann and Kamide, 1999). Consider a study involving syntactic ambiguity (Spivey et al., 2002; Spivey and Richardson, 2009). Participants were presented with a four-quadrant display of real objects and instructed to carry out actions. The display in one condition contained (going clockwise from the upper left quadrant) a spoon on a napkin, a bare napkin, a bowl, and a pen. The participants were instructed to “Put the spoon on the napkin in the bowl.” Eye-tracking evidence indicates that subjects often fixate on the irrelevant bare napkin before fixating on the bowl and carrying out the action. This suggests that they initially misparse the first prepositional phrase as syntactically attached to the verb. This effect did not occur with a similar display in which two spoons appear, one on a napkin and one not on a napkin (replacing the pen in the earlier display).

A fourth reason to think that language processing might involve perceptual symbols is the employment of perceptual areas in language processing among people with congenital perceptual deficits. For example, neuroimaging studies find increased activation in auditory areas when congenitally deaf individuals view signs (Petitto et al., 2000). Similarly, some primary visual areas show increased activation when congenitally blind individuals read Braille (Sadato et al., 1996).

Taken together, these various bodies of evidence suggest that language processing is much more integrated with action and perception systems than was previously assumed by researchers. It should be acknowledged, however, that this evidence is only suggestive and not conclusive. One could maintain that this evidence does not falsify the hypothesis that language processing is handled by amodal symbols since the implicated activity in sensorimotor systems could be associated with spreading activation and not be constitutive of language processing. As I mentioned earlier in the essay, this is a general problem faced by any embodied hypothesis. Ultimately, the issue is an empirical one, and unfortunately the evidence currently available does not completely settle matters.

Given this uncertainty, it seems worthwhile to consider what would happen if it turns out that language processing is indeed handled by an amodal symbol system of the sort posited by the current orthodoxy. This would turn the hypothesis that language is a form of dis-embodied cognition into the hypothesis that language is a form of disembodied cognition (non-hyphenated). It would result in a different kind of hybrid theory, one in which concepts are represented by both multimodal perceptual symbols and amodal linguistic symbols. Although I am promoting the dis-embodied view in this essay, the second view is an intriguing and compelling alternative (for general arguments in favor of a hybrid approach see Dove, 2009; Kemmerer, 2010).

Imageability Reconsidered

Imageability effects provide support for the account developed in this essay. Typically, imageability is defined as the ease with which a word gives rise to a sensory-motor mental image (Paivio, 1971). Imageability is a broader concept than concreteness because it includes sensory images of bodily states and motor images. It is generally recognized that imageability better captures the relevant phenomena and supports broader generalizations. Highly reliable imageability ratings on numerical scales have been gathered for linguistic concepts by a number of researchers (Toglia and Battig, 1978; Bird et al., 2001). Traditionally, cognitive scientists examined imageability in terms of processing advantages for high imageable concepts over low imageable ones in several cognitive tasks. For instance, lexical access has been shown to be quicker for highly imageable words than for abstract ones (Coltheart et al., 1980) and highly imageable words are recalled more quickly in memory tasks than abstract words (Paivio, 1986; Wattenmaker and Shoben, 1987).

Two major theories dominate the literature: the DCT (Paivio, 1971, 1986) and the context-availability theory (Schwanenflugel and Shoben, 1983). According to the DCT, words with low imageability are associated primarily with verbal representations while highly imageable words are associated with both verbal representations and perceptual ones. Imageability effects are then explained in terms of the greater availability of perceptually encoded information. According to the context-availability theory, highly imageable words are more closely linked to relevant contextual knowledge in semantic networks than less imageable words. In other words, highly imageable words have more contextual information stored in semantic memory, and imageability effects are to be explained by the facilitation of processing associated with increased activation in these networks. On this approach, the reason that participants respond more quickly in a lexical decision task to a word such as “fingertip” than to one such as “idea” is that the former has more semantic associations than the latter.

Evidence suggests that both theories are right, depending on the task. I am going to focus on the evidence for the DCT because this evidence has more relevance to the claims in this essay.

Consider first neuropsychological case studies. Several research teams describe aphasic patients with significant left hemisphere damage who exhibit a selective semantic impairment for low imageable words (Berndt et al., 2002; Bird et al., 2003; Crepaldi et al., 2006). Patients with a selective semantic impairment for high imageable words are less common but have also been found (Marshall et al., 1996; Luzzatti et al., 2002). This double dissociation suggests that, at least at some level, the semantic processing of concepts with low imageability is functionally independent from the semantic processing of concepts with high imageability.

A number of event-related potential (ERP) experiments support a neuroanatomical distinction between concepts of high and low imageability. For instance, Holcomb et al. (1999) created a task that involved manipulations of both context and concreteness. ERP recordings were time-locked to sentence-final words in a word-by-word reading task in which participants made semantic congruency judgments (e.g., “Armed robbery implies that the thief used a weapon” vs. “Armed robbery implies that the thief used a rose”). They found that sentence-final concrete words generated a larger and more anterior N400 than sentence-final abstract words in both contexts (see also Kounios and Holcomb, 1994; West and Holcomb, 2000). Further studies have found context-independent topographic effects associated with imageability in single-word presentations (Kellenbach et al., 2002; Swaab et al., 2002). Using two-word stimuli that involved a noun preceded by either a concrete modifier or an abstract modifier (“green book” vs. “engaging book”) in a visual half-field presentation, Huang et al. (2010) found distinct hemispheric responses. Thus, ERP studies employing diverse tasks support the notion that different cognitive systems are associated with the semantic processing of high and low imageable words.

Neuroimaging data supports the notion that neural activity is modulated by imageability. A number of studies find that abstract or low imageable words elicit greater activation than concrete or high imageable words in superior regions of the left temporal lobe (Mellet et al., 1998; Giesbrecht et al., 2004; Noppeney and Price, 2004; Binder et al., 2005; Sabsevitz et al., 2005) and inferior regions of the left prefrontal cortex (Giesbrecht et al., 2004; Noppeney and Price, 2004; Binder et al., 2005; Sabsevitz et al., 2005; Goldberg et al., 2006). This evidence fits with imaging studies that implicate the left inferior frontal gyrus or IFG in language processing (Bookheimer, 2002). When researchers make the comparison in the reverse direction, the pattern is less clear. Whereas some studies find no areas of increased activation (Kiehl et al., 1999; Perani et al., 1999; Tyler et al., 2001; Grossman et al., 2002; Noppeney and Price, 2004), others find increased activation in right hemisphere areas (Mellet et al., 1998; Jessen et al., 2000; Binder et al., 2005; Sabsevitz et al., 2005). This divergence with respect to activation patterns fits with the neuropsychological observation that patients are more likely to have a selective deficit for abstract or low imageable words than for concrete or high imageable words.

Sabsevitz et al. (2005) carried out a particularly careful fMRI study. Their study incorporated a larger sample (28 adults) than previous studies and a task (judgment of semantic similarity) that is more likely to elicit deep semantic processing than a more superficial task, such as lexical decision. Participants were visually presented with three words (e.g., cheetah, wolf, and tiger) in the form of a triangle. The task was to decide which of the two bottom words was more semantically similar to the top word. In this task, abstract nouns elicited greater activation in the left superior temporal and left inferior frontal cortex than concrete nouns, while concrete nouns elicited greater activation in a bilateral network of association areas than abstract nouns.

The upshot of this survey is that imageability effects have been found in multiple disciplines by investigators in a number of labs using different research methodologies and measures. These effects provide support for the notion that abstract or low imageability concepts are processed somewhat differently than concrete or high imageability concepts. Areas associated with language processing appear to be more active during semantic tasks associated with abstract or low imageability concepts. This pattern of activation fits with both the hypothesis that language is a dis-embodied form of cognition and the hypothesis that it is an amodal form of cognition. The decision between these two hypotheses turns on the role played by the observed activity in language related areas. Is it part of linguistic sensorimotor simulations or is it part of amodal linguistic processes? This question awaits further research.

Conclusion

In this essay, I have attempted to assess the generality of embodied cognition. The current evidence for conceptual embodiment is compelling but, unfortunately, circumscribed. Part of the problem is that there has not been enough research on abstract concepts. Beyond this evidential lacuna, though, abstract concepts represent an important theoretical challenge to embodied cognition. The most promising attempts to deal with this problem appeal to what I have called dis-embodied representations. I have argued that there are good reasons to think that natural language itself is a form of dis-embodied cognition. The acquisition of competence with respect to a natural language provides access to a syntactically recombinable symbol system that extends our cognitive reach.

The speculation that natural language extends the cognitive capacities of embodied minds points the way to new research opportunities. One question that needs to be answered more fully is just how the two types of conceptual symbol systems interact. The potential for interaction is implicit in the dual functionality of linguistic symbols. On the account developed here, linguistic representations can serve both as elicitors of non-linguistic perceptual symbols and as semantic symbols in their own right. Presumably, we have the ability to employ these systems in a context-sensitive and flexible way. However, the nature of this flexibility remains to be seen. Another question that arises is the extent to which language might explain other significant features of cognition. For example, both Dennett (1996) and Carruthers (2002) suggest that language may be the medium for conscious deliberation. Although this is not implied by the position outlined in this essay, the possibility that conscious deliberation involves language-based perceptual symbols seems worthy of investigation. In the end, the hypothesis that language is a dis-embodied form of cognition has both empirical support and theoretical promise.

Conflict of Interest Statement

The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

Adolphs, R., Damasio, H., Tranel, D., Cooper, G., and Damasio, A. R. (2000). A role for somatosensory cortices in the visual recognition of emotion as revealed by three-dimensional lesion mapping. J. Neurosci. 20, 2683–2690.