Mini Review ARTICLE

Front. Psychol., 12 February 2014

Phonological iconicity

  • 1Department of General Psychology and Neurocognitive, Freie Universität Berlin, Berlin, Germany
  • 2Department of Cognitive Neuroscience and Psycholinguistics, Universidad de La Laguna, La Laguna, Spain
  • 3Dahlem Institute for Neuroimaging of Emotion (DINE), Berlin, Germany

The arbitrariness of the linguistic sign is a fundamental assumption in modern linguistic theory. In recent years, however, a growing body of research has investigated the nature of non-arbitrary relations between linguistic sounds and semantics. This review aims to illustrate the range of findings obtained so far and to organize and evaluate different lines of research dedicated to the issue of phonological iconicity. In particular, we summarize findings on the processing of onomatopoetic expressions, ideophones, and phonaesthemes, relations between syntactic classes and phonology, as well as sound-shape and sound-affect correspondences at the level of phonemic contrasts. Many of these findings have been obtained across a range of different languages, suggesting an internal relation between sublexical units and attributes as a potentially universal pattern.


Linguistic theory widely adopts Saussure's (1959) essential notion of an arbitrary relation between signifier and signified. While exceptions to this rule have been suggested outside the linguistic mainstream (Jakobson and Waugh, 1979; Tsur, 1992, 1997; Hinton et al., 1994; Volke, 2007; Schrott and Jacobs, 2011), most psycholinguistic models of lexical retrieval and production (e.g., Dell and O'Seaghdha, 1992; Levelt et al., 1999) incorporate arbitrariness as a fundamental feature. However, recent research posits motivated sound-meaning mappings (see Perniss et al., 2010, for review) that, according to Peirce's prolific typology of semiotic elements (Peirce, 1931; see Liszka, 1996 for an overview), classify as iconic or indexical rather than symbolic, involving structural resemblance or natural association between signifier and signified.

Empirical evidence for such phenomena primarily comes from signed languages (e.g., Thompson et al., 2012), gesture (e.g., McNeill, 2008), or prosody (e.g., Nygaard et al., 2009b). Evidence from the phonology of spoken languages is less conclusive; it is outlined below with regard to the potentially pivotal role of iconicity in human language.

We will first focus on onomatopoeia and ideophones as well-established sound-symbolic inventories in a variety of languages. Phonaesthemes introduce the basic idea of sublexical units referring to higher level attributes of meaning, giving rise to different approaches particularly concerning the phonemic level in relation to affect or the perception of size or shape. We thus aim to interrogate the nature of iconicity in language processing and its role in phylogenetic and ontogenetic language development.


Intuitively, phonological iconicity is reflected in onomatopoeia that mimic animal sounds or sounds habitually associated with moving or colliding objects (e.g., cuckoo, bang) sometimes further imitating the emotional impression they have on us, e.g., the German “Uff” which transposes the ejected breath (ff) with which we instinctively express a reaction of relief into written German (Schrott and Jacobs, 2011). According to Berko-Gleason (2005), word acquisition in early childhood often refers to onomatopoeic expressions, because their inherent echoic relation to a referent enhances apprehension (cf. also Perniss and Vigliocco, in press).

Using functional magnetic resonance imaging, Hashimoto et al. (2006) reported that nouns increased activation in the left anterior superior temporal gyrus and animal sounds in the bilateral superior temporal sulcus and the left inferior frontal gyrus, while onomatopoeia recruited structures involved in the processing of both, thus indicating the activation of neural subsystems devoted to perception beyond language comprehension only.

According to Wundt (1904), some onomatopoeia occur as interjections, i.e., non-sentence phrases expressing emotion or sentiment on the speaker's part (e.g., Ah!, Pst!).

Following Schrott and Jacobs (2011), in interjections language seems closest to (affective) mental life (cf. also Bühler, 1934; Wierzbicka, 1991). They lend a voice to bodily feelings and affects, e.g., pain (German “aua”) or indifference (German “bah”). The Yiddish interjection “oy” expresses no less than 29 different affect states in only two phonemes (Rosten, 1968). Reaching beyond the expressive function in Bühler's (1934) Organon model, they also fulfill the conative/appealing function as in the calling (German “he”) or the request to keep silent (German “ssst”).

Testing cross-cultural agreement in the understanding of phonological iconicity of interjections, Sauter et al. (2010) asked native English speakers and speakers of Himba, a Namibian Bantu language, which of two vocalizations of the respective unknown language would best match a presented short story. Though participants agreed cross-culturally, the question remains whether they inferred the correct meaning from phonologic stimulus features or other acoustic cues, as Couper-Kuhlen (2011) demonstrated that the interpretation of “oh” as utterance of disappointment or anger much depends on prosody modulated by volume, pitch and intonation.

Ideophones, Mimetics, Expressives

Ideophones, mimetics, or expressives, typically referring to sound-symbolic inventories of sub-Saharan African, East Asian or Native American languages, similarly elude standard linguistic theory. According to Dingemanse (2011, 2012), they “depict sensory imagery” rather than merely describing it, and reach, unlike onomatopoeia, beyond acoustic perception only (e.g., Japanese kyoro kyoro for “looking around” or “spinning”; Tamil thuru thuru for “eager” or “active”). Following Dingemanse, sensory imagery is perceptual knowledge that derives from sensory perception of the environment and the body. Although ideophones are scarcely represented in Indo-European languages, Atoda and Hoshino (1995) list more than 1700 frequent Japanese mimetic words, thus exceeding onomatopoeia numerically.

Iwasaki et al. (2007) showed that Japanese and English monolinguals agree in evaluative ratings of Japanese ideophones, despite Japanese raters' higher degrees of consistency. Effects were stronger for concepts of sound than vision or proprioception and limited to certain phonemes, but still suggest certain sound-meaning mappings to generalize cross-linguistically, which cannot be explained by mere exposure to language regularities.

Imai et al. (2008) replicated this result with ideophonic neologisms in Japanese and English native speakers. Using the same stimuli in a subsequent verb learning task with 3-year-old Japanese children, they further demonstrated that ideophonic word material facilitates verb acquisition in toddlers—predominantly due to phonological as opposed to morphological or syntactic properties. Kantartzis et al. (2011) and Yoshida (2012) extended these findings to English children creating comparable complements despite the marginal incidence of ideophones in their native language.

Using a word learning task, Nygaard et al. (2009a) reported higher accuracy and faster responses of English speaking monolingual adults to correct translations of Japanese adjectives involving a variety of perceptuo-motor properties. The effect was even present when matched to their antonyms—though to a lesser extent—as compared to random assignments. Iconic mappings thus reach beyond acoustic experience and hold across unrelated languages.

Lexical Categories

Focusing on broader syntactic categories rather than distinct attributes grounded in sensory domains, effects of regular phonological mappings are abundant also in Indo-European languages. Nouns are likely to count more syllables than verbs (Cassidy and Kelly, 1991) or to contain back (e.g., /u/,/o/) rather than front vowels (e.g., /e/,/i/) (Kelly, 1992). Nouns and verbs also exhibit larger Euclidean phonological distances across word classes than within (Farmer et al., 2006). English female names differ from male names and other nouns in number of syllables, syllable stress, and vowel brightness (Cutler et al., 1990). More importantly, language users exploit these regularities during language development when learning to assign new words to grammatical classes (Cassidy et al., 1999; Cassidy and Kelly, 2001; Farmer et al., 2006; Reilly et al., 2012).

These results imply systematic relations between phonology and syntax, rather than semantics. Yet, from a connectionist perspective, morphology might emerge as a layer of hidden units between levels of phonology and semantics (Plaut and Gonnerman, 2000). Accordingly, Monaghan et al. (2011) point out that morphology generates numerous instances of systematicity serving category assignment in first language acquisition, some of which (e.g., plural forms or differences in female vs. male names) might be considered iconic.
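The comparison of phonological distances across and within word classes mentioned above (Farmer et al., 2006) can be illustrated with a minimal sketch. The two-dimensional feature vectors (syllable count, proportion of back vowels), their values, and the helper names `mean_within` and `mean_between` are assumptions made for illustration only; they are not the representation or data used in the cited study.

```python
from itertools import combinations, product
from math import dist
from statistics import mean

# Hypothetical toy vectors: (number of syllables, proportion of back vowels).
# All values are invented for illustration, not taken from any corpus.
nouns = [(2.0, 0.6), (3.0, 0.7), (2.0, 0.5)]
verbs = [(1.0, 0.2), (1.0, 0.3), (2.0, 0.1)]


def mean_within(items):
    """Mean Euclidean distance over all unordered pairs inside one class."""
    return mean(dist(a, b) for a, b in combinations(items, 2))


def mean_between(xs, ys):
    """Mean Euclidean distance over all pairs drawn from two different classes."""
    return mean(dist(a, b) for a, b in product(xs, ys))


within = mean([mean_within(nouns), mean_within(verbs)])
between = mean_between(nouns, verbs)

# Phonological typicality in this sense means between-class distance
# exceeds within-class distance.
print(f"within-class: {within:.3f}, between-class: {between:.3f}")
```

With these invented vectors the between-class mean exceeds the within-class mean, which is the pattern the text describes; real analyses would need phonologically motivated features and a corpus-based lexicon.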


Phonaesthemes are phoneme clusters like syllable onsets or rimes that typically occur in words belonging to specific semantic fields (e.g., gl, as in glitter, glow, gleam etc., relates to “vision” and “light”) but lack the central feature of compositionality to qualify as morphemes. They even appear across language borders in non-cognate words of remote languages (e.g., the consonant sequence /s/t/r/ reflecting concepts of “straight” in both English and Gaelic, Magnus, 2000). Several studies in English and Swedish posit phonaesthemes as instrumental in production and perception of neologisms (Hutchins, 1998; Abelin, 1999; Magnus, 2000). Bergen (2004) reported priming effects for phonaesthemic prime-target relations to be more pronounced than predicted by linearly combined effects of phonological and semantic priming. In a word learning task, phonaesthemes facilitated participants' deduction of new meanings with or without context (Parault, 2006).

According to Bergen (2004), available data do not necessarily suggest an innate sound-meaning relation. They might well be accounted for by connectionist models in terms of acquired associative frequency effects (e.g., Grainger and Jacobs, 1996; Rey et al., 1998; Plaut and Gonnerman, 2000). Phonaesthemes have also been suggested to derive from early Indo-European morphemes, indicating etymological evolution rather than an iconic relation to referents as the source of their occurrence. Note, however, that specific phonaesthemes such as sn (involving a nasal sound), occurring in words related to the nose (sniff, snore, snob), also seem to depict sensory imagery and therefore might qualify as iconic mappings.

Phonemic Contrasts

Sound and Size

Sapir (1929) initiated an influential line of research focusing on phonemic contrasts. Using nonword pairs, thus addressing potential sound-meaning mappings beyond the direct context of a given vocabulary, he showed that English speakers systematically associate the back vowel /a/ with largeness, but the front vowel /i/ with smallness. Newman (1933) extended his finding showing that size judgments systematically co-vary with articulation point in the vocal tract for consonants and vowels—more frontal phonemes relate to smallness and vice versa, yet failed to establish such sound-size relations for 350 English words with size connotations. Using alternative methods, Taylor and Taylor (1965) were able to reveal statistically reliable relations within Newman's data of smallness with more frontal sounds (e.g., consonants /n/,/t/; vowels /e/,/i/) as well as largeness with more posterior sounds (e.g., /g/,/k/; /o/,/u/).

More recently, Peña et al. (2011) reported increased looking times of 4-month-old infants for front vowels (/e/,/i/) presented with smaller, and back vowels (/a/,/o/) presented with larger objects than vice versa. Using a broader range of phonologically comparable nonword stimuli, Thompson and Estes (2011) demonstrated that this effect follows a graded function in adults. They argue that cross-modal processing of gesture and frequency code (Ohala, 1982; Berlin, 2006) accounts for the results better than statistical learning. In his frequency code hypothesis, Ohala (1984) stresses the correlation between general physical and vocal tract size: the fundamental frequency modulation (F0) would be the acoustic counterpart of common visual displays of physical size, providing a close link to natural selection, a pattern that might reverberate in the perception of vowel backness.

Shrum et al. (2012) extended empirical findings cross-linguistically: across French, Spanish, and Chinese subjects, fictitious brand names were preferred when vowel backness matched products' perceived size attributes.

Sound and Shape

Substantial evidence for phonological iconicity as a cross-linguistic phenomenon was derived from a seminal experiment of Köhler (1929). Within the framework of Gestalt psychology, he showed a reliable preference of native Spanish speakers to match the nonword maluma with a curvy round shape and takete with a spiky angular shape. The effect was subsequently labeled as “kiki/bouba effect” and replicated across a wide range of unrelated languages such as Himba (Bremner et al., 2012) or Tamil. It appears to be extraordinarily reliable with agreement of up to 95% (Ramachandran and Hubbard, 2001).

Maurer et al. (2006) found this effect in 2.5-year-old preliterate toddlers using a forced choice task. Ozturk et al. (2012) even demonstrated effects of congruent vs. incongruent sound-shape mappings in looking times of 4-month-old children. Infants' attention differed significantly though exclusively to a combination of continuants (e.g., /b/) and back vowels (e.g., /u/) or plosives (e.g., /k/) and front vowels (e.g., /i/), respectively. Adults' judgments from a control study revealed sensitivity to consonants or vowels only.

Developmental and cross-linguistic studies strongly suggest an innate origin of iconic mappings. However, dependent variables used are offline measures and especially adults' judgments might reflect metacognitive strategies.

To overcome this problem, Westbury (2005) implemented a lexical decision task in an implicit interference design. Words and nonwords matching the consonant characteristics of Köhler's stimuli were presented simultaneously with either congruent or incongruent round or angular shapes. Results showed a reliable form × phonology interaction, though for nonwords only, i.e., continuants on curvy backgrounds or plosives on angular backgrounds were rejected faster than vice versa. Therefore, sound-shape mappings appear to hold psychological reality, also influencing online processing beyond judgments.

Using an implicit learning categorization task combined with EEG, Kovic et al. (2010) presented subjects with curvy or pointy figures labeled sound-symbolically congruent or incongruent as either “dom” or “shick.” After a learning phase, participants had to decide whether presented label-object pairs were correct or incorrect. Responses were faster in the sound-symbolic congruent compared to the incongruent condition. Congruent sound-shape pairs further elicited an early occipital negativity around 160 ms. Based on earlier findings (Hillyard et al., 1998), the authors interpret this result as indicative of multi-sensory feature integration and covert spatial attention.

Likewise, Ramachandran points to possible synkinetic mappings of hand and jaw movements, controlled in two adjacent areas in the Penfield motor homunculus (Ramachandran and Hubbard, 2001), claiming that the “pincer-like opposition of thumb and forefinger to denote small size” might be mimicked in movements of the jaw as typically displayed in the production of front vowels (Ramachandran and Hubbard, 2001, p. 21). Contrasting high and front vowels against low and back vowels across 136 languages, Ultan (1978) suggested deictic distinctions to reflect conjoint activation of motor maps for moving of lips and hands toward and away from the body. Similarly, cross-modal mappings in the left fusiform or angular gyrus might explain non-arbitrary sound-shape correspondences via integration of visual information from the inferior temporal lobe and sound representations from the primary auditory cortex. Cross-modal associations would, then, be more likely to arise in neighboring rather than remote brain regions (Ramachandran and Hubbard, 2005) as also suggested by Bremner et al. (2012), who replicated sound-shape mappings, but failed to show reliable taste-shape mappings across distant cultures.

Sound and Affect

Building on their research on sound-size correspondences, Taylor and Taylor (1965) asked monolinguals from four unrelated languages, English, Japanese, Korean, and Tamil, to rate pseudowords comprising phonemes common to all four languages on pleasantness. Ratings showed consistent patterns within, but differed considerably across languages where different phonemes were perceived as more or less pleasant suggesting that sound-emotional meaning relations are language specific and hence likely to be learned in a given linguistic context.

Focusing on real text instead of artificial word material, Fónagy (1961) contrasted Hungarian poems characterized as either aggressive or tender. He found sonorants (e.g., /l/,/m/) to occur more often in tender but plosives (e.g., /k/,/t/) in aggressive poems. Regarding poetic text samples high in foregrounding, i.e., unexpected irregularities with regard to a common phonological inventory, Miall (2001) states that they not only display differential phonetic features, e.g., relative occurrence of front vowels and plosives, but are also perceived as more affective and striking (cf. Schrott and Jacobs, 2011). A number of cross-linguistic studies (Wiseman and van Peer, 2003; Auracher et al., 2010) following Fónagy's approach corroborate parallels across remote languages like German, Chinese, Russian, Ukrainian and Brazilian Portuguese, all using non-contemporary poems.

In a more general approach, Heise (1966) extended these ideas to emotional constructs and the organization of the vocabulary. He collected valence, arousal, and potency ratings for 1000 monosyllabic English words. After segmenting words into single phonemes he found phoneme occurrences to significantly co-vary with affective scales. Extending these findings to more representative text samples, Whissell (1999, 2000) attributed phonemes' emotional quality to both place and manner of articulation as being variably related to different positions in the affective space (e.g., pleasantness, sadness, passivity, etc.).
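The kind of co-variation between phoneme occurrence and affective ratings described above for Heise (1966) can be sketched as a correlation between a binary "phoneme present" indicator and a rating scale. The phoneme transcriptions, the valence values, and the function names below are hypothetical and invented for illustration; the sketch uses a single valence scale and a plain Pearson correlation, which is an assumption and not necessarily the statistic used in the original study.

```python
from math import sqrt

# Hypothetical items: (phoneme sequence, valence rating in [-1, 1]).
# Transcriptions and ratings are invented, not taken from Heise (1966).
items = [
    (("l", "u", "l"), 0.8),
    (("m", "e", "l", "o"), 0.9),
    (("k", "i", "k"), -0.6),
    (("t", "a", "k"), -0.4),
    (("m", "u", "n"), 0.5),
    (("k", "u", "t"), -0.7),
]


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        raise ValueError("correlation undefined for a constant sequence")
    return cov / (sx * sy)


def phoneme_valence_correlation(phoneme, data):
    """Correlate presence (1) or absence (0) of a phoneme with valence."""
    presence = [1.0 if phoneme in phonemes else 0.0 for phonemes, _ in data]
    valence = [rating for _, rating in data]
    return pearson(presence, valence)


for p in ("k", "m"):
    print(p, round(phoneme_valence_correlation(p, items), 2))
```

In this toy set the plosive /k/ correlates negatively and the sonorant /m/ positively with valence, purely because the ratings were chosen that way; the sketch only shows the shape of the computation.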

Aryani et al.'s (2013) software tool extracts a given text's phonologically salient units, which might serve as foregrounding elements, potentially effective at a level of phonological iconicity modulating a text's emotional tone (cf. Jespersen, 1922; Schrott and Jacobs, 2011). Adopting a more acoustic approach, Myers-Schulz et al. (2013) suggested a characteristic dynamic formant shift, rather than distinct phonemes, to predict the matching of nonwords to positive or negative pictures.

Another account of systematic mappings of phonology to affective dimensions was proposed by Zajonc et al. (1989; McIntosh et al., 1997), who contrasted the umlaut /y/ with other vowels, hypothesizing that facial muscle feedback from the corrugator muscle associated with its production would cause rather negative affective states: pleasantness and mood ratings of American and German subjects became indeed more negative after the utterance of this specific vowel or after reading stories with higher occurrence of it.


Systematic form-meaning mappings are abundant in many languages, although not always necessarily iconic in nature. Yet those that are iconic hold strong implications for the essence of human language and its origin.

Given the relatively small inventory of phonemes and the potentially infinite number of concepts to be expressed, the Saussurean principle of arbitrariness certainly remains a general key feature of human language (Gasser, 2004), allowing for large lexica with effective linguistic signals to develop (Monaghan et al., 2011). Nonetheless, cross-linguistic agreement and onset at early stages of language development of the outlined phenomena suggest a universal basis of motivated signs to be considered. From a phylogenetic perspective, Darwin (1871) already suggested language to originate from the imitation of natural sounds, further motivated by emotional impulse. Similarly, Ramachandran and Hubbard (2001) conjecture that language evolution might have been driven by analogies between phonology and perceptuo-motor properties of semantic entities as a solution to the symbol grounding problem (Harnad, 1990). Following Perniss and Vigliocco (in press), iconicity would thereby be essential to jump-start phylogenetic and ontogenetic development in terms of displacement and referentiality. It thus provides an additional mechanism to Hebbian learning and, regarding language processing in later stages, consequently embodies language in experience.

Fay et al. (2013) point in a similar direction, reporting that participants were able to bootstrap meaning from gesture and non-linguistic vocalization, partially depending on item category such as object, action or emotion. In analogy they argue that the evolution of signs from motivated origin to conventional use is still observable in certain sign systems such as Chinese hanzi (Vaccari and Vaccari, 1961) or American Sign Language (Frishberg, 1975).

Strictly arbitrary relations between levels of phonology and semantics, as assumed by psycholinguistic models (e.g., Levelt et al., 1999), are incompatible with the effects discussed above, and few promising attempts have been made to overcome the respective limitations, such as the featural and unitary semantic space hypothesis (Vigliocco et al., 2004) or the neurocognitive poetics model of literary reading (Jacobs, 2011, 2014). More effort is thus required for future psycholinguistic theory to incorporate both arbitrariness and iconicity as essential features of human language.

Conflict of Interest Statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.


Abelin, Å. (1999). Studies in Sound Symbolism. Göteborg: Göteborg University dissertation.

Aryani, A., Jacobs, A. M., and Conrad, M. (2013). Extracting salient sublexical units from written texts: emophon, a corpus-based approach to phonological iconicity. Front. Psychol. 4:654. doi: 10.3389/fpsyg.2013.00654

Atoda, T., and Hoshino, K. (1995). Giongo gitaigo tsukaikata jiten (Usage dictionary of sound/manner mimetics). Sotakusha, Tokyo.

Auracher, J., Albers, S., Zhai, Y., Gareeva, G., and Stavniychuk, T. (2010). P is for happiness, N is for sadness: universals in sound iconicity to detect emotions in poetry. Dis. Process. 48, 1–25. doi: 10.1080/01638531003674894

Bergen, B. K. (2004). The psychological reality of phonaesthemes. Language 290–311. doi: 10.1353/lan.2004.0056

Berko-Gleason, J. (2005). The Development of Language. New York, NY: Pearson Education Inc.

Berlin, B. (2006). The first congress of ethnozoological nomenclature. J. R. Anthropol. Inst. 12, 23–44. doi: 10.1111/j.1467-9655.2006.00271.x

Bremner, A. J., Caparos, S., Davidoff, J., de Fockert, J., Linnell, K. J., and Spence, C. (2012). “Bouba” and “Kiki” in Namibia? A remote culture make similar shape-sound matches, but different shape-taste matches to Westerners. Cognition 122, 80–85. doi: 10.1016/j.cognition.2012.09.007

Bühler, K. (1934). Sprachtheorie. Stuttgart: G. Fischer.

Cassidy, K., and Kelly, M. H. (1991). Phonological information for grammatical category assignments. J. Mem. Lang. 30, 348–369. doi: 10.1016/0749-596X(91)90041-H

Cassidy, K., and Kelly, M. H. (2001). Children's use of phonology to infer grammatical class in vocabulary learning. Psychon. Bull. Rev. J. 8, 519–523. doi: 10.3758/BF03196187

Cassidy, K., Kelly, M. H., and Sharoni, L. (1999). Inferring gender from name phonology. J. Exp. Psychol. Gen. 128, 362–381. doi: 10.1037/0096-3445.128.3.362

Couper-Kuhlen, E. (2011). “Affectivity in cross-linguistic and cross-cultural perspective,” in Sprachen in Mobilisierten Kulturen: Aspekte der Migrationslinguistik [Languages in mobilizing cultures: Linguistic aspects of migration], ed T. Stehl (Berlin: Universitätsverlag Potsdam), 231–257.

Cutler, A., McQueen, J., and Robinson, K. (1990). Elizabeth and John: sound patterns of men's and women's names. J. Linguist. 26, 471–482. doi: 10.1017/S0022226700014754

Darwin, C. (1871). The Descent of Man. London: John Murray.

Dell, G. S., and O'Seaghdha, P. G. (1992). Stages of lexical access in language production. Cognition 42, 287–314. doi: 10.1016/0010-0277(92)90046-K

Dingemanse, M. (2011). Ideophones and the aesthetics of everyday language in a West-African society. Sens. Soc. 6, 77–85. doi: 10.2752/174589311X12893982233830

Dingemanse, M. (2012). Advances in the cross-linguistic study of ideophones. Lang. Linguist. Compass 6, 654–672. doi: 10.1002/lnc3.361

Farmer, T. A., Christiansen, M. H., and Monaghan, P. (2006). Phonological typicality influences on-line sentence comprehension. Proc. Natl. Acad. Sci. U.S.A. 103, 12203–12208. doi: 10.1073/pnas.0602173103

Fay, N., Arbib, M., and Garrod, S. (2013). How to bootstrap a human communication system. Cogn. Sci. 37, 1356–1367. doi: 10.1111/cogs.12048

Fónagy, I. (1961). Communication in poetry. Word 17, 194–218.

Frishberg, N. (1975). Arbitrariness and iconicity: historical change in American Sign Language. Language 51, 696–719. doi: 10.2307/412894

Gasser, M. (2004). “The origins of arbitrariness in language,” in Proceedings of the 26th Annual Conference of the Cognitive Science Society, (Chicago, IL), 26, 4–7.

Grainger, J., and Jacobs, A. M. (1996). Orthographic processing in visual word recognition: a multiple read-out model. Psychol. Rev. 103, 518–565. doi: 10.1037/0033-295X.103.3.518

Harnad, S. (1990). The symbol grounding problem. Physica D. 42, 335–346. doi: 10.1016/0167-2789(90)90087-6

Hashimoto, T., Usui, N., Taira, M., Nose, I., Haji, H., and Shozo, K. (2006). The neural mechanism associated with the processing of onomatopoeic sound. Neuroimage 31, 1762–1770. doi: 10.1016/j.neuroimage.2006.02.019

Heise, D. R. (1966). Sound-meaning correlations among 1,000 English words. Lang. Speech 9, 14–27.

Hillyard, S. A., Vogel, E. K., and Luck, S. J. (1998). Sensory gain control (amplification) as a mechanism of selective attention: electrophysiological and neuroimaging evidence. Philos. Trans. R. Soc. Biol. Sci. 393, 1257–1270. doi: 10.1098/rstb.1998.0281

Hinton, L., Nichols, J., and Ohala, J. J. (1994). Sound Symbolism. Cambridge: Cambridge University Press.

Hutchins, S. S. (1998). The Psychological Reality, Variability, and Compositionality of English Phonesthemes. Atlanta, GA: Emory University dissertation.

Imai, M., Kita, S., Nagumo, M., and Okada, H. (2008). Sound symbolism facilitates early verb learning. Cognition 109, 54–65. doi: 10.1016/j.cognition.2008.07.015

Iwasaki, N., Vinson, D. P., and Vigliocco, G. (2007). What do English speakers know about gera-gera and yota-yota? A cross-linguistic investigation of mimetic words for laughing and walking. Jap. Lang. Educ. Around Globe. 17, 53–78.

Jacobs, A. M. (2011). “Neurokognitive poetik: elemente eines modells des literarischen lesens (neurocognitive poetics: elements of a model of literary reading),” in Gehirn und Gedicht: Wie Wir Unsere Wirklichkeiten Konstruieren, R. Schrott and A. M. Jacobs (München: Hanser), 492–520.

Jacobs, A. M. (2014). “Towards a neurocognitive poetics model of literary reading,” in Towards a Cognitive Neuroscience of Natural Language Use, ed R. Willems (Cambridge: Cambridge University Press).

Jakobson, R., and Waugh, L. R. (1979). The Sound Shape of Language. Bloomington, IN: Indiana University Press.

Jespersen, O. (1922). Language - Its Nature, Development and Origin. London: George Allen and Unwin Ltd.

Kantartzis, K., Imai, M., and Kita, S. (2011). Japanese sound-symbolism facilitates word learning in English-speaking children. Cogn. Sci. 35, 575–586. doi: 10.1111/j.1551-6709.2010.01169.x

Kelly, M. H. (1992). Using sound to solve syntactic problems: the role of phonology in grammatical category assignments. Psychol. Rev. 99, 349–364. doi: 10.1037/0033-295X.99.2.349

Köhler, W. (1929). Gestalt Psychology. New York, NY: Liveright.

Kovic, V., Plunkett, K., and Westermann, G. (2010). The shape of words in the brain. Cognition 114, 19–28. doi: 10.1016/j.cognition.2009.08.016

Levelt, W. J. M., Roelofs, A., and Meyer, A. S. (1999). A theory of lexical access in speech production. Behav. Brain Sci. 22, 1–75. doi: 10.1017/S0140525X99001776

Liszka, J. J. (1996). A General Introduction to the Semeiotic of Charles Sanders Peirce. Bloomington, IN: Indiana University Press.

Magnus, M. (2000). What's in a Word? Evidence for phonosemantics. Trondheim: University of Trondheim dissertation.

Maurer, D., Pathman, T., and Mondloch, C. J. (2006). The shape of boubas: sound-shape correspondences in toddlers and adults. Dev. Sci. 9, 316–322. doi: 10.1111/j.1467-7687.2006.00495.x

McIntosh, D. N., Zajonc, R. B., Vig, P. S., and Emerick, S. W. (1997). Facial movement, breathing, temperature, and affect: implications of the vascular theory of emotional efference. Cogn. Emot. 11, 171–196. doi: 10.1080/026999397379980

McNeill, D. (2008). Gesture and Thought. Chicago, IL: University of Chicago Press.

Miall, D. S. (2001). Sounds of contrast: an empirical approach to phonemic iconicity. Poetics 29, 55–70. doi: 10.1016/S0304-422X(00)00025-5

Monaghan, P., Christiansen, M. H., and Fitneva, S. A. (2011). The arbitrariness of the sign: learning advantages from the structure of the vocabulary. J. Exp. Psychol. Gen. 140, 325–347. doi: 10.1037/a0022924

Myers-Schulz, B., Pujara, M., Wolf, R. C., and Koenigs, M. (2013). Inherent emotional quality of human speech sounds. Cogn. Emot. 27, 1–9. doi: 10.1080/02699931.2012.754739

Newman, S. (1933). Further experiments on phonetic symbolism. Am. J. Psychol. 45, 53–75. doi: 10.2307/1414186

Nygaard, L. C., Cook, A. E., and Namy, L. L. (2009a). Sound to meaning correspondences facilitate word learning. Cognition 112, 181–186. doi: 10.1016/j.cognition.2009.04.001

Nygaard, L. C., Herold, D. S., and Namy, L. L. (2009b). The semantics of prosody: acoustic and perceptual evidence of prosodic correlates to word meaning. Cogn. Sci. 33, 127–146. doi: 10.1111/j.1551-6709.2008.01007.x

Ohala, J. (1982). The voice of dominance. J. Acoust. Soc. Am. 72, S66. doi: 10.1121/1.2020007

Ohala, J. J. (1984). An ethological perspective on common cross-language utilization of F0 of voice. Phonetica 41, 1–16. doi: 10.1159/000261706

Ozturk, O., Krehm, M., and Vouloumanos, A. (2012). Sound symbolism in infancy: evidence for sound–shape cross-modal correspondences in 4-month-olds. J. Exp. Child Psychol. 114, 173–186. doi: 10.1016/j.jecp.2012.05.004

Parault, S. J. (2006). Sound symbolic word learning in written context. Contemp. Educ. Psychol. 31, 228–252. doi: 10.1016/j.cedpsych.2005.06.002

Peirce, C. S. (1931). “Principles of Philosophy,” in The Collected Papers of Charles Sanders Peirce, Vol. 1, eds C. Hartshorne, P. Weiss, and A. Burks (Cambridge: Harvard University Press).

Peña, M., Mehler, J., and Nespor, M. (2011). The role of audiovisual processing in early conceptual development. Psychol. Sci. 22, 1419–1421. doi: 10.1177/0956797611421791

Perniss, P., Thompson, R. L., and Vigliocco, G. (2010). Iconicity as a general property of language: evidence from spoken and signed languages. Front. Psychol. 1:227. doi: 10.3389/fpsyg.2010.00227

Perniss, P., and Vigliocco, G. (in press). The bridge of iconicity: from a world of experience to the experience of language. Philos. Trans. B.

Plaut, D., and Gonnerman, L. (2000). Are non-semantic morphological effects incompatible with a distributed connectionist approach to lexical processing? Lang. Cogn. Process. 15, 445–485. doi: 10.1080/01690960050119661

Ramachandran, V. S., and Hubbard, E. M. (2001). Synaesthesia – A window into perception, thought and language. J. Conscious. Stud. 8, 3–34.

Ramachandran, V. S., and Hubbard, E. M. (2005). “The emergence of the human mind: some clues from synesthesia,” in Synesthesia: Perspectives From Cognitive Neuroscience, eds L. C. Robertson and N. Sagiv (Oxford: Oxford University Press), 147–192.

Reilly, J., Westbury, C., Kean, J., and Peelle, J. E. (2012). Arbitrary symbolism in natural language revisited: when word forms carry meaning. PLoS ONE 7:e42286. doi: 10.1371/journal.pone.0042286

Rey, A., Jacobs, A. M., Schmidt-Weigand, F., and Ziegler, J. C. (1998). A phoneme effect in visual word recognition. Cognition 75, B1–B12. doi: 10.1016/S0010-0277(98)00051-1

Rosten, L. (1968). The Joys of Yiddish. New York, NY: McGraw-Hill.

Sapir, E. (1929). A study in phonetic symbolism. J. Exp. Psychol. 12, 225–239. doi: 10.1037/h0070931

Saussure, F. de. (1959). Course in General Linguistics. New York, NY: Philosophical Library. (Original work published 1922).

Sauter, D. A., Eisner, F., Ekman, P., and Scott, S. K. (2010). Cross-cultural recognition of basic emotions through nonverbal emotional vocalizations. Proc. Natl. Acad. Sci. U.S.A. 107, 2408. doi: 10.1073/pnas.0908239106

Schrott, R., and Jacobs, A. M. (2011). Gehirn und Gedicht: Wie wir unsere Wirklichkeiten konstruieren (Brain and Poetry: How We Construct Our Realities). München: Hanser.

Shrum, L. J., Lowrey, T. M., Luna, D., Lerman, D. B., and Liu, M. (2012). Sound symbolism effects across languages: implications for global brand names. Int. J. Res. Market. 29, 275–279. doi: 10.1016/j.ijresmar.2012.03.002

Taylor, I. K., and Taylor, M. M. (1965). Another look at phonetic symbolism. Psychol. Bull. 64, 413–427. doi: 10.1037/h0022737

Thompson, P. D., and Estes, Z. (2011). Sound symbolic naming of novel objects is a graded function. Quart. J. Exp. Psychol. 64, 2392–2404. doi: 10.1080/17470218.2011.605898

Thompson, R. L., Vinson, D. P., Woll, B., and Vigliocco, G. (2012). The road to language learning is iconic: evidence from British Sign Language. Psychol. Sci. 23, 1443–1448. doi: 10.1177/0956797612459763

Tsur, R. (1992). What Makes Sound Patterns Expressive? The poetic mode of speech perception. Durham, NC: Duke University Press.

Tsur, R. (1997). Sound effects of poetry: critical impressionism, reductionism and cognitive poetics. Prag. Cogn. 5, 283–304. doi: 10.1075/pc.5.2.05tsu

Ultan, R. (1978). “Size-sound symbolism,” in Universals of Human Language: Vol. 2. Phonology, ed J. Greenberg (Stanford, CA: Stanford University Press), 525–568.

Vaccari, O., and Vaccari, E. E. (1961). Pictorial Chinese-Japanese characters. 4th Edn. Tokyo: Charles E. Tuttle.

Vigliocco, G., Vinson, D. P., Lewis, W., and Garrett, M. F. (2004). Representing the meanings of object and action words: the featural and unitary semantic space hypothesis. Cogn. Psychol. 48, 422–488. doi: 10.1016/j.cogpsych.2003.09.001

Volke, S. (2007). Sprachphysiognomik: Grundlagen Einer Leibphänomenologischen Beschreibung der Lautwahrnehmung. [Physiognomy of Language] Freiburg; München: Karl Alber.

Westbury, C. (2005). Implicit sound symbolism in lexical access: evidence from an interference task. Brain Lang. 93, 10–19. doi: 10.1016/j.bandl.2004.07.006

Whissell, C. (1999). Phonosymbolism and the emotional nature of sounds: evidence of the preferential use of particular phonemes in texts of differing emotional tone. Percept. Motor Skills 89, 19–48. doi: 10.2466/pms.1999.89.1.19

Whissell, C. (2000). Phonoemotional profiling: a description of the emotional flavour of English texts on the basis of the phonemes employed in them. Percept. Motor Skills 91, 617–648. doi: 10.2466/pms.2000.91.2.617

Wierzbicka, A. (1991). Cross-Cultural Pragmatics. Berlin: Mouton de Gruyter.

Wiseman, M., and van Peer, W. (2003). “Roman Jakobsons Konzept der Selbstreferenz aus der Perspektive der heutigen Kognitionswissenschaft [Roman Jakobson's concept of self-reference from the perspective of present-day cognition studies],” in Roman Jakobsons Gedichtanalysen. Eine Herausforderung an die Philologien, eds H. Birus, S. Donat, and B. Meyer-Sickendiek (Göttingen, Germany: Wallstein), 277–306.

Wundt, W. (1904). Völkerpsychologie: Eine Untersuchung der Entwicklungsgesetze von Sprache, Mythus und Sitte, [Social Psychology: Language], Vols. 1–2. Leipzig: W. Engelmann.

Yoshida, H. (2012). A cross-linguistic study of sound symbolism in children's verb learning. J. Cogn. Dev. 13, 232–265. doi: 10.1080/15248372.2011.573515

Zajonc, R. B., Murphy, S. T., and Inglehart, M. (1989). Feeling and facial efference: implications of the vascular theory of emotion. Psychol. Rev. 96, 395–416. doi: 10.1037/0033-295X.96.3.395

Keywords: phonological iconicity, sound symbolism, phonaesthemes, kiki/bouba effect, ideophones

Citation: Schmidtke DS, Conrad M and Jacobs AM (2014) Phonological iconicity. Front. Psychol. 5:80. doi: 10.3389/fpsyg.2014.00080

Received: 27 August 2013; Accepted: 21 January 2014;
Published online: 12 February 2014.

Edited by:

Gabriella Vigliocco, University College London, UK

Reviewed by:

Stefano F. Cappa, Vita-Salute San Raffaele University, Italy
Pamela Perniss, University College London, UK

Copyright © 2014 Schmidtke, Conrad and Jacobs. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: David S. Schmidtke, Department of General Psychology, Freie Universität Berlin, Habelschwerdter Allee 45, 14195 Berlin, Germany e-mail: