Front. Psychol., 02 June 2014
Sec. Psychology of Language
Volume 5 - 2014

Fast phonetic learning in very young infants: what it shows, and what it doesn't show

Ocke-Schwen Bohn1 and Linda Polka2

  • 1English Degree Program and Center on Autobiographical Memory Research, Department of Psychology, Aarhus University, Denmark
  • 2School of Communication Sciences & Disorders, Centre for Research on Brain, Language and Music, McGill University, Montreal, QC, Canada

A commentary on
Fast phonetic learning occurs already in 2-to-3-month old infants: an ERP study

by Wanrooij, K., Boersma, P., and van Zuijen, T. L. (2014) Front. Psychol. 5:77. doi: 10.3389/fpsyg.2014.00077

One of the very solid findings from infant speech perception research is that infants start out as universal perceivers and that their perception becomes attuned to the ambient language(s) mostly during the second half of the first year of life. This language-specific alignment of perceptual abilities happens early for tones (4–6 months, Yeung et al., 2013) and later for consonants (8–12 months, Werker and Tees, 1984, but see Best et al., 1988). The results for vowels are less clear-cut; some studies report language-specific discrimination by 6 months (Kuhl et al., 1992; Polka and Werker, 1994) whereas others find this pattern emerging as late as 12 months (Polka and Bohn, 1996).

The study by Wanrooij et al. (2014) is a welcome addition to the literature as it explores whether phonetic learning can occur at a very early age and, if so, what its mechanism(s) might be. Wanrooij et al. (WBZ) examined the neural response of two groups of Dutch-learning 2- to 3-month-olds to the non-native English vowels [ε] and [æ] after short exposure (12 min) to either a bimodal or a unimodal distribution of isolated steady-state vowels along an [ε-æ] continuum. Mismatch responses from these infants, whose native language has [ε] but not [æ], indicated discrimination of the [ε-æ] contrast for the bimodally-exposed but not for the unimodally-exposed infants.

WBZ conclude that short-term distributional learning impacts how young infants perceive speech sounds. This claim is well supported, interesting, and informative. A very short laboratory exposure clearly altered the infants' immediate response to speech stimuli (in some conditions). WBZ also claim that this learning mechanism generalizes to shape vowel perception outside the laboratory and can “affect vowel perception already in the first months of life.” However, several critical limitations of this study preclude this appealing but overly broad interpretation.

First, the training conditions implemented by WBZ lack the complex acoustic variability found in a natural language context. Second, their experimental manipulations cannot be directly equated with differences in language experience. WBZ describe the bimodal distribution encountered by one infant group during the 12-min training as a “native contrast,” and the unimodal distribution encountered by the other infant group as a “non-native contrast.” This is a redefinition of the terms “native” and “non-native” which is inconsistent with the literature on speech perception and which has no ecological validity. Both infant groups in the WBZ study are exposed to Dutch, in which [ε-æ] is a non-native contrast; their language experience cannot be redefined on the basis of a 12-min exposure to a set of isolated vowel stimuli from a restricted part of the vowel space. Third, it is unclear whether both training conditions simulate vowel phonetic properties in a realistic way. The study compares the effects of exposure to stimulus distributions with either two well-defined modes or a single poorly-defined mode. Specifically, the variability around the peak in the “unimodal” condition (indexed by the standard deviation of formant values) is twice that around each of the bimodal peaks. Thus, the exposure in the “unimodal” group may be more properly described as exposure to an “amodal” or flat distribution, unlike a natural vowel category. Importantly, the construction of “bimodal” and “unimodal” exposures implicitly assumes that, in this task, infants perceptually resolve all the points along the manipulated dimension; this is unlikely to be the case, and data addressing the perceptual resolution of the continuum are not available. Fourth, as the authors point out, the study lacks an untrained control group; without an “unexposed” baseline the precise impact of the exposure manipulations is unknown.
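The third point, concerning how well the modes are defined, can be illustrated with a small numerical sketch. It assumes a hypothetical 32-step continuum with Gaussian-shaped exposure distributions; the step count, mode positions, and standard deviations are invented for illustration and are not WBZ's stimulus values. Only the 2:1 ratio of standard deviations is taken from the description above. The sketch computes the share of exposure tokens that fall within a fixed window around the nearest mode, one possible index of how sharply a mode is defined.

```python
import math


def gaussian(x, mean, sd):
    """Unnormalized Gaussian weight of continuum step x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2)


def exposure_profile(steps, modes, sd):
    """Share of exposure tokens at each step for equal-weight Gaussian modes."""
    raw = [sum(gaussian(x, m, sd) for m in modes) for x in steps]
    total = sum(raw)
    return [r / total for r in raw]


def share_near_modes(steps, profile, modes, window):
    """Share of tokens lying within `window` steps of the nearest mode."""
    return sum(p for x, p in zip(steps, profile)
               if min(abs(x - m) for m in modes) <= window)


# All numbers below are hypothetical, not WBZ's stimulus values.
STEPS = list(range(1, 33))      # 32 equally spaced continuum steps
BIMODAL_MODES = [8.5, 24.5]     # two peaks, one per putative category
UNIMODAL_MODES = [16.5]         # a single peak at the continuum midpoint
BIMODAL_SD = 3.0
UNIMODAL_SD = 2 * BIMODAL_SD    # the 2:1 ratio described in the text

bimodal = exposure_profile(STEPS, BIMODAL_MODES, BIMODAL_SD)
unimodal = exposure_profile(STEPS, UNIMODAL_MODES, UNIMODAL_SD)

bimodal_share = share_near_modes(STEPS, bimodal, BIMODAL_MODES, window=3)
unimodal_share = share_near_modes(STEPS, unimodal, UNIMODAL_MODES, window=3)

print(f"bimodal: {bimodal_share:.2f}, unimodal: {unimodal_share:.2f}")
```

Under these assumptions, a clearly larger share of tokens clusters around the bimodal peaks than around the single unimodal peak. Whether the same holds for the actual stimuli depends on values that this sketch does not reproduce.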

WBZ also analyze their results to test predictions generated by the Natural Referent Vowel (NRV) framework as presented in Polka and Bohn (2011). According to NRV, young infants display perceptual biases favoring peripheral vowels due to formant convergence or focalization (cf. Schwartz et al., 2005). Studies employing a variety of behavioral and neurophysiological paradigms support this hypothesis (reviewed in Polka and Bohn, 2003, 2011; see also Pons et al., 2012; Dufour et al., 2013). The NRV framework makes general predictions about how perceptual biases will become shaped via long-term natural language experience. Importantly, these predictions are not about the immediate effects of controlled short-term laboratory training manipulations of the sort implemented by WBZ. Contrary to what WBZ claim, the NRV framework currently does not yield differential predictions for 2- to 3-month-olds following a 12-min exposure to artificial stimulus distributions. Specifically, NRV does not predict an asymmetrical response for the “unimodal” but not for the “bimodal” condition. Rather, NRV predicts that infants this young would show an asymmetry in discrimination of [ε]-[æ], regardless of their native language experience. The findings in the bimodal condition support this prediction, providing the first evidence of a vowel perception asymmetry in 2- to 3-month-olds. Among the four subgroups tested (bimodal [ε], bimodal [æ], unimodal [ε], unimodal [æ], each labeled by its standard vowel), only the bimodal [ε] group had an MMR amplitude that was significantly different from zero. Thus, infants showed a reliable MMR in the bimodal condition and, consistent with NRV, only when the deviant vowel was the more peripheral and more focal [æ]. MMR amplitude differences were also noted across the standards in the “unimodal” group. However, in the unimodal group the MMR amplitudes themselves were not significantly different from zero when either [ε] or [æ] was the standard; thus, a reliable MMR to the test tokens was absent following the “unimodal” exposure, which confirms the main effect of the exposure. We conclude that WBZ's claim that their findings fail to support NRV predictions is not valid. As WBZ point out, the asymmetrical response (in the bimodal group) may or may not have been in place before the exposure conditions.
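The distinction drawn here, between a difference in amplitude across subgroups and a reliable MMR within a subgroup, can be made concrete with a minimal sketch. It assumes a one-sample t-test of each subgroup's amplitudes against zero, which is a generic choice for illustration and not necessarily the analysis WBZ ran. The amplitude values are made up and do not come from the WBZ data.

```python
import math
import statistics


def one_sample_t(amplitudes, null_value=0.0):
    """Return (t statistic, degrees of freedom) for a test against null_value."""
    n = len(amplitudes)
    mean = statistics.mean(amplitudes)
    sem = statistics.stdev(amplitudes) / math.sqrt(n)  # standard error of the mean
    return (mean - null_value) / sem, n - 1


# Made-up MMR amplitudes in microvolts for two hypothetical subgroups of
# eight infants each. These are NOT values from WBZ.
subgroup_a = [1.8, -2.1, 0.9, -0.4, 2.6, -1.5, 0.7, 1.2]
subgroup_b = [-1.6, 2.0, -0.8, 0.5, -2.4, 1.3, -0.9, -1.1]

# Two-tailed 5% critical value of Student's t with 7 degrees of freedom.
T_CRIT_DF7 = 2.365

t_a, df_a = one_sample_t(subgroup_a)
t_b, df_b = one_sample_t(subgroup_b)

for label, t, df in [("A", t_a, df_a), ("B", t_b, df_b)]:
    reliable = abs(t) > T_CRIT_DF7
    print(f"subgroup {label}: t({df}) = {t:.2f}, reliable MMR: {reliable}")
```

In this invented example the two subgroup means have opposite signs, yet neither differs reliably from zero. An amplitude difference across standards therefore does not, on its own, establish that a mismatch response was present in either subgroup.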

In summary, WBZ show that the neural response to speech can be altered in very young infants in the laboratory, pointing to a potential role for distributional learning mechanisms in the first few months of life. How and when this mechanism operates to shape phonetic perception in natural language contexts remains a mystery. The findings of WBZ leave no doubt that this is a mystery that is well worth solving.

Conflict of Interest Statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.


Acknowledgments

Support from Danmarks Grundforskningsfond (Danish National Research Foundation, grant DNRF93) to Ocke-Schwen Bohn and from the Natural Sciences and Engineering Research Council of Canada to Linda Polka is gratefully acknowledged.


References

Best, C. T., McRoberts, G. W., and Sithole, N. N. (1988). Examination of the perceptual reorganization for speech contrasts: Zulu click discrimination by English-speaking adults and infants. J. Exp. Psychol. Hum. Percept. Perform. 14, 345–360. doi: 10.1037/0096-1523.14.3.345

Dufour, S., Brunellière, A., and Nguyen, N. (2013). To what extent do we hear phonemic contrasts in a non-native regional variety? Tracking the dynamics of perceptual processing with EEG. J. Psycholinguist. Res. 42, 161–173. doi: 10.1007/s10936-012-9212-8

Kuhl, P. K., Williams, K. A., Lacerda, F., Stevens, K. N., and Lindblom, B. (1992). Linguistic experience alters phonetic perception in infants by 6 months of age. Science 255, 606–608. doi: 10.1126/science.1736364

Polka, L., and Bohn, O.-S. (1996). A cross-language comparison of vowel perception in English-learning and German-learning infants. J. Acoust. Soc. Am. 100, 577–592. doi: 10.1121/1.415884

Polka, L., and Bohn, O.-S. (2003). Asymmetries in vowel perception. Speech Commun. 41, 221–231. doi: 10.1016/S0167-6393(02)00105-X

Polka, L., and Bohn, O.-S. (2011). Natural referent vowel (NRV) framework: an emerging view of early phonetic development. J. Phon. 39, 467–478. doi: 10.1016/j.wocn.2010.08.007

Polka, L., and Werker, J. F. (1994). Developmental changes in perception of non-native vowel contrasts. J. Exp. Psychol. Hum. Percept. Perform. 20, 421–435.

Pons, F., Albareda-Castellot, B., and Sebastian-Galles, N. (2012). The interplay between input and initial biases: asymmetries in vowel perception during the first year of life. Child Dev. 83, 965–976. doi: 10.1111/j.1467-8624.2012.01740.x

Schwartz, J.-L., Abry, C., Boë, L.-J., Ménard, L., and Vallée, N. (2005). Asymmetries in vowel perception, in the context of the Dispersion-Focalization Theory. Speech Commun. 45, 425–434. doi: 10.1016/j.specom.2004.12.001

Wanrooij, K., Boersma, P., and van Zuijen, T. L. (2014). Fast phonetic learning occurs already in 2-to-3-month old infants: an ERP study. Front. Psychol. 5:77. doi: 10.3389/fpsyg.2014.00077

Werker, J. F., and Tees, R. C. (1984). Cross-language speech perception: evidence for perceptual reorganization during the first year of life. Infant Behav. Dev. 7, 49–63. doi: 10.1016/S0163-6383(84)80022-3

Yeung, H. H., Chen, K. H., and Werker, J. F. (2013). When does native language input affect phonetic perception? The precocious case of lexical tone. J. Mem. Lang. 68, 123–139. doi: 10.1016/j.jml.2012.09.004

Keywords: infant vowel perception, fast phonetic learning, Natural Referent Vowel framework, perceptual asymmetry, infant MMR (mismatch response)

Citation: Bohn O-S and Polka L (2014) Fast phonetic learning in very young infants: what it shows, and what it doesn't show. Front. Psychol. 5:511. doi: 10.3389/fpsyg.2014.00511

Received: 30 April 2014; Accepted: 09 May 2014;
Published online: 02 June 2014.

Edited and reviewed by: Marcela Pena, Catholic University of Chile, Chile

Copyright © 2014 Bohn and Polka. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.