Limits on Monolingualism? A Comparison of Monolingual and Bilingual Infants’ Abilities to Integrate Lexical Tone in Novel Word Learning

Singh, Leher; Poh, Felicia L. S.; Fu, Charlene S. L.

doi:10.3389/fpsyg.2016.00667

ORIGINAL RESEARCH article

Front. Psychol., 10 May 2016

Sec. Psychology of Language

Volume 7 - 2016 | https://doi.org/10.3389/fpsyg.2016.00667


Leher Singh ^*

Felicia L. S. Poh

Charlene S. L. Fu

Department of Psychology, National University of Singapore, Singapore, Singapore

A correction has been applied to this article in:

Corrigendum: Limits on Monolingualism? A Comparison of Monolingual and Bilingual Infants' Abilities to Integrate Lexical Tone in Novel Word Learning

Abstract

To construct their first lexicon, infants must determine the relationship between native phonological variation and the meanings of words. This process is arguably more complex for bilingual learners who are often confronted with phonological conflict: phonological variation that is lexically relevant in one language may be lexically irrelevant in the other. In a series of four experiments, the present study investigated English–Mandarin bilingual infants’ abilities to negotiate phonological conflict introduced by learning both a tone and a non-tone language. In a novel word learning task, bilingual children were tested on their sensitivity to tone variation in English and Mandarin contexts. Their abilities to interpret tone variation in a language-dependent manner were compared to those of monolingual Mandarin learning infants. Results showed that at 12–13 months, bilingual infants demonstrated the ability to bind tone to word meanings in Mandarin, but to disregard tone variation when learning new words in English. In contrast, monolingual learners of Mandarin did not show evidence of integrating tones into word meanings in Mandarin at the same age even though they were learning a tone language. However, a tone discrimination paradigm confirmed that monolingual Mandarin learning infants were able to tell these tones apart at 12–13 months under a different set of conditions. Later, at 17–18 months, monolingual Mandarin learners were able to bind tone variation to word meanings when learning new words. Our findings are discussed in terms of cognitive adaptations associated with bilingualism that may ease the negotiation of phonological conflict and facilitate precocious uptake of certain properties of each language.

Introduction

Languages of the world make use of sound in different ways to create words. A classic example is the use of vocal pitch in human languages. When learning a tone language like Mandarin Chinese, listeners must register particular changes in vocal pitch that distinguish the meanings of words. However, pitch variation is also a ubiquitous feature of non-tone languages such as English and is used to distinguish questions/statements, emotional states, and placement of stress and focus. In contrast to Mandarin learners, English learners must disregard pitch variation when determining the lexical identity of a word. It is therefore incumbent upon the young language learner to determine how sound changes effect changes in word meaning in their native language to construct a vocabulary. By necessity, children learning two languages have to learn how words are defined in both of their native languages. This process is potentially complicated by the fact that the phonological rules of two languages can diverge as in the case of Mandarin and English where pitch varies lexically and non-lexically, respectively, causing a potential conflict. The purpose of the current study is to determine how bilingual infants resolve this conflict and negotiate cross-language phonological conflict when learning new words. Specifically, the present study focuses on English–Mandarin bilingual infants’ abilities to define words according to lexical tone when listening to Mandarin and to disregard the same source of variation in pitch when defining new words in English. Bilingual infants’ abilities to integrate pitch in a language-dependent fashion are interpreted in relation to those of monolingual tone language learners.

In prior research, children’s abilities to integrate native phonological variation when learning new words have been widely studied in monolingual children (Stager and Werker, 1997; Pater et al., 2004; Dietrich et al., 2007; Rost and McMurray, 2009, 2010; Yoshida et al., 2009), but to a much lesser extent in bilingual children (but see Fennell et al., 2007; Mattock et al., 2010; Byers-Heinlein et al., 2013; Fennell and Byers-Heinlein, 2014). A substantial proportion of this research has used the Switch task, which has been productively used to investigate infants’ abilities to map similar sounding words onto different meanings. In a common instantiation of this task, infants are familiarized with an on-screen display of two objects and their labels. Labels consist of novel words that are subtle phonemic variants – or minimal pairs (e.g., ‘bih’ and ‘dih’). During a habituation phase, infants are presented with repetitions of each pairing until their attention to the objects wanes to a pre-set criterion. Following the habituation phase, infants are presented with two test trials. In one test trial (Same trial), infants are presented with the pairing with which they were familiarized. In the other test trial (Switch trial), infants are presented with the visual object with which they were familiarized but it is labeled with the name for the other object (e.g., what was learned as a ‘bih’ is now labeled as a ‘dih’). Infants’ fixation times to each trial type are compared: a relative elevation in fixation to the Switch trial versus the Same trial is interpreted as evidence of infants’ sensitivity to the source of phonological variation incorporated into the task (i.e., to variation in place of articulation in the current example).

In a seminal study that pioneered the Switch task to investigate early word learning, Stager and Werker (1997) demonstrated that 14-month-old monolingual infants failed to incorporate phonological variation (i.e., the difference between ‘b-’ versus ‘d-’) when learning new words, although they could incorporate the same variation when recognizing familiar words (Fennell and Werker, 2003). Comparative studies with bilingual infants reveal a similar set of abilities, provided that bilingual infants receive input that is consistent with the phonetic properties of their language environment (i.e., input that sounds native to them). In one such study by Mattock et al. (2010), the authors presented 17-month-old bilingual infants with tokens drawn from both of their languages. Mattock et al. (2010) demonstrated that under these conditions, bilingual infants linked similar sounding words to their meanings at 17 months. More recently, Fennell and Byers-Heinlein (2014) demonstrated that both 17-month-old monolingual and bilingual infants succeeded in learning similar sounding words when the speaker matched their language background (i.e., when the speaker was monolingual or bilingual, respectively), although bilingual infants were not able to learn similar sounding words when presented with monolingual input (see Fennell et al., 2007). In sum, this set of studies suggests that both 17- to 18-month-old monolingual and bilingual infants maintain keen perceptual sensitivities to subtle phonetic detail that are optimally engaged when they listen to language input reminiscent of their environment.

Previous research has focused on bilingual infants’ sensitivity to phonological variation that draws lexical distinctions in both of their native languages (although the sub-phonetic realization of these sounds varies across languages; e.g., Fennell et al., 2007; Mattock et al., 2010; Fennell and Byers-Heinlein, 2014). Nevertheless, in each of the aforementioned studies, the phonemes used to distinguish word meanings belonged to separate phonetic categories in both languages. However, bilinguals often have to negotiate phonological conflict where the same source of variation draws lexical distinctions only in one language and not in the other. In this situation, learners of two languages have to alternate between activating and de-activating sensitivity to this source of variation depending on the language in use. For example, learners of Mandarin Chinese and English have to inhibit integration of pitch variation when defining new words in English but have to incorporate certain forms of pitch variation (i.e., tone contrasts) when learning new words in Mandarin. One prior study has investigated bilingual English–Mandarin infants’ abilities to integrate tone in English and Mandarin in a language-selective manner. In a word segmentation task investigating how effectively infants segment words from passages, Singh and Foong (2012) familiarized infants with isolated words and then tested infants’ recognition of the familiarized words in fluent speech. Each infant was tested in English and in Mandarin in succession. The critical manipulation was that in the test phase, the target word was either matched or mis-matched in tone (Mandarin session) or matched or mis-matched in pitch (English session). Infants were tested at 7.5, 9, and 11 months.
While infants did not demonstrate language-selective integration of pitch at 7.5 and 9 months (either integrating pitch/tone variation or disregarding pitch/tone variation in both languages), at 11 months, infants selectively defined words by tone in Mandarin and not by pitch in English. However, this study did not involve forming word-object associations, as it was an auditory-only word segmentation task, rendering it unclear as to whether infants linked the familiarized words to meaning. Additionally, the pitch transformations qualitatively differed between English and Mandarin sessions: Mandarin pitch variants encompassed Mandarin lexical tone contrasts, while English pitch variants were digitized, uniform transformations across the entire syllable. However, most crucially, word segmentation is thought to measure an infant’s ability to track repetitions of the same word prior to 12 months, and is thought to precede an infant’s determination of meaning (Jusczyk and Aslin, 1995).

Subsequent studies investigating integration of pitch and tone when forming word-object associations reveal more fragile abilities in young children when they are required to link words to meaning. Influences of tone variation in newly learned words have been investigated in English monolingual infants, non-tone language learning bilingual infants and English–Mandarin bilingual infants (Singh et al., 2014; Graf Estes and Hay, 2015; Hay et al., 2015). Collectively, these findings suggest that the language-specific functions of pitch are not consolidated as early as 11 months. Using a preferential looking paradigm, a study by Singh et al. (2014) involved teaching infants novel tone-marked words in a referential context. Infants were then tested on their recognition of tone-matched and tone-varying labels of familiarized words (as well as vowel matches/variants). The authors reported that non-tone learning infants (monolingual and bilingual) were similar to their Mandarin learning peers in that they were sensitive to tone as a source of lexical contrast, rejecting tone variants as labels for words at 18 months. It was not until 24 months that non-tone learning infants (monolingual and bilingual) demonstrated selective inhibition to tone in English when learning new words, whereas Mandarin learning infants continued to associate and integrate lexical tone into newly learned words at 24 months. Tone integration was reflected by participants’ construal of tone changes as mispronunciations of newly learned words. In an investigation of tone sensitivity in English monolingual infants using the Switch paradigm, Hay et al. (2015) investigated English learning infants’ sensitivity to rising and falling tones when learning new words at 14, 17, and 19 months.
Infants exhibited developmental change in tone sensitivity between 14 and 17–19 months: while 14-month-old infants were sensitive to tone variation, at 17 and 19 months, infants were no longer sensitive to the same source of tone variation in the Switch paradigm. Posing this question with bilingual infants learning two non-tone languages, Graf Estes and Hay (2015) reported a protracted period of tone sensitivity in bilingual learners, demonstrating that these infants were sensitive to lexical tones at 14 and 19 months, but not at 22 months. In the aggregate, it appears that when infants are confronted with the added burden of forming word-object associations, their sensitivity to phonological variation appears much more fragile than when they are simply tracking repetitions of words across time as in Singh and Foong’s study. However, in Singh et al. (2014), although tone learners were English–Mandarin bilinguals, the language context of newly learned words was not manipulated within bilingual participants. As such, it was not possible to examine whether bilingual participants could actually shift their interpretation of tone as befitted the language context. The ability on the part of bilingual learners to re-interpret the same phonetic information in a language-selective manner – termed perceptual switching – has been well researched in adult bilinguals (Flege and Eefting, 1987; Hazan and Boulakia, 1993; García-Sierra et al., 2009, 2012; Gonzales and Lotto, 2013) and to a limited extent, in children (Singh and Quam, 2016), but not yet in infants. However, this process of rapid alternation is a fundamental component of bilingual proficiency. The current study focuses on monolingual and bilingual infants’ abilities to alternate between the phonological systems of each of their languages when these systems conflict.

The primary goal of this study is to compare monolingual and bilingual phonological representations of lexical tone by assessing infants’ responsiveness to tone mispronunciations in their native language(s). In light of the multi-functionality of pitch in English–Mandarin bilingual infants’ environments, infants were provided with naming phrases ending with target words to cue a particular language (i.e., English or Mandarin). Prior research has demonstrated that bilingual infants make productive use of naming phrases to identify the relevant phonological rules (see Fennell and Byers-Heinlein, 2011). A secondary goal of the present study was to determine whether sensitivity to a change in lexical tone depended not only on the language in use, but furthermore, on the acoustic salience of the tone change. Mandarin Chinese has four lexical tones [high (Tone 1), rising (Tone 2), dipping (Tone 3), falling (Tone 4)], three of which (Tones 1, 2, and 3) were used in our study (please see Figure 1 for an illustration of Tones 1, 2, and 3). Some tones are highly distinctive from one another (such as Tones 1 and 3) such that Mandarin speakers readily discriminate them (Chen, 2013). Other tones are highly similar, notably Tones 2 and 3, such that these tones are often poorly discriminated (Zue, 1976; Shen and Lin, 1991). Prior studies investigating infants’ sensitivity to lexical tones have revealed that sensitivity to lexical tone pairs progresses asynchronously for different tone pairs (see Mattock and Burnham, 2006; Tsao, 2008; Yeung et al., 2013; Liu and Kager, 2014). An important determinant of lexical tone perception appears to be the salience of the tone contrast (see Liu and Kager, 2014 and Tsao, 2008 for investigations of sensitivity to high and low salience tone contrasts), a pattern also evidenced in production (e.g., Wong et al., 2005). 
Prior studies have demonstrated that emergent sensitivity to lexical tone contrasts does not necessarily generalize across the entire tone inventory (see Singh and Fu, 2016, for a review of this evidence in perception and production). Conclusions drawn about tone sensitivity are therefore necessarily qualified by the relative similarity of a given tone pair. Tone similarity is commonly defined by properties of the pitch contour (Gandour, 1983), primarily by pitch direction and secondarily by pitch height (Chandrasekaran et al., 2010). In light of discrepant effects of similar and distinct tone pairs on tone sensitivity, in the current study, infants’ sensitivity to lexical tone as a source of contrast was compared across similar and distinct tone pairs.

FIGURE 1

**Pitch contours for the three target syllables used in the habituation and test phases**.

A series of four experiments is reported. In Experiment 1, 12–13-month-old bilingual English–Mandarin infants were tested on their sensitivity to lexical tone contrasts when learning novel words in both Mandarin and English, in direct succession. In Experiment 2, 12–13-month-old monolingual Mandarin learning infants were tested on a similar task in Mandarin only. Experiments 3 and 4 were designed to further investigate the apparent insensitivity to lexical tone observed in Mandarin learning monolingual infants at 12–13 months. Experiment 3 investigated whether Mandarin learning monolingual infants could discriminate the tones used in Experiment 2, even though they did not appear sensitive to variation in these tones when learning novel words. Experiment 4 investigated whether Mandarin learning monolingual infants could integrate lexical tone contrasts at a later age, testing 18-month-old infants on the same word learning task administered to 12- to 13-month-old Mandarin infants in Experiment 2.

Experiment 1

In Experiment 1, we investigated whether bilingual infants, learning English and Mandarin, integrated tone in a language-selective manner within each of their native languages. The purpose of this experiment was to determine whether habitual exposure to two native languages that conflicted in their use of tone would facilitate a language-selective interpretation of tone. We hypothesized that the contrastive use of tone in each of the participants’ native languages would contribute to a more mature understanding of the linguistic functions served by tone in each language.

Infants were familiarized with a word-object pairing via the Switch paradigm. The label used to introduce the object was spoken in Tone 3. After successfully habituating to the pairing, infants were tested on their sensitivity to a similar (Tone 2 versus Tone 3) mispronunciation and to a distinct (Tone 1 versus Tone 3) mispronunciation. Infants were tested in each of their native languages: English and Mandarin. Responses to each type of tone mispronunciation were compared across languages.

Methods

Participants

Our sample comprised eighteen 12- to 13-month-old Mandarin–English bilingual infants (age range: 12 months 10 days to 13 months 21 days, average age = 13 months 1 day). All infants were born healthy and full term. Another seven infants were tested but excluded from the final sample due to fussiness during test (n = 6) or on account of data that deviated from the group mean by more than 3 standard deviations (n = 1). All infants received at least 35% exposure to both English and Mandarin, with no third language exposure (range of English exposure: 38 to 63%, mean = 51%; range of Mandarin exposure: 37 to 62%, mean = 48%). Language exposure was determined by the Language Exposure Questionnaire developed by Bosch and Sebastián-Gallés (1997). Language exposure was derived from parental estimates of the relative proportion of each language that each caregiver used when communicating directly to the child, and the amount of time each caregiver spent with the child in a typical week.
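The exposure estimate described above combines two parental reports per caregiver: how much of that caregiver's child-directed speech is in each language, and how much time the caregiver spends with the child. One plausible reading is a time-weighted average, sketched below. The exact scoring procedure of the questionnaire is not given in the text, so the weighting scheme, the `Caregiver` class, and the function names are illustrative assumptions rather than the authors' method.

```python
from dataclasses import dataclass


@dataclass
class Caregiver:
    # Hypothetical record of one caregiver's parental-report data.
    hours_per_week: float  # time spent with the child in a typical week
    prop_english: float    # share of child-directed speech in English (0 to 1)


def english_exposure(caregivers):
    """Time-weighted share of English input across caregivers (0 to 1)."""
    total_hours = sum(c.hours_per_week for c in caregivers)
    if total_hours <= 0:
        raise ValueError("total caregiver hours must be positive")
    weighted = sum(c.hours_per_week * c.prop_english for c in caregivers)
    return weighted / total_hours


def is_eligible(english_share, minimum=0.35):
    """Apply the 35% minimum to each language, assuming no third language."""
    mandarin_share = 1.0 - english_share
    return english_share >= minimum and mandarin_share >= minimum
```

For example, a child who spends 40 h per week with a caregiver using English 70% of the time and 60 h with one using English 30% of the time would receive an English exposure estimate of 46%, which falls inside the inclusion range.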

The age of testing was motivated by prior research investigating sensitivity to suprasegmental lexical variation (see Curtin, 2009). When tasked with learning minimally contrastive words differing in lexical stress, Curtin (2009) demonstrated that infants were sensitive to contrasts in stress at 12–13 months. This finding stands in contrast to the broad swath of studies defining similar sounding words by consonant variation demonstrating that infants are challenged by this task prior to 14 months (e.g., Stager and Werker, 1997; Werker et al., 2002; Pater et al., 2004). As concluded by Curtin (2009), it appears that suprasegmental lexical variation is integrated into word meaning earlier than segmental variation. As our study manipulated suprasegmental lexical variation (i.e., tones), we tested infants at 12–13 months. This study was carried out with the approval of the National University of Singapore Institutional Review Board. Participants’ parents or legal guardians gave written informed consent in accordance with the National University of Singapore Institutional Review Board requirements.

Stimuli

Auditory stimuli for the study consisted of seven Mandarin and seven English naming phrases adapted from Fennell and Waxman (2010) (see Table 1). The target word was the label “pa” produced in Tones 1, 2, and 3 by a female native speaker of Mandarin in the context of each naming phrase. All stimuli were produced in infant-directed speech. The mean duration of the Mandarin phrases was 1.28 s (SD: 0.4) and the mean duration of the English phrases was 1.14 s (SD: 0.3). English and Mandarin phrase durations did not differ significantly. The mean pitch range of the carrier phrases was 288.14 Hz (SD: 45.81; Mandarin) and 277.95 Hz (SD: 48.51; English), which again, did not differ significantly across languages. The mean duration of the target words was 0.47 s (SD: 0.07). The same tokens were spliced into English and Mandarin introductory phrases to mitigate possible effects of language-specific differences in tone productions. Each instantiation of the target syllable was separated by 800 ms.

Table 1

Naming phrases used in Experiments 1, 2, and 4.

The target word was labeled by the syllable /pa/, which begins with an unaspirated voiceless onset consonant. This segment was chosen for the entire series of experiments because it assimilates to the native phonological inventories of English and Mandarin. In English, the unaspirated /p/ typically follows a word-initial /s/, such as in “spa,” but it does not appear in the word-initial position. However, unaspirated voiceless stops in word-initial position sound native to English speakers and are classified as voiced stops (in this case, ‘ba’; Pegg and Werker, 1997). They are judged to be as good an instance of ‘ba’ as the voiced stop ‘ba’ when produced in word-initial position (Lisker and Abramson, 1964).

Acoustic analyses of the target syllable, /pa/, were conducted to ensure that the tokens matched monolingual productions within each language. The voice-onset-time (VOT) values of the three tokens ranged from 11 ms (Tone 3 production) to 18 ms (Tone 1 production). These values overlap with published VOT values of monolingual Mandarin productions that range from 11 to 18 ms (Liao, 2005; Chao et al., 2006; Chen et al., 2007; Deterding and Nolan, 2007) as well as with English monolingual values for /ba/ (Lisker and Abramson, 1967). Formant values also fell within the range of values reported for Mandarin and English monolingual productions (Mandarin monolingual F1: 1104 Hz, English monolingual F1: 850 Hz, bilingual F1: 802.7–1213.6 Hz; Mandarin monolingual F2: 1593.6 Hz, English monolingual F2: 1220 Hz, bilingual F2: 1046.3–1633.2 Hz; Peterson and Barney, 1952; Zee and Lee, 2001). F3 was not examined as it relates to lip rounding, which is not used contrastively for the target vowel in English or Mandarin. Auditory stimuli were accompanied by a visually presented novel object (see Figure 2) that moved in a circular path. Objects were counterbalanced to each language across participants.

FIGURE 2

**Visual stimuli used in Experiments 1, 2, and 4**.

English and Mandarin versions of the task were created. The target word was paired with a different object in each language. However, the target word remained the same so as to determine whether infants were capable of switching to a new set of phonological rules based on contextual cues alone.

Procedure

Before testing, all caregivers provided informed consent for their child’s participation, in accordance with guidelines set out by the National University of Singapore Institutional Review Board. Infants sat on their parents’ lap in a dimly lit testing suite facing a computer screen. Parents were asked not to interact with their child during the session. The experimenter observed the infant’s behavior from an adjoining room. During the experiment, both the parent and the experimenter listened to instrumental masking music.

During the task, novel objects were presented in the context of naming phrases to infants in the Switch task (Stager and Werker, 1997; Fennell and Waxman, 2010). The experiment consisted of a habituation and test phase. Before each trial, an attention getter was presented. Trials were initiated when infants oriented to the visual display. When the infant fixated to the visual display, the habituation phase commenced. Habituation consisted of repeated presentations of the target word /pa/ in Tone 3, embedded within the naming phrases and presented with the novel object. The habituation phase terminated when the infant’s looking time across two trials decreased to less than 65% of the looking time across the two longest consecutive trials, or when the infant had completed a maximum of 24 trials. This habituation criterion was informed by a prior study that used the Switch task with carrier sentences (Fennell and Waxman, 2010). Once either of these criteria was met, the test phase commenced. The test phase included a Same trial and two Switch trials as adopted in previous studies (e.g., Curtin, 2010; Escudero et al., 2014). Trial order was counterbalanced across infants. For the Same trial, infants were presented with the word-object pairing to which they had habituated (i.e., /pa/ in Tone 3). The Switch trials violated this pairing, presenting infants with the same visual stimulus but paired with the target word /pa/ produced in Tones 1 and 2, respectively. Across all phases, trials lasted for a maximum of 20 s, or until the child looked away from the screen for more than 2 s. Trials were repeated if infants fixated to the screen for less than 1 s. Following the test trials, a post-test was presented. This consisted of a novel object labeled by a different female speaker as /pI/, produced in a novel tone (Tone 4). The object was animated to enlarge and shrink on the screen.
A post-test trial is commonly included in the Switch paradigm to provide an indication of attention to the task during the terminal phase of the experiment. In prior studies (e.g., Fennell et al., 2007; Byers-Heinlein et al., 2013), fixation to the last habituation block has been compared to fixation to the post-test trial. Elevated attention (recovery) between these is recruited as an interpretative safeguard against a Type II error: in the event of a null result whereby fixation times to Same and Switch trials do not differ, the presence of recovery between the last habituation block and the post-test trial indicates that this is unlikely to be accounted for by fatigue or disengagement from the experiment during the test trials. An example of the stimuli is provided in Figure 2. Infants were presented with a Mandarin and an English version of the same task. The order of presentation of the English and Mandarin task was counterbalanced across infants. Between the two tasks, infants were presented with a 1-min non-verbal cartoon.
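The habituation criterion described in the procedure can be sketched as a small function. This is a minimal illustration under stated assumptions: the text does not say how the two-trial blocks are formed, so the sliding two-trial window, the comparison against the longest earlier non-overlapping block, and the function name `habituated` are all hypothetical choices, not the authors' software.

```python
def habituated(looking_times, window=2, ratio=0.65, max_trials=24):
    """Return True once the habituation phase should end.

    looking_times: per-trial looking times in seconds, in presentation order.
    Assumption: the most recent `window` trials are compared with the
    longest-looking earlier block of `window` consecutive trials.
    """
    n = len(looking_times)
    if n >= max_trials:
        return True  # trial ceiling reached
    if n < 2 * window:
        return False  # need a baseline block plus a separate current block
    current = sum(looking_times[-window:])
    # Sums of all earlier consecutive blocks that do not overlap the current one.
    earlier = [
        sum(looking_times[i:i + window])
        for i in range(0, n - 2 * window + 1)
    ]
    return current < ratio * max(earlier)
```

With looking times of 10, 9, 5, and 4 s, the last two trials sum to 9 s, which is below 65% of the 19 s baseline block, so the criterion would be met after the fourth trial.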

Both the Switch task and preferential looking approaches are well-established measures of infants’ sensitivity to phonological variation when learning new words. However, in pilot studies, a preferential looking approach to the present task (including relevant parameters such as two languages, three test trial types) proved excessively demanding for participants. Each session was substantially longer than the auditory word segmentation task used within subjects by Singh and Foong (2012), and in recent research, a preferential looking approach to the question of perceptual switching was only successfully used in older children at 3–5 years of age (see Singh and Quam, 2016). As a consequence, the Switch task was selected for the current study. It should be noted that it is possible to use the Switch task to measure sensitivity to phonological variation using two objects (e.g., Werker et al., 2002; Fennell and Werker, 2003). However, familiarization with two objects could not be integrated into a design with a three trial [Same; Switch (distinct); Switch (similar)] test phase. An alternative design would have been to incorporate a two-trial (Same and Switch) test phase and manipulate contrast salience across participants. We prioritized the manipulation of salience as a within-subjects contrast in light of the fact that our sample comprised bilingual infants; a between-subjects comparison between two groups of bilinguals can introduce differences in performance due to background variables (specifically, the nature and extent of bilingual input, which are hard to match across bilingual groups with precision). Uncontrolled effects of error variance due to individual variation are somewhat mitigated by within-subjects comparisons, which motivated our decision to incorporate a single object and to manipulate salience within participants for each experiment. 
Although less common than a two-object paradigm, a single-object Switch paradigm has been used in several prior studies (see Stager and Werker, 1997; Werker et al., 1998; Pater et al., 2004; Thiessen, 2007; Fennell and Waxman, 2010; Fennell, 2012).

Results

All infants habituated within the 24 trial maximum habituation window. A preliminary analysis was conducted to determine whether participants recovered to the post-test by comparing the last habituation block to the post-test stimulus. A 2 × 2 (phase: last habituation block/post-test × language: English/Mandarin) repeated-measures ANOVA revealed a main effect of phase [F(1,34) = 13.91, p = 0.001, ηp² = 0.29], accounted for by an elevation in fixation times between the last habituation block and the post-test. There were no effects of language on fixation times nor was there an interaction of phase and language on fixation times (p > 0.8).
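If the effect sizes accompanying the F statistics are read as partial eta squared, they can be recomputed from each F value and its degrees of freedom. The short sketch below is offered as a consistency check for readers, not as part of the authors' analysis; the function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic.

    Uses the identity eta_p^2 = (F * df_effect) / (F * df_effect + df_error).
    """
    if df_effect <= 0 or df_error <= 0:
        raise ValueError("degrees of freedom must be positive")
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Main effect of phase reported above: F(1,34) = 13.91
print(round(partial_eta_squared(13.91, 1, 34), 2))  # 0.29
```

The recomputed value matches the effect size reported for the main effect of phase.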

An initial set of analyses was conducted to determine if there was an effect of test order on fixation times to test trials. A 3 × 2 × 2 (Trial type: Same; Switch-similar; Switch-distinct × Language: English; Mandarin × Order: Mandarin first; English first) repeated-measures ANOVA was conducted with fixation times during test trials as the dependent variable. Results revealed no effects of or interactions with order (p > 0.3). Fixation times were therefore collapsed across test orders for subsequent analyses.

As the order of test trials was rotated across participants, a preliminary analysis was conducted to investigate effects and interactions of test trial order, trial type, and language, revealing no effects of or interactions with test trial order (p > 0.6). Test trial order was excluded from subsequent analyses. A 3 × 2 Trial type × Language repeated-measures ANOVA was then conducted. Results revealed a main effect of trial type [F(2,34) = 11.18, p = 0.0001, ηp² = 0.39], no main effect of language (p = 0.23) and no interaction of trial type and language [F(2,34) = 2.46, p = 0.1]. Planned comparisons were conducted within each language to determine whether participants differed in how they responded to each tone change based on the language of testing. For each language, a repeated-measures ANOVA was conducted to determine the effect of trial type (Same; Switch-similar; Switch-distinct) on fixation times to test trials. When participants were tested in Mandarin, results revealed a main effect of trial type [F(2,34) = 10.56, p = 0.0001, ηp² = 0.39]. Simple contrasts revealed higher fixation times to Switch-distinct trials than to Same trials [F(1,17) = 20.35, p > 0.0001, ηp² = 0.54] as well as higher fixation times to Switch-similar trials than to Same trials [F(1,17) = 5.93, p = 0.03, ηp² = 0.26]. A post hoc analysis comparing fixation times to Same and Switch trials for the two Switch trials (similar and distinct) demonstrated that Same–Switch differences were greater when the Switch involved a distinct contrast (i.e., change from Tone 3 to Tone 1) than when it involved a similar contrast [i.e., change from Tone 3 to Tone 2; t(17) = 2.3, p = 0.04 (Cohen’s d: 0.57)]. This analysis revealed effects of perceptual salience on tone integration in Mandarin, although both similar and distinct substitutions were recognized as lexically contrastive. When participants were tested in English, results revealed a main effect of trial type [F(2,34) = 3.27, p = 0.05].
Simple contrasts revealed no significant difference in fixation between Switch-distinct trials and Same trials [F(1,17) = 3.15, p = 0.1] nor between Switch-similar trials and Same trials [F(1,17) = 0.54, p = 0.47]. Fixation times to each trial type for English and Mandarin are plotted in Figure 3.
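The post hoc comparison of Same–Switch differences across the two Switch types amounts to a paired test on per-infant recovery scores. The sketch below shows one way such a comparison could be computed with the Python standard library. The function names and any data passed to them are hypothetical; the sketch does not reproduce the authors' data or software.

```python
import math
import statistics


def recovery_scores(same, switch):
    """Per-infant recovery: Switch fixation minus Same fixation (seconds)."""
    if len(same) != len(switch):
        raise ValueError("same and switch must have one value per infant")
    return [sw - sa for sa, sw in zip(same, switch)]


def paired_t(x, y):
    """Paired-samples t statistic and degrees of freedom for x versus y.

    Raises ZeroDivisionError if all paired differences are identical.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [a - b for a, b in zip(x, y)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD of the differences
    t = mean_diff / (sd_diff / math.sqrt(len(diffs)))
    return t, len(diffs) - 1
```

In use, `recovery_scores` would be called once with Switch-distinct and once with Switch-similar fixation times, and the two resulting lists passed to `paired_t`; with 18 infants this yields 17 degrees of freedom, matching the t(17) reported above.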

FIGURE 3

**Fixation times to the visual stimulus for Same, Switch (similar), and Switch (distinct) trials in 12–13-month-old bilingual infants (error bars: SEM)**.

Findings suggest that bilingual English–Mandarin infants recognized the lexical relevance of tone in English and Mandarin, responding differentially to tone variants based on the language in which words were introduced. In a second experiment, Mandarin monolingual infants were tested on their recognition of tone-matched and tone-varying words in the same task as employed in Experiment 1 (Mandarin version). The goal of this experiment was to provide a monolingual point of comparison for findings obtained in Experiment 1. Given that bilingual infants were sensitive to tone variation when words were introduced in Mandarin, it was expected that Mandarin monolingual infants would be comparably sensitive to tone variation.

Experiment 2

We investigated Mandarin monolingual infants’ sensitivity to tone changes in a paradigm similar to that used in Experiment 1. The primary methodological difference from Experiment 1 was that all participants were tested in Mandarin only. As in Experiment 1, tone changes consisted of similar and distinct contrasts.