<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xml:lang="EN" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" article-type="research-article" dtd-version="2.3">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Psychol.</journal-id>
<journal-title>Frontiers in Psychology</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Psychol.</abbrev-journal-title>
<issn pub-type="epub">1664-1078</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fpsyg.2022.918737</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Psychology</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>The effect of simultaneous exposure on the attention selection and integration of segments and lexical tones by Urdu-Cantonese bilingual speakers</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Ning</surname>
<given-names>Jinghong</given-names>
</name>
<xref rid="aff1" ref-type="aff"><sup>1</sup></xref>
<xref rid="c001" ref-type="corresp"><sup>&#x002A;</sup></xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Peng</surname>
<given-names>Gang</given-names>
</name>
<xref rid="aff1" ref-type="aff"><sup>1</sup></xref>
<uri xlink:href="https://loop.frontiersin.org/people/214142/overview"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Liu</surname>
<given-names>Yi</given-names>
</name>
<xref rid="aff1" ref-type="aff"><sup>1</sup></xref>
<xref rid="aff2" ref-type="aff"><sup>2</sup></xref>
<uri xlink:href="https://loop.frontiersin.org/people/606113/overview"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Li</surname>
<given-names>Yingnan</given-names>
</name>
<xref rid="aff1" ref-type="aff"><sup>1</sup></xref>
<uri xlink:href="https://loop.frontiersin.org/people/1972034/overview"/>
</contrib>
</contrib-group>
<aff id="aff1"><sup>1</sup><institution>Department of Chinese and Bilingual Studies, The Hong Kong Polytechnic University</institution>, <addr-line>Kowloon, Hong Kong SAR</addr-line>, <country>China</country></aff>
<aff id="aff2"><sup>2</sup><institution>Poly U &#x2013; Peking U Research Centre on Chinese Linguistics</institution>, <addr-line>The Hong Kong Polytechnic University, Kowloon, Hong Kong SAR</addr-line>, <country>China</country></aff>
<author-notes>
<fn id="fn0001" fn-type="edited-by"><p>Edited by: Linjun Zhang, Peking University, China</p></fn>
<fn id="fn0002" fn-type="edited-by"><p>Reviewed by: Han Wu, Beijing Language and Culture University, China; Lingzhi Kong, Beijing Language and Culture University, China</p></fn>
<corresp id="c001">&#x002A;Correspondence: Jinghong Ning, <email>chion.ning@polyu.edu.hk</email></corresp>
<fn id="fn0003" fn-type="other"><p>This article was submitted to Language Sciences, a section of the journal Frontiers in Psychology</p></fn>
</author-notes>
<pub-date pub-type="epub">
<day>07</day>
<month>09</month>
<year>2022</year>
</pub-date>
<pub-date pub-type="collection">
<year>2022</year>
</pub-date>
<volume>13</volume>
<elocation-id>918737</elocation-id>
<history>
<date date-type="received">
<day>12</day>
<month>04</month>
<year>2022</year>
</date>
<date date-type="accepted">
<day>12</day>
<month>08</month>
<year>2022</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x00A9; 2022 Ning, Peng, Liu and Li.</copyright-statement>
<copyright-year>2022</copyright-year>
<copyright-holder>Ning, Peng, Liu and Li</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/">
<p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p>
</license>
</permissions>
<abstract>
<p>In the perceptual learning of lexical tones, an automatic and robust attention-to-phonology system enables native tonal listeners to adapt to acoustically non-optimal speech, such as phonetic conflicts in daily communication. Previous tone research reveals that non-native listeners whose mother tongue does not employ lexical tones may find it challenging to attend to the tonal dimension or integrate it with segmental features. However, it is unknown whether the attentional interference initially caused by a maternal attentional system continues to influence non-optimal tone perception in simultaneous bilingual teenagers. From an endpoint in the age of language acquisition, we investigate whether the tone-specific attention mechanism developed by Urdu-Cantonese simultaneous bilinguals is automatic enough to help them adapt to a phonetically-conflicting environment. Three groups of teenagers completed a four-condition ABX task: Urdu-Cantonese simultaneous bilinguals, Cantonese native listeners, and Urdu-speaking late learners of Cantonese. The results showed that although the simultaneous bilinguals could phonologically process Cantonese tones in a Cantonese-like way under a conflict-free listening condition, they still failed to adapt to the phonetic conflicts, especially the segment-induced ones. This demonstrates that simultaneous exposure and years of regular education in Hong Kong local schools still could not guarantee automatic processing of Cantonese tones by simultaneous bilinguals. In interpreting the findings, we hypothesize that, beyond simultaneous exposure, the development of a tone-specific attention mechanism is also likely to be L1-inhibitory, tone experience-driven, and language-specific for simultaneous bilinguals.</p>
</abstract>
<kwd-group>
<kwd>attention distribution and integration</kwd>
<kwd>lexical tones</kwd>
<kwd>simultaneous bilinguals</kwd>
<kwd>non-optimal perception</kwd>
<kwd>phonetic conflicts</kwd>
</kwd-group>
<contract-num rid="cn3">PolyU/RFS2122-5H01</contract-num>
<contract-sponsor id="cn1">Research and Development<named-content content-type="fundref-id">10.13039/100006190</named-content>
</contract-sponsor>
<contract-sponsor id="cn2">Hong Kong Polytechnic University<named-content content-type="fundref-id">10.13039/501100004377</named-content>
</contract-sponsor>
<contract-sponsor id="cn3">Research Grants Council of the Hong Kong Special Administrative Region, China</contract-sponsor>
<counts>
<fig-count count="3"/>
<table-count count="3"/>
<equation-count count="0"/>
<ref-count count="85"/>
<page-count count="18"/>
<word-count count="13822"/>
</counts>
</article-meta>
</front>
<body>
<sec id="sec1" sec-type="intro">
<title>Introduction</title>
<p>Speech variation at the acoustic-to-phonetic level (e.g., sound conflicts, fast speaking rate, talker variability) may create perceptual barriers for non-native listeners. To deal with such non-optimal conditions, a language-specific selective attention-to-phonology (hereafter, SATP) system can play a vital role for listeners. SATP refers to a sensory mechanism that cognitively reinforces listeners&#x2019; selection of only the language-specific acoustic cues while letting other redundant and chaotic inputs blend into the background automatically (<xref ref-type="bibr" rid="ref31">Francis and Nusbaum, 2002</xref>; <xref ref-type="bibr" rid="ref50">McCandliss and Yoncheva, 2011</xref>; <xref ref-type="bibr" rid="ref71">Strange, 2011</xref>). In the perceptual learning of a second language (L2), bilinguals, particularly those with reduced proficiency, have to make an effort to inhibit a SATP transfer shaped by their first language (L1) when perceiving non-native speech (<xref ref-type="bibr" rid="ref74">Trofimovich, 2008</xref>; <xref ref-type="bibr" rid="ref48">MacWhinney, 2018</xref>; <xref ref-type="bibr" rid="ref75">Verbeek et al., 2022</xref>). A failure to inhibit L1 transfer at the attentional level may lead to a failure to adapt to speech variants. Further, such adaptation barriers at the acoustic-to-phonetic level may become entangled with incorrect lexical interpretation from sound to word and cause misunderstanding in everyday non-optimal listening environments (<xref ref-type="bibr" rid="ref11">Bradlow and Alexander, 2007</xref>; <xref ref-type="bibr" rid="ref52">Meng, 2021</xref>; <xref ref-type="bibr" rid="ref59">Pelzl, 2021</xref>).</p>
<p>In the field of SATP, increasing attention has been paid to how an early learning age or early bilingualism modulates bilinguals&#x2019; adaptation to non-optimal listening conditions by selecting appropriate L2 cues through the SATP system. The early bilingualism effect on the non-optimal perception of L2 contrasts has been examined by a large body of SATP research, which has predominantly focused on segmental or stress learning (<xref ref-type="bibr" rid="ref57">Navarra et al., 2005</xref>; <xref ref-type="bibr" rid="ref27">Dupoux et al., 2008</xref>, <xref ref-type="bibr" rid="ref26">2010</xref>; <xref ref-type="bibr" rid="ref55">Molnar et al., 2014</xref>; <xref ref-type="bibr" rid="ref36">Hisagi et al., 2015</xref>; <xref ref-type="bibr" rid="ref82">Yan et al., 2019</xref>; <xref ref-type="bibr" rid="ref22">Datta et al., 2020</xref>). Comparatively, there is a lack of research investigating the early bilingualism effect on SATP in tone learning (e.g., <xref ref-type="bibr" rid="ref47">Liu and Ning, 2021</xref>). Moreover, little is known about how simultaneous bilingualism modulates bilinguals&#x2019; SATP systems in tone perception under non-optimal conditions. In this study, simultaneous bilinguals refer to those who have had extensive exposure to two languages and have continued to use both from the first year of their lives until pre-adolescence (<xref ref-type="bibr" rid="ref23">De Houwer, 1990</xref>, p. 3; <xref ref-type="bibr" rid="ref66">Sebasti&#x00E1;n-Gall&#x00E9;s et al., 2005</xref>). Early sequential bilinguals, by contrast, have fully or partially established an L1-specific SATP before they start learning L2 at an early age (<xref ref-type="bibr" rid="ref23">De Houwer, 1990</xref>, p. 2; <xref ref-type="bibr" rid="ref51">Meisel, 2004</xref>).
Comparatively, simultaneous bilinguals are more experienced in utilizing the attentional system to govern and refine two language systems from the very beginning of their childhood. Thus, it is interesting to see if simultaneous bilingualism can guarantee an automatic SATP for bilinguals to adapt to non-optimal and cognitively-demanding environments.</p>
<p>The current study investigates the tone-specific SATP by observing two attentional components under a non-optimal (i.e., phonetic conflicting) condition: attention integration and distribution across linguistic cues (<xref ref-type="bibr" rid="ref63">Repp and Lin, 1990</xref>; <xref ref-type="bibr" rid="ref12">Braun and Johnson, 2011</xref>; <xref ref-type="bibr" rid="ref46">Lin and Francis, 2014</xref>; <xref ref-type="bibr" rid="ref84">Zou et al., 2017</xref>). Moreover, the automaticity of SATP was examined based on listeners&#x2019; accuracy and reaction time (RT) in response to stimuli (<xref ref-type="bibr" rid="ref84">Zou et al., 2017</xref>; <xref ref-type="bibr" rid="ref75">Verbeek et al., 2022</xref>). As an experimental group, we recruited simultaneous bilinguals in Urdu (non-tonal) and Cantonese (tonal language). The simultaneous bilinguals were among the second-generation immigrant population in Hong Kong. They had been simultaneously exposed to Cantonese and Urdu through their bilingual parents since they were born in Hong Kong. They kept learning Cantonese regularly after enrolling in local kindergarten and primary schools.</p>
<p>Moreover, they were secondary-school students in Hong Kong at the time of the experiment. Based on participants&#x2019; ethnic identity and language proficiency, Urdu was defined as their native language, whereas Cantonese was classified as their non-native language in this study. It is challenging to recruit Urdu or Cantonese monolinguals in Hong Kong, a densely multilingual city. Hence, we recruited the following two control groups: native Cantonese teenagers who were dominant in Cantonese and native Urdu teenagers who were late, low-proficiency learners of Cantonese. All groups participated in a multi-condition ABX task, with or without phonetic conflicts.</p>
</sec>
<sec id="sec2">
<title>Literature review</title>
<sec id="sec3">
<title>Perceptual barriers in non-optimal listening conditions and the role of bilinguals&#x2019; SATP</title>
<p>A typical scenario in daily communication is that the speech we perceive is rich in acoustic variation (<xref ref-type="bibr" rid="ref21">Crandell and Smaldino, 2000</xref>). Examples include listening with phonetic conflicts (<xref ref-type="bibr" rid="ref11">Bradlow and Alexander, 2007</xref>), fast speaking speed (<xref ref-type="bibr" rid="ref27">Dupoux et al., 2008</xref>, <xref ref-type="bibr" rid="ref26">2010</xref>), and other non-optimal conditions involving attention switching from one phonetic dimension to another (<xref ref-type="bibr" rid="ref19">Costa et al., 2009</xref>; <xref ref-type="bibr" rid="ref41">Kalashnikova et al., 2021</xref>). Bilinguals may encounter significant barriers in adapting to these non-optimal listening conditions, even though they may be able to cope with a conflict-free and cognitively-less-demanding condition (<xref ref-type="bibr" rid="ref27">Dupoux et al., 2008</xref>, <xref ref-type="bibr" rid="ref26">2010</xref>; <xref ref-type="bibr" rid="ref47">Liu and Ning, 2021</xref>). This is also true for tone learning. Tone perceptual difficulty driven by a non-optimal listening condition has been consistently reported for language beginners (e.g., <xref ref-type="bibr" rid="ref59">Pelzl, 2021</xref>), advanced learners (e.g., <xref ref-type="bibr" rid="ref84">Zou et al., 2017</xref>), and even early and highly proficient bilinguals (e.g., <xref ref-type="bibr" rid="ref47">Liu and Ning, 2021</xref>). Moreover, tonal languages (e.g., Cantonese, Mandarin) use fundamental frequency (F0) as one of the primary cues of tones to convey lexical meanings in a large-scale lexicon (<xref ref-type="bibr" rid="ref76">Whalen and Xu, 1992</xref>; <xref ref-type="bibr" rid="ref33">Hall&#x00E9; et al., 2004</xref>).
Thus, the tone perceptual errors induced by a non-optimal condition on an acoustic-to-phonetic level may further lead to misrepresentation of spoken words, which may thus hinder the mutual understanding in our daily communications (<xref ref-type="bibr" rid="ref49">Marslen-Wilson and Warren, 1994</xref>; <xref ref-type="bibr" rid="ref56">Munro, 2008</xref>).</p>
<p>To adapt to non-optimal speech, bilinguals must develop the ability to select linguistic cues automatically and accurately with the help of a mature SATP system (<xref ref-type="bibr" rid="ref46">Lin and Francis, 2014</xref>; <xref ref-type="bibr" rid="ref84">Zou et al., 2017</xref>). According to the Automatic Selective Perception model (hereafter, ASP model; <xref ref-type="bibr" rid="ref72">Strange and Shafer, 2008</xref>; <xref ref-type="bibr" rid="ref71">Strange, 2011</xref>), bilinguals&#x2019; SATP system may function as a &#x201C;navigator&#x201D; to weight, integrate, and map the chaotic inputs to different phonological categories (<xref ref-type="bibr" rid="ref31">Francis and Nusbaum, 2002</xref>; <xref ref-type="bibr" rid="ref28">Ellis, 2006</xref>). In this process, how listeners &#x201C;select/distribute&#x201D; (i.e., are sensitive to specific linguistic cues) and &#x201C;integrate&#x201D; (i.e., are sensitive to the relation between linguistic cues) linguistic cues is decisive in sharpening their adaptation to a phonetically-conflicting condition (<xref ref-type="bibr" rid="ref63">Repp and Lin, 1990</xref>; <xref ref-type="bibr" rid="ref12">Braun and Johnson, 2011</xref>; <xref ref-type="bibr" rid="ref46">Lin and Francis, 2014</xref>; <xref ref-type="bibr" rid="ref84">Zou et al., 2017</xref>). In other words, bilinguals&#x2019; perceptual barriers in a non-optimal condition may stem from their failure to select and integrate different cues at the acoustic-to-phonetic level.</p>
<p>Moreover, the development of bilinguals&#x2019; SATP system tends to be a language-specific, L1-inhibitory, and experience-driven process. When establishing an L2 phonology system, bilinguals&#x2019; L1-shaped SATP will act as a lens for weighting the perceptual salience of L2 sub-syllabic features (<xref ref-type="bibr" rid="ref28">Ellis, 2006</xref>; <xref ref-type="bibr" rid="ref29">Figueiredo and Da Silva, 2009</xref>; <xref ref-type="bibr" rid="ref75">Verbeek et al., 2022</xref>). With the accumulation of L2 learning experience, bilinguals can gradually shift their attention from the sub-syllabic features relevant to L1 to those appropriate to L2 (<xref ref-type="bibr" rid="ref43">Kohl, 1993</xref>; <xref ref-type="bibr" rid="ref70">Steinhauer et al., 2009</xref>; <xref ref-type="bibr" rid="ref69">Steinhauer, 2014</xref>; <xref ref-type="bibr" rid="ref77">White et al., 2017</xref>). For example, to adapt to speech conflicts, L1 tonal listeners can process incongruent cues accurately and rapidly by selecting and integrating both segmental and tonal dimensions (<xref ref-type="bibr" rid="ref80">Wood, 1974</xref>; <xref ref-type="bibr" rid="ref73">Tong et al., 2008</xref>). In contrast, L1 non-tonal listeners are unlikely to focus on tonal information when perceiving tonal contrasts. Thus, they would not find it difficult to cope with tone-induced conflicts, since tones are not salient for contrasting meaning in their L1 vocabulary (<xref ref-type="bibr" rid="ref12">Braun and Johnson, 2011</xref>; <xref ref-type="bibr" rid="ref84">Zou et al., 2017</xref>). Furthermore, in addition to L1 inhibition and L2 experience, it has been documented that the automaticity of bilinguals&#x2019; SATP system can also be altered by their age of language acquisition.</p>
</sec>
<sec id="sec4">
<title>Effect of early bilingualism on the development of L2-specific SATP</title>
<p>The first year of language exposure is critical for developing bilinguals&#x2019; adaptation to phonetic variation (<xref ref-type="bibr" rid="ref41">Kalashnikova et al., 2021</xref>, see <xref ref-type="bibr" rid="ref35">Hendry et al., 2016</xref> for a review). Children&#x2019;s attention system for a native language is established in the first year of life (<xref ref-type="bibr" rid="ref44">Kuhl et al., 1992</xref>; <xref ref-type="bibr" rid="ref41">Kalashnikova et al., 2021</xref>). Early bilingual children, who simultaneously or sequentially acquire a second language, will refine their bilingual phonology and attention system during childhood (<xref ref-type="bibr" rid="ref34">Hazan and Barrett, 2000</xref>; <xref ref-type="bibr" rid="ref75">Verbeek et al., 2022</xref>). For specific phonological contrasts, the refining may continue past puberty (9&#x2013;11&#x2009;years old) until adulthood (<xref ref-type="bibr" rid="ref29">Figueiredo and Da Silva, 2009</xref>; <xref ref-type="bibr" rid="ref67">Shafer et al., 2011</xref>; <xref ref-type="bibr" rid="ref82">Yan et al., 2019</xref>; <xref ref-type="bibr" rid="ref24">Dollmann et al., 2020</xref>). Within this developmental trajectory, there is increasing interest in whether an early acquisition age can guarantee that bilinguals adapt to a non-optimal condition in L2. To address this issue, many neural and behavioral studies have examined whether an early acquisition age can protect bilinguals from long-term transfer of their L1-specific SATP patterns. These SATP-oriented studies usually adopted cognitively-demanding or phonetically-conflicting perceptual tasks (e.g., a task with phonetic incongruence, a speeded identification task, etc.).
Most of the above research focused on segmental learning (e.g., <xref ref-type="bibr" rid="ref57">Navarra et al., 2005</xref>; <xref ref-type="bibr" rid="ref55">Molnar et al., 2014</xref>; <xref ref-type="bibr" rid="ref36">Hisagi et al., 2015</xref>; <xref ref-type="bibr" rid="ref82">Yan et al., 2019</xref>; <xref ref-type="bibr" rid="ref22">Datta et al., 2020</xref>). Only a few empirical studies examined stress (e.g., <xref ref-type="bibr" rid="ref27">Dupoux et al., 2008</xref>, <xref ref-type="bibr" rid="ref26">2010</xref>) or tone learning (e.g., <xref ref-type="bibr" rid="ref84">Zou et al., 2017</xref>; <xref ref-type="bibr" rid="ref47">Liu and Ning, 2021</xref>).</p>
<p>The views are diverse across the above SATP-oriented studies. On the one hand, it was reported that early bilinguals have a far greater cost in attentional resources than monolinguals when adapting to the chaotic L2 cues that are commonly difficult to perceive by non-native listeners (e.g., acoustically non-distinct contrasts or contrasts that are absent from L1). This attentional difference between early bilinguals and monolinguals has been detected both for simultaneous (e.g., stress learning for adults: <xref ref-type="bibr" rid="ref26">Dupoux et al., 2010</xref>) and early sequential (stress learning for adults: <xref ref-type="bibr" rid="ref27">Dupoux et al., 2008</xref>; vowel learning for adults: <xref ref-type="bibr" rid="ref36">Hisagi et al., 2015</xref>; vowel learning for adults: <xref ref-type="bibr" rid="ref57">Navarra et al., 2005</xref>; vowel learning for infants and bilinguals aged between 3 and 47&#x2009;years old: <xref ref-type="bibr" rid="ref82">Yan et al., 2019</xref>) bilingual groups. The above finding may be primarily because bilinguals&#x2019; attention device has to simultaneously modulate and navigate two phonological systems (<xref ref-type="bibr" rid="ref41">Kalashnikova et al., 2021</xref>), usually with reduced experience in each language (<xref ref-type="bibr" rid="ref13">Byers-Heinlein and Fennell, 2014</xref>). Early bilinguals should be considered a unique population different from monolingual groups (<xref ref-type="bibr" rid="ref58">Pallier et al., 2001</xref>; <xref ref-type="bibr" rid="ref5">Antoniou et al., 2012</xref>). On the other hand, some studies utilize easy-to-detect phonetic contrasts (e.g., acoustically distinct, familiar in L1). 
These studies did not detect a long-term attentional inhibition from L1 for both simultaneous (e.g., vowel learning for adults: <xref ref-type="bibr" rid="ref55">Molnar et al., 2014</xref>) and sequential (vowel learning for adults and children aged between 9 and 11&#x2009;years old: <xref ref-type="bibr" rid="ref22">Datta et al., 2020</xref>; vowel learning for adults: <xref ref-type="bibr" rid="ref55">Molnar et al., 2014</xref>) early bilinguals. The above finding could be explained by the fact that L2 learning experiences can gradually modulate learners&#x2019; L2-specific SATP system (<xref ref-type="bibr" rid="ref43">Kohl, 1993</xref>; <xref ref-type="bibr" rid="ref70">Steinhauer et al., 2009</xref>; <xref ref-type="bibr" rid="ref69">Steinhauer, 2014</xref>; <xref ref-type="bibr" rid="ref77">White et al., 2017</xref>).</p>
<p>In addition, somewhat differently from the above results in the segment- and stress-focused studies, attention-selecting barriers in non-optimal conditions have been detected at the tone level, even for easy-to-acquire tone contrasts. The acquisition age effect has been examined by tone-specific SATP research for both early sequential bilinguals (e.g., tone learning for middle school-aged children: <xref ref-type="bibr" rid="ref47">Liu and Ning, 2021</xref>) and late bilinguals (e.g., tone learning for adults: <xref ref-type="bibr" rid="ref84">Zou et al., 2017</xref>). For example, building on <xref ref-type="bibr" rid="ref84">Zou et al.&#x2019;s (2017)</xref> research design, <xref ref-type="bibr" rid="ref47">Liu and Ning (2021)</xref> investigated how early sequential Urdu-Cantonese bilingual teenagers redistributed their attention when selecting segments and tones in phonetically-conflicting conditions. The results revealed that even early sequential bilinguals fluent in both languages might be confused by the phonetic conflicts. The sequential bilinguals failed to distribute their attention to tones as automatically as tonal native listeners did. Moreover, <xref ref-type="bibr" rid="ref47">Liu and Ning (2021)</xref> also found that language dominance could positively modulate bilinguals&#x2019; tone-specific SATP.</p>
<p>As an extension of <xref ref-type="bibr" rid="ref47">Liu and Ning (2021)</xref>, the current study investigated the tone-specific SATP developed by simultaneous bilingual teenagers. The participants in the current study were recruited from among ethnic Pakistani immigrants in Hong Kong secondary schools.</p>
</sec>
<sec id="sec5">
<title>Cantonese lexical tones and early Urdu-Cantonese bilinguals in Hong Kong</title>
<p>Cantonese, the most widely spoken language in Hong Kong (<xref ref-type="bibr" rid="ref15">Census and Statistics Department of HKSAR, 2021</xref>), was chosen as the target tonal language for this study. There are six lexical tones in Cantonese (checked tones are not included): T1 (high-level, pitch value: 55), T2 (high-rising, 25), T3 (mid-level, 33), T4 (low-falling, 21), T5 (low-rising, 23), and T6 (low-level, 22; e.g., <xref ref-type="bibr" rid="ref54">Mok et al., 2013</xref>). For example, /fu/ means &#x201C;man&#x201D; in T1, &#x201C;caress&#x201D; in T2, &#x201C;trousers&#x201D; in T3, &#x201C;support&#x201D; in T4, &#x201C;woman&#x201D; in T5, and &#x201C;attach&#x201D; in T6 (see <xref ref-type="bibr" rid="ref78">Wong and Arai, 2020</xref>). Cantonese uses F0 as one of the primary cues to distinguish lexical tone contrasts (<xref ref-type="bibr" rid="ref76">Whalen and Xu, 1992</xref>; <xref ref-type="bibr" rid="ref33">Hall&#x00E9; et al., 2004</xref>). Among Cantonese tones, T2 and T4 are acoustically distinct, differing in average F0, F0 onset, F0 endpoint, and pitch direction, with a sharp linguistic boundary between the two categories (<xref ref-type="bibr" rid="ref30">Francis et al., 2003</xref>; <xref ref-type="bibr" rid="ref60">Qin and Mok, 2011</xref>; <xref ref-type="bibr" rid="ref53">Mok et al., 2018</xref>). Urdu, a non-tonal member of the Indo-Aryan family, is Pakistan&#x2019;s official language (<xref ref-type="bibr" rid="ref2">Akkharasena, 2018</xref>). Urdu has lexical stress and sentential/phrasal intonation but no lexical tones (<xref ref-type="bibr" rid="ref1">Abbasi et al., 2017</xref>). A rising contour (LH) is prevalent among speakers as a phrase boundary in Urdu (<xref ref-type="bibr" rid="ref38">Jabeen, 2019</xref>). In addition, a downward pitch contour (L H L-L%) is predominantly used in Urdu declarative sentences (<xref ref-type="bibr" rid="ref39">Jabeen and Hussain, 2012</xref>).</p>
<p>Moreover, given that listeners&#x2019; SATP may interact with the acquisition difficulty in phonetic learning (<xref ref-type="bibr" rid="ref32">Garc&#x00ED;a and Froud, 2018</xref>; <xref ref-type="bibr" rid="ref75">Verbeek et al., 2022</xref>), it is also necessary to clarify how Urdu listeners are influenced by language typology when processing Cantonese tones. For example, when perceiving Mandarin tones, L1 non-tonal (English) listeners tend to categorize the mid-rising tone and the high-falling tone in Mandarin into the English intonation categories of &#x201C;question&#x201D; and &#x201C;statement&#x201D; (<xref ref-type="bibr" rid="ref68">So and Best, 2011</xref>). By analogy, Urdu listeners may categorize Cantonese T2 as Urdu question intonation because of their similar rising pitch contours. Meanwhile, T4 may be assimilated to the downward declarative intonation in Urdu. Therefore, a positive transfer from L1 prosodic typology is predicted for Urdu listeners, implying relatively easy perception of the T2&#x2013;T4 contrast.</p>
<p>Hong Kong is a cosmopolitan city in East Asia with a large immigrant population. According to population reports compiled by the Census and Statistics Department of HKSAR in 2021, about 8% of residents are non-Chinese speakers, with over 90% being non-tonal L1 speakers. Pakistanis are one of the largest groups of Hindi-Urdu (non-tonal) speakers in Hong Kong, accounting for 20% of South-Asian secondary school students (<xref ref-type="bibr" rid="ref14">Census and Statistics Department of HKSAR, 2017</xref>). Meanwhile, 68% of Pakistani students were second-generation immigrants born in Hong Kong, according to a social survey based on a large population of ethnic minority students in Hong Kong (<xref ref-type="bibr" rid="ref18">Cheung and Chou, 2018</xref>). Additionally, ethnic minority students may have difficulties perceiving and producing Cantonese tones even under optimal conditions that require little attentional effort (<xref ref-type="bibr" rid="ref79">Wong and Leung, 2018</xref>; <xref ref-type="bibr" rid="ref83">Yao et al., 2020</xref>). In brief, it is of great practical value to examine tone perception learning for second-generation immigrants in Hong Kong who are native speakers of a non-tonal language (e.g., Urdu).</p>
<p>In sum, SATP plays a vital role in helping bilinguals deal with L2 perceptual barriers in non-optimal listening conditions (e.g., <xref ref-type="bibr" rid="ref72">Strange and Shafer, 2008</xref>; <xref ref-type="bibr" rid="ref71">Strange, 2011</xref>). In research on the early bilingualism effect on SATP development, a large number of studies have focused on segmental and stress learning (<xref ref-type="bibr" rid="ref57">Navarra et al., 2005</xref>; <xref ref-type="bibr" rid="ref27">Dupoux et al., 2008</xref>, <xref ref-type="bibr" rid="ref26">2010</xref>; <xref ref-type="bibr" rid="ref55">Molnar et al., 2014</xref>; <xref ref-type="bibr" rid="ref36">Hisagi et al., 2015</xref>; <xref ref-type="bibr" rid="ref82">Yan et al., 2019</xref>; <xref ref-type="bibr" rid="ref22">Datta et al., 2020</xref>). In contrast, much less attention has been paid to non-optimal tone perception in early bilinguals (e.g., <xref ref-type="bibr" rid="ref47">Liu and Ning, 2021</xref>). Moreover, there is a research gap at the tonal level regarding how simultaneous bilinguals adapt to non-optimal conditions by developing sensitivity and integrability toward L2 tones. It is noteworthy that, although many segmental and stress-focused studies have examined the early bilingualism effect on SATP, it is still necessary to enrich this field from the perspective of tone learning. This is because, as previously mentioned, early bilinguals might encounter attentional barriers even when perceiving easy-to-detect tones under a phonetically-conflicting condition (<xref ref-type="bibr" rid="ref84">Zou et al., 2017</xref>; <xref ref-type="bibr" rid="ref47">Liu and Ning, 2021</xref>). This result in the tone field thus differs considerably from that in segmental and stress learning (e.g., <xref ref-type="bibr" rid="ref26">Dupoux et al., 2010</xref>; <xref ref-type="bibr" rid="ref22">Datta et al., 2020</xref>).
In light of this, the current study addressed the early bilingualism effect on SATP by investigating how simultaneous learning influences SATP in tone perceptual learning. This study extends <xref ref-type="bibr" rid="ref47">Liu and Ning (2021)</xref>, which investigated tone-specific SATP in early sequential bilinguals. This study could also enrich the line of studies (<xref ref-type="bibr" rid="ref57">Navarra et al., 2005</xref>; <xref ref-type="bibr" rid="ref55">Molnar et al., 2014</xref>; <xref ref-type="bibr" rid="ref36">Hisagi et al., 2015</xref>; <xref ref-type="bibr" rid="ref82">Yan et al., 2019</xref>; <xref ref-type="bibr" rid="ref22">Datta et al., 2020</xref>) by providing new evidence from tone learning in simultaneous bilinguals.</p>
</sec>
</sec>
<sec id="sec6">
<title>Current study</title>
<p>This exploratory study investigated whether simultaneous learning experience contributes to the SATP system of simultaneous Urdu-Cantonese bilinguals when they adapt to phonetically-conflicting conditions. For this purpose, listeners&#x2019; attention performance in tone perception was examined with respect to the two critical components of SATP development introduced previously, namely, attention integration and distribution of segments and tones (<xref ref-type="bibr" rid="ref63">Repp and Lin, 1990</xref>; <xref ref-type="bibr" rid="ref12">Braun and Johnson, 2011</xref>; <xref ref-type="bibr" rid="ref46">Lin and Francis, 2014</xref>; <xref ref-type="bibr" rid="ref84">Zou et al., 2017</xref>). The study addressed two questions:</p>
<p>1. Can the simultaneous bilinguals successfully adapt to segmentally or tonally induced speech conflicts by integrally processing lexical tones and segments in Cantonese?</p>
<p>2. Can the simultaneous bilinguals redistribute selective attention to lexical tones as automatically as Cantonese native listeners?</p>
<p>Cantonese native speakers (hereafter, CN), Urdu-speaking Cantonese late learners (hereafter, LL), as well as simultaneous bilinguals in Urdu (native) and Cantonese (non-native; hereafter, SB) were employed as participants. The ABX test revised from the design of <xref ref-type="bibr" rid="ref84">Zou et al. (2017)</xref> was adopted including four conditions: segment-and-tone (conflict-free), forced-segment (tonally-conflicting), forced-tone (segmentally-conflicting), and segment-or-tone (forced-selecting) conditions. In line with the first research question, comparing the segmentally-conflicting and tonally-conflicting conditions demonstrated how listeners integrate their selective attention on segments and tones. Moreover, the segment-or-tone condition results highlighted how listeners redistribute segmental and tonal information.</p>
<p>Several predictions were made based on the four-condition design. Firstly, as a check of experimental effectiveness, it was predicted that the CN and SB groups would respond fastest in the segment-and-tone condition, since listeners can rely on either tones or segments as they wish. Secondly, the LL listeners might perform poorly compared to the other two groups in the forced-tone condition, where segmental incongruence may hinder them in identifying speech. It is worth noting that although the LL listeners may assimilate Cantonese T2 and T4 into Urdu prosodic typology, this does not mean that they allocate their attention more frequently to tonal dimensions than to segments. Moreover, low Cantonese proficiency might also prevent the LL listeners from achieving satisfactory performance in the segmentally-conflicting condition. Because their L1 is non-tonal, the SATP of the LL group is shaped in a segment-dependent way (<xref ref-type="bibr" rid="ref12">Braun and Johnson, 2011</xref>; <xref ref-type="bibr" rid="ref84">Zou et al., 2017</xref>). Thirdly, the performance of simultaneous bilinguals is not easy to predict precisely. According to <xref ref-type="bibr" rid="ref47">Liu and Ning (2021)</xref>, the early Urdu-Cantonese sequential bilinguals showed a weak performance in the attention distribution task and thus diverged clearly from the tonal native listeners in that study. However, compared with the early sequential bilinguals in <xref ref-type="bibr" rid="ref47">Liu and Ning (2021)</xref>, the SATP system of the SB listeners had to adapt to two language systems from the first year of exposure, which is vital to developing attentional flexibility and adaptation (e.g., <xref ref-type="bibr" rid="ref41">Kalashnikova et al., 2021</xref>). 
Hence, it is worth exploring whether language development during the initial years can assist simultaneous bilinguals in acquiring the automaticity of a tone-specific SATP.</p>
<p>The contributions of this study are three-fold: (1) it contributes to the ASP model by filling the research gap in tone-specific SATP for simultaneous bilinguals; (2) it contributes to the understanding of bilinguals&#x2019; tone learning under non-optimal conditions, since investigating tone-specific SATP helps researchers better understand how bilinguals deal with perceptual barriers when processing tones in a non-optimal listening environment; (3) it contributes to tone learning in the multi-immigrant context of Hong Kong. Since second-generation immigrants (early bilinguals) make up a large population in Hong Kong, the findings may benefit tone learning for the large group of ethnic minority students in Hong Kong secondary schools.</p>
</sec>
<sec id="sec7" sec-type="materials|methods">
<title>Materials and methods</title>
<sec id="sec8">
<title>Participants</title>
<p>A total of 26 Urdu-Cantonese simultaneous bilingual speakers (13 female, 13 male), 27 native Cantonese speakers (14 female, 13 male), and 26 Urdu (L1)-speaking late learners of Cantonese (14 female, 12 male) were selected as participants. The SB (mean age&#x2009;=&#x2009;11.3&#x2009;years, SD&#x2009;=&#x2009;1.4) and LL (mean age&#x2009;=&#x2009;10.8&#x2009;years, SD&#x2009;=&#x2009;1.3) participants were Pakistani year-one students in secondary schools in Hong Kong where 50&#x2013;80% of the students are non-Chinese speakers. The native Cantonese speakers (mean age&#x2009;=&#x2009;10.7&#x2009;years, SD&#x2009;=&#x2009;1.3) were secondary school students in Hong Kong. Due to the Bi-literacy and Trilingualism Language Policy in Hong Kong, all the Cantonese native speakers had learned English and Mandarin, but their dominant language was Cantonese. Middle-school-aged students were recruited because (1) we needed to compare the results with the performance of the middle-school students in <xref ref-type="bibr" rid="ref47">Liu and Ning (2021)</xref>; (2) late childhood (9&#x2013;11&#x2009;years of age) is a critical period for refining the SATP system in distinguishing phonetic contrasts (<xref ref-type="bibr" rid="ref29">Figueiredo and Da Silva, 2009</xref>; <xref ref-type="bibr" rid="ref24">Dollmann et al., 2020</xref>); (3) year one is also known as a typical &#x201C;transition period&#x201D; from primary to secondary school. At this age, students have an enhanced need to adapt to new and complex speech variation when interacting with unfamiliar teachers and peer groups (<xref ref-type="bibr" rid="ref64">Saiegh-Haddad and Geva, 2008</xref>; <xref ref-type="bibr" rid="ref20">Courtney, 2014</xref>). 
Hence, it is of practical value for language education to investigate how middle-school students, who combine experience of simultaneous learning with new transition needs, develop their SATP system to adapt to non-optimal perceptual environments.</p>
<p>The recommended students and one of their parents received a small gift as compensation after completing the online student and parental questionnaires, in either a Cantonese or an English version, <italic>via</italic> Google Forms. The student questionnaires were designed based on the Bilingual Language Profile (BLP, <xref ref-type="bibr" rid="ref9">Birdsong et al., 2012</xref>), a widely used tool for assessing speakers&#x2019; bilingualism (e.g., <xref ref-type="bibr" rid="ref4">Amengual, 2016</xref>; <xref ref-type="bibr" rid="ref62">Renwick and Nadeu, 2019</xref>; <xref ref-type="bibr" rid="ref3">Aldrich, 2020</xref>). The degree of bilingualism was examined along four modules separately for L1 and L2: &#x201C;language history,&#x201D; &#x201C;language use,&#x201D; &#x201C;language proficiency,&#x201D; and &#x201C;language attitudes.&#x201D; Similarly, the parental questionnaires consisted of four modules: &#x201C;language history for parents,&#x201D; &#x201C;language education history for children,&#x201D; &#x201C;language proficiency for parents,&#x201D; and &#x201C;Urdu proficiency for children.&#x201D; In addition, the students&#x2019; Chinese teachers were invited to assess their students&#x2019; proficiency in Cantonese. Teachers, students, and parents rated each module except &#x201C;language history&#x201D; on a seven-point Likert scale, with &#x201C;1&#x201D; representing &#x201C;the lowest level&#x201D; and &#x201C;7&#x201D; representing &#x201C;the highest level.&#x201D; The Pakistani students&#x2019; language proficiency was computed by averaging the Likert scores given by teachers or parents and by the students themselves.</p>
<p>According to the parental reports, all parents (N&#x2009;=&#x2009;102) were non-Chinese residents, with 97.1% of fathers and mothers being Urdu native speakers. All parents agreed that Hindi-Urdu is their children&#x2019;s native language. Further, we classified the Pakistani students into the LL and SB groups based on their language history, Cantonese proficiency, and frequency of Cantonese use. Most LL participants in the control group were first-generation immigrants to Hong Kong, born in Pakistan. The LL participants&#x2019; parents rarely talked with their children in Cantonese at home, and the LL students had little exposure to Cantonese until they were about 7.15&#x2009;years old (<italic>SD</italic>&#x2009;=&#x2009;0.92). For language use, the LL students consistently used Urdu across social spheres (e.g., at home, at school, and outside school), with an average use frequency of 6.03 (<italic>SD</italic>&#x2009;=&#x2009;0.36) for Urdu, which was much higher than that for Cantonese (mean&#x2009;=&#x2009;1.42, <italic>SD</italic>&#x2009;=&#x2009;0.42; <italic>t</italic>-test of language use between Cantonese and Urdu: <italic>t</italic>(25)&#x2009;=&#x2009;32.58, <italic>p</italic>&#x2009;&#x003C;&#x2009;0.001). For &#x201C;language proficiency,&#x201D; the LL students had a fairly low proficiency score in Cantonese (mean&#x2009;=&#x2009;2.22, <italic>SD</italic>&#x2009;=&#x2009;0.26) across speaking, reading, writing, and listening, which was far lower than that in Urdu (mean&#x2009;=&#x2009;5.67, <italic>SD</italic>&#x2009;=&#x2009;0.42; <italic>t</italic>-test of language proficiency between Cantonese and Urdu: <italic>t</italic>(25)&#x2009;=&#x2009;16.71, <italic>p</italic>&#x2009;&#x003C;&#x2009;0.001). 
Compared with the SB group, the LL students had significantly lower scores in Cantonese proficiency (<italic>t</italic>(25)&#x2009;=&#x2009;28.68, <italic>p</italic>&#x2009;&#x003C;&#x2009;0.001) and Cantonese use frequency (<italic>t</italic>(25)&#x2009;=&#x2009;17.42, <italic>p</italic>&#x2009;&#x003C;&#x2009;0.001).</p>
<p>The SB students were all second-generation immigrants born in Hong Kong. Moreover, at least one parent of each SB student was proficient in Cantonese, with a parental rating of 6.1 (<italic>SD</italic>&#x2009;=&#x2009;0.8) for Cantonese and 6.3 (<italic>SD</italic>&#x2009;=&#x2009;1.1) for Urdu. The parents had used both Urdu and Cantonese for communication since their children&#x2019;s infancy. A paired <italic>t</italic>-test suggested no statistical difference in the age of onset of exposure (in years) to Urdu (mean&#x2009;=&#x2009;0.21, <italic>SD</italic>&#x2009;=&#x2009;0.08) and Cantonese (mean&#x2009;=&#x2009;0.23, <italic>SD</italic>&#x2009;=&#x2009;0.07) for the SB students (<xref rid="tab1" ref-type="table">Table 1</xref>, Q i-1). However, before the child was 3&#x2009;years old, the parents tended to spend more time using Urdu to communicate with their children than using Cantonese (<xref rid="tab1" ref-type="table">Table 1</xref>, Q i-2). According to the students&#x2019; reports, all the bilingual students had attended local kindergartens and primary schools before entering secondary schools in Hong Kong, where the language of instruction was either Cantonese or English. At the time of the experiment, all the students had been continuously learning Cantonese for 7.1&#x2009;years (<italic>SD</italic>&#x2009;=&#x2009;1.39) through regular classes. The length of formal education in Urdu (mean&#x2009;=&#x2009;2.19, <italic>SD</italic>&#x2009;=&#x2009;2.78) varied widely among the SB students and was overall much shorter than that in Cantonese (<xref rid="tab1" ref-type="table">Table 1</xref>, Q i-3). The SB students felt comfortable using Urdu at the age of 4 (<italic>SD</italic>&#x2009;=&#x2009;0.97), which was much earlier than for Cantonese (mean&#x2009;=&#x2009;9.19, <italic>SD</italic>&#x2009;=&#x2009;3.35; <xref rid="tab1" ref-type="table">Table 1</xref>, Q i-4). 
For &#x201C;language use,&#x201D; the SB students tended to use Urdu more often in the family and Cantonese more often with friends (<xref rid="tab1" ref-type="table">Table 1</xref>, Q ii-1, Q ii-2, Q ii-3). For &#x201C;language proficiency,&#x201D; the results indicated equal proficiency in Urdu and Cantonese for reading, speaking, and listening comprehension, but lower proficiency in Cantonese than in Urdu for writing (<xref rid="tab1" ref-type="table">Table 1</xref>, Q iii-1, Q iii-2, Q iii-3, Q iii-4). In addition, the bilingual students reported more positive attitudes toward using Cantonese than Urdu (<xref rid="tab1" ref-type="table">Table 1</xref>, Q iv-1, Q iv-2).</p>
<table-wrap position="float" id="tab1">
<label>Table 1</label>
<caption><p>Main results of parental and students&#x2019; questionnaires (revised from <xref ref-type="bibr" rid="ref9">Birdsong et al., 2012</xref>).</p></caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left" valign="top" colspan="2" rowspan="2">Question module</th>
<th align="center" valign="top" colspan="2">Mean value (standard deviations)</th>
<th align="center" valign="top" colspan="2"><italic>T</italic>-test (2-tailed; <italic>df</italic>&#x2009;=&#x2009;25) Urdu&#x2013;Cantonese</th>
</tr>
<tr>
<th align="center" valign="top">Urdu</th>
<th align="center" valign="top">Cantonese</th>
<th align="center" valign="top"><italic>t</italic></th>
<th align="center" valign="top"><italic>p</italic></th>
</tr>
</thead>
<tbody>
<tr>
<td align="char" valign="top" char="&#x00B1;" colspan="6"><bold>LANGUAGE HISTORY (in years)</bold></td>
</tr>
<tr>
<td align="char" valign="top" char="&#x00B1;" colspan="6"><bold>Parental</bold></td>
</tr>
<tr>
<td align="char" valign="top" char=".">Q i-1</td>
<td align="char" valign="top" char="&#x00B1;">What age did your child start to expose to the language?</td>
<td align="char" valign="top" char=".">0.21 (0.08)</td>
<td align="char" valign="top" char=".">0.23 (0.07)</td>
<td align="char" valign="top" char=".">&#x2212;1.54</td>
<td align="char" valign="top" char=".">0.148</td>
</tr>
<tr>
<td align="char" valign="top" char=".">Q i-2</td>
<td align="char" valign="top" char="&#x00B1;">How long did your child live in a family where this language is spoken before he/she was 3 years old?</td>
<td align="char" valign="top" char=".">2.75 (0.41)</td>
<td align="char" valign="top" char=".">2.25 (0.69)</td>
<td align="char" valign="top" char=".">4.372</td>
<td align="char" valign="top" char=".">&#x003C;0.001&#x002A;&#x002A;&#x002A;</td>
</tr>
<tr>
<td align="char" valign="top" char="." colspan="6"><bold>Students</bold></td>
</tr>
<tr>
<td align="char" valign="top" char=".">Q i-3</td>
<td align="char" valign="top" char="&#x00B1;">How long have you received a classroom education in this language?</td>
<td align="char" valign="top" char=".">2.19 (2.78)</td>
<td align="char" valign="top" char=".">7.11 (1.39)</td>
<td align="char" valign="top" char=".">&#x2212;9.109</td>
<td align="char" valign="top" char=".">&#x003C;0.001&#x002A;&#x002A;&#x002A;</td>
</tr>
<tr>
<td align="char" valign="top" char=".">Q i-4</td>
<td align="char" valign="top" char="&#x00B1;">What age did you start to feel comfortable using this language?</td>
<td align="char" valign="top" char=".">4.00 (0.97)</td>
<td align="char" valign="top" char=".">9.19 (3.35)</td>
<td align="char" valign="top" char=".">&#x2212;7.446</td>
<td align="char" valign="top" char=".">&#x003C;0.001&#x002A;&#x002A;&#x002A;</td>
</tr>
<tr>
<td align="char" valign="top" char="." colspan="6"><bold>LANGUAGE USE (7-Likert scale)</bold></td>
</tr>
<tr>
<td align="char" valign="top" char=".">Q ii-1</td>
<td align="char" valign="top" char="&#x00B1;">In an average week, how frequently do you use this at home?</td>
<td align="char" valign="top" char=".">6.34 (1.06)</td>
<td align="char" valign="top" char=".">3.58 (1.10)</td>
<td align="char" valign="top" char=".">9.733</td>
<td align="char" valign="top" char=".">&#x003C;0.001&#x002A;&#x002A;&#x002A;</td>
</tr>
<tr>
<td align="char" valign="top" char=".">Q ii-2</td>
<td align="char" valign="top" char="&#x00B1;">In an average week, how frequently do you use this with friends?</td>
<td align="char" valign="top" char=".">2.46 (1.21)</td>
<td align="char" valign="top" char=".">4.81 (0.80)</td>
<td align="char" valign="top" char=".">&#x2212;8.149</td>
<td align="char" valign="top" char=".">&#x003C;0.001&#x002A;&#x002A;&#x002A;</td>
</tr>
<tr>
<td align="char" valign="top" char=".">Q ii-3</td>
<td align="char" valign="top" char="&#x00B1;">In an average week, how frequently do you use this language with teachers and classmates at school?</td>
<td align="char" valign="top" char=".">4.31 (2.05)</td>
<td align="char" valign="top" char=".">6.04 (0.72)</td>
<td align="char" valign="top" char=".">&#x2212;4.524</td>
<td align="char" valign="top" char=".">&#x003C;0.001&#x002A;&#x002A;&#x002A;</td>
</tr>
<tr>
<td align="char" valign="top" char="." colspan="6"><bold>LANGUAGE PROFICIENCY (7-Likert scale)</bold></td>
</tr>
<tr>
<td align="char" valign="top" char=".">Q iii-1</td>
<td align="char" valign="top" char="&#x00B1;">How well do you speak it?</td>
<td align="char" valign="top" char=".">6.41 (0.58)</td>
<td align="char" valign="top" char=".">5.88 (0.82)</td>
<td align="char" valign="top" char=".">1.629</td>
<td align="char" valign="top" char=".">0.146</td>
</tr>
<tr>
<td align="char" valign="top" char=".">Q iii-2</td>
<td align="char" valign="top" char="&#x00B1;">How well do you understand it?</td>
<td align="char" valign="top" char=".">6.44 (0.48)</td>
<td align="char" valign="top" char=".">6.27 (0.81)</td>
<td align="char" valign="top" char=".">1.159</td>
<td align="char" valign="top" char=".">0.257</td>
</tr>
<tr>
<td align="char" valign="top" char=".">Q iii-3</td>
<td align="char" valign="top" char="&#x00B1;">How well do you read it?</td>
<td align="char" valign="top" char=".">6.15 (0.55)</td>
<td align="char" valign="top" char=".">5.83 (0.83)</td>
<td align="char" valign="top" char=".">1.722</td>
<td align="char" valign="top" char=".">0.097</td>
</tr>
<tr>
<td align="char" valign="top" char=".">Q iii-4</td>
<td align="char" valign="top" char="&#x00B1;">How well do you write it?</td>
<td align="char" valign="top" char=".">5.55 (0.49)</td>
<td align="char" valign="top" char=".">4.72 (0.61)</td>
<td align="char" valign="top" char=".">5.598</td>
<td align="char" valign="top" char=".">&#x003C;0.001&#x002A;&#x002A;&#x002A;</td>
</tr>
<tr>
<td align="char" valign="top" char="." colspan="6"><bold>LANGUAGE ATTITUDES (7-Likert scale)</bold></td>
</tr>
<tr>
<td align="char" valign="top" char=".">Q iv-1</td>
<td align="char" valign="top" char="&#x00B1;">I identify with this culture</td>
<td align="char" valign="top" char=".">5.58 (1.10)</td>
<td align="char" valign="top" char=".">6.50 (0.81)</td>
<td align="char" valign="top" char=".">3.011</td>
<td align="char" valign="top" char=".">0.006&#x002A;&#x002A;</td>
</tr>
<tr>
<td align="char" valign="top" char=".">Q iv-2</td>
<td align="char" valign="top" char="&#x00B1;">I want others to think I am a native speaker of it.</td>
<td align="char" valign="top" char=".">5.65 (0.69)</td>
<td align="char" valign="top" char=".">6.92 (0.27)</td>
<td align="char" valign="top" char=".">&#x2212;8.935</td>
<td align="char" valign="top" char=".">&#x003C;0.001&#x002A;&#x002A;&#x002A;</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>For &#x201C;language proficiency,&#x201D; &#x201C;language use,&#x201D; and &#x201C;language attitude,&#x201D; participants rated on a 7-point Likert scale, with 1 representing the lowest level and 7 representing the highest level. The results of paired <italic>t</italic>-tests (two-tailed) between Urdu and Cantonese are shown in the right-hand columns. Significance codes: &#x002A;&#x002A;&#x002A;<italic>p</italic>&#x2009;&#x003C;&#x2009;0.001; &#x002A;&#x002A;<italic>p</italic>&#x2009;&#x003C;&#x2009;0.01.</p>
</table-wrap-foot>
</table-wrap>
<p>It is also noteworthy that all the SB students had a long L2 learning experience in English (non-tonal) of about 7.31&#x2009;years (<italic>SD</italic>&#x2009;=&#x2009;1.41), which was comparable to the length of their classroom education in Cantonese (mean&#x2009;=&#x2009;7.12&#x2009;years, <italic>SD</italic>&#x2009;=&#x2009;1.39; <italic>t</italic>-test of education length between English and Cantonese: <italic>t</italic>(25)&#x2009;=&#x2009;1.04, <italic>p</italic>&#x2009;=&#x2009;0.306). Moreover, the bilingual students communicated substantially in English, with a much higher self-rating for English use (mean&#x2009;=&#x2009;6.23, <italic>SD</italic>&#x2009;=&#x2009;0.59) than for Cantonese use (mean&#x2009;=&#x2009;4.81, <italic>SD</italic>&#x2009;=&#x2009;0.44) averaged across social spheres (<italic>t</italic>-test of language use between English and Cantonese: <italic>t</italic>(25)&#x2009;=&#x2009;10.11, <italic>p</italic>&#x2009;&#x003C;&#x2009;0.001). Thus, the bilingual students frequently used English in daily communication, but they did not speak any tonal language other than Cantonese (this point is discussed in the final part of the paper).</p>
<p>To further examine the dominance degree of Cantonese and Urdu for each bilingual participant, we calculated dominance scores separately for Urdu and Cantonese based on <xref ref-type="bibr" rid="ref9">Birdsong et al.&#x2019;s (2012)</xref> method by equally weighting different modules of parental and students&#x2019; questionnaires. According to <xref ref-type="bibr" rid="ref9">Birdsong et al. (2012)</xref>, the higher the dominance scores for a specific language, the more likely the bilingual students are to be dominant in that language. The comparison between Urdu and Cantonese in dominance continua shows whether the bilinguals show bias or balance between Urdu and Cantonese. According to <xref rid="fig1" ref-type="fig">Figure 1</xref> and the paired <italic>t</italic>-test results displayed in the figure, the bilinguals showed a balance between Urdu and Cantonese in the modules of &#x201C;language history&#x201D; and &#x201C;language use&#x201D; since there was no statistical difference between scores of the two languages. In addition, the bilingual students showed a Cantonese bias in the module on &#x201C;language attitude.&#x201D; At the same time, they were more Urdu-dominant in &#x201C;language proficiency.&#x201D; By equally weighting the estimations across four modules, the total dominance scores showed a balance for the bilingual students using Urdu and Cantonese.</p>
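<p>As a rough illustration of the equal-weighting idea described above, the following Python sketch rescales each module to a common range of 0 to 1 before averaging, so that no single module dominates the composite. The function name, the module maxima, and the example values are hypothetical and are not taken from the questionnaire data; the published BLP scoring procedure (<xref ref-type="bibr" rid="ref9">Birdsong et al., 2012</xref>) may differ in its details.</p>
<preformat>
```python
def dominance_score(module_scores, module_max):
    """Equal-weight composite: rescale each module to 0..1, then average."""
    shares = [module_scores[name] / module_max[name] for name in module_max]
    return sum(shares) / len(shares)


# Hypothetical module maxima and one hypothetical participant (illustration only).
MODULE_MAX = {"history": 20, "use": 7, "proficiency": 7, "attitudes": 7}
urdu = {"history": 10, "use": 3.5, "proficiency": 7, "attitudes": 3.5}

print(dominance_score(urdu, MODULE_MAX))  # prints 0.625
```
</preformat>
<p>Computing the same composite for each language and comparing the two values would indicate, under these assumptions, whether a participant leans toward one language or is balanced.</p>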
<fig position="float" id="fig1">
<label>Figure 1</label>
<caption><p>The dominance scores based on the BLP calculation method (<xref ref-type="bibr" rid="ref9">Birdsong et al., 2012</xref>). <italic>T</italic>-test results between Cantonese and Urdu are shown on the chart.</p></caption>
<graphic xlink:href="fpsyg-13-918737-g001.tif"/>
</fig>
</sec>
<sec id="sec9">
<title>Stimuli</title>
<p>T2 and T4 in Cantonese were selected as the target tonal contrast because they have acoustically distinct pitch directions (<xref ref-type="bibr" rid="ref54">Mok et al., 2013</xref>; <xref ref-type="bibr" rid="ref17">Chen et al., 2017</xref>). As previously predicted, T2 and T4 are easily discerned by Urdu speakers in a conflict-free listening condition (see details in the &#x201C;Cantonese Lexical Tones and Early Urdu-Cantonese Bilinguals in Hong Kong&#x201D; section). Thus, adopting T2&#x2013;T4 could reduce interference from Urdu prosodic typology and allow us to concentrate on bilinguals&#x2019; attentional performance in segment and tone processing. As shown in <xref rid="tab2" ref-type="table">Table 2</xref>, to avoid lexical interference, two pairs of CVCV disyllabic non-words in Cantonese and Urdu, /kasu/&#x2212;/tafu/ and /biso/&#x2212;/diso/, were selected, with the target tones on the initial syllables. Disyllables were used because placing the target on the first syllable can reduce the phonological influence of Urdu sentence-final intonation on bilingual speakers (<xref ref-type="bibr" rid="ref84">Zou et al., 2017</xref>). The first syllable carried either Cantonese T2 (high-rising) or T4 (low-falling). The second syllable of each disyllabic non-word was fixed as the Cantonese high-level tone (T1), the most stable tone in Cantonese, which can facilitate the discrimination of adjacent tones (<xref ref-type="bibr" rid="ref60">Qin and Mok, 2011</xref>). The vowels were [a], [i], [u], [o], and the consonants included [k], [t], [th], [p], [s], [f]. In each non-word pair, the vowels remained unchanged; only the consonant changed in place or manner of articulation. 
Three native Cantonese speakers (two female and one male) were invited to record the disyllables with CoolEdit 2.0 on a Lenovo ThinkCentre desktop computer (i5 core, USB interface: 3.0) with Boom microphone in the audio booth at Hong Kong Polytechnic University.</p>
<table-wrap position="float" id="tab2">
<label>Table 2</label>
<caption><p>Arrangement of stimuli in ABX tasks.</p></caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left" valign="top">Task</th>
<th align="left" valign="top">A</th>
<th align="left" valign="top">B</th>
<th align="left" valign="top">X</th>
</tr>
</thead>
<tbody>
<tr>
<td align="char" valign="top" char="." rowspan="4">Forced-segment</td>
<td align="char" valign="top" char="&#x00B1;">ka2su1</td>
<td align="char" valign="top" char="&#x00B1;">ta2fu1</td>
<td align="char" valign="top" char="&#x00B1;">ka4su1/ta4fu1</td>
</tr>
<tr>
<td align="char" valign="top" char="&#x00B1;">ka4su1</td>
<td align="char" valign="top" char="&#x00B1;">ta4fu1</td>
<td align="char" valign="top" char="&#x00B1;">ka2su1/ta2fu1</td>
</tr>
<tr>
<td align="char" valign="top" char="&#x00B1;">bi4so1</td>
<td align="char" valign="top" char="&#x00B1;">di4fo1</td>
<td align="char" valign="top" char="&#x00B1;">bi2so1/di2fo1</td>
</tr>
<tr>
<td align="char" valign="top" char="&#x00B1;">bi2so1</td>
<td align="char" valign="top" char="&#x00B1;">di2fo1</td>
<td align="char" valign="top" char="&#x00B1;">bi4so1/di4fo1</td>
</tr>
<tr>
<td align="char" valign="top" char="." rowspan="4">Forced-tone</td>
<td align="char" valign="top" char="&#x00B1;">ka2su1</td>
<td align="char" valign="top" char="&#x00B1;">ka4su1</td>
<td align="char" valign="top" char="&#x00B1;">ta2fu1/ta4fu1</td>
</tr>
<tr>
<td align="char" valign="top" char="&#x00B1;">ka4su1</td>
<td align="char" valign="top" char="&#x00B1;">ka2su1</td>
<td align="char" valign="top" char="&#x00B1;">ta2fu1/ta4fu1</td>
</tr>
<tr>
<td align="char" valign="top" char="&#x00B1;">bi4so1</td>
<td align="char" valign="top" char="&#x00B1;">bi2so1</td>
<td align="char" valign="top" char="&#x00B1;">di4fo1/di2fo1</td>
</tr>
<tr>
<td align="char" valign="top" char="&#x00B1;">bi2so1</td>
<td align="char" valign="top" char="&#x00B1;">bi4so1</td>
<td align="char" valign="top" char="&#x00B1;">di4fo1/di2fo1</td>
</tr>
<tr>
<td align="char" valign="top" char="." rowspan="4">Segment-and-tone</td>
<td align="char" valign="top" char="&#x00B1;">ka2su1</td>
<td align="char" valign="top" char="&#x00B1;">ta4fu1</td>
<td align="char" valign="top" char="&#x00B1;">ka2su1/ta4fu1</td>
</tr>
<tr>
<td align="char" valign="top" char="&#x00B1;">ka4su1</td>
<td align="char" valign="top" char="&#x00B1;">ta2fu1</td>
<td align="char" valign="top" char="&#x00B1;">ka4su1/ta2fu1</td>
</tr>
<tr>
<td align="char" valign="top" char="&#x00B1;">bi4so1</td>
<td align="char" valign="top" char="&#x00B1;">di2fo1</td>
<td align="char" valign="top" char="&#x00B1;">bi4so1/di2fo1</td>
</tr>
<tr>
<td align="char" valign="top" char="&#x00B1;">bi2so1</td>
<td align="char" valign="top" char="&#x00B1;">di4fo1</td>
<td align="char" valign="top" char="&#x00B1;">bi2so1/di4fo1</td>
</tr>
<tr>
<td align="char" valign="top" char="." rowspan="4">Segment-or-tone</td>
<td align="char" valign="top" char="&#x00B1;">ka2su1</td>
<td align="char" valign="top" char="&#x00B1;">ta4fu1</td>
<td align="char" valign="top" char="&#x00B1;">ka4su1/ta2fu1</td>
</tr>
<tr>
<td align="char" valign="top" char="&#x00B1;">ka4su1</td>
<td align="char" valign="top" char="&#x00B1;">ta2fu1</td>
<td align="char" valign="top" char="&#x00B1;">ka2su1/ta4fu1</td>
</tr>
<tr>
<td align="char" valign="top" char="&#x00B1;">bi4so1</td>
<td align="char" valign="top" char="&#x00B1;">di2fo1</td>
<td align="char" valign="top" char="&#x00B1;">bi2so1/di4fo1</td>
</tr>
<tr>
<td align="char" valign="top" char="&#x00B1;">bi2so1</td>
<td align="char" valign="top" char="&#x00B1;">di4fo1</td>
<td align="char" valign="top" char="&#x00B1;">bi4so1/di2fo1</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>The number between syllables represents a Cantonese tone mark; 1, 2, and 4 stand for T1, T2, and T4 in Cantonese, respectively. For the stimuli X, &#x201C;ka4su1/ta4fu1&#x201D; means X can be &#x201C;ka4su1&#x201D; or &#x201C;ta4fu1&#x201D; following the A and B stimuli and so are the other X stimuli shown in the table.</p>
</table-wrap-foot>
</table-wrap>
<p>An ABX test was adopted as the experimental method, with four conditions: (1) segment-and-tone: participants identified the target X, which matched A or B in both the segmental and tonal dimensions; (2) forced-segment and (3) forced-tone: participants were forced to classify the target X along the segmental or the tonal dimension, respectively; (4) segment-or-tone: the target X matched A in the segmental dimension and B in the tonal dimension, or vice versa, so that the distribution of attention could be observed from the responses. The target X contained the same tone and/or segment as A or B, and the stimulus order could be ABX or BAX. Thus, we obtained 16 ABX stimuli (two non-word pairs &#x00D7; two Cantonese tones &#x00D7; two AB orders &#x00D7; two matches with A or B). The arrangement of stimuli is displayed in <xref rid="tab2" ref-type="table">Table 2</xref>, which only shows one AB order.</p>
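<p>The factorial structure described above can be sketched in Python as follows. This is only an illustrative reconstruction, not the authors&#x2019; stimulus script: the names <italic>PAIRS</italic>, <italic>label</italic>, and <italic>build_trials</italic> are hypothetical, the syllable labels follow the notation of <xref rid="tab2" ref-type="table">Table 2</xref>, and applying the count of 16 to every condition is an assumption.</p>
<preformat>
```python
from itertools import product

# Non-word pairs written as (first syllable, second syllable), as in Table 2.
PAIRS = [(("ka", "su"), ("ta", "fu")), (("bi", "so"), ("di", "fo"))]
TONES = (2, 4)  # target tones on the first syllable; the second syllable is always T1


def label(word, tone):
    """Return a Table 2 style label, e.g. ('ka', 'su') with tone 2 gives 'ka2su1'."""
    first, second = word
    return f"{first}{tone}{second}1"


def build_trials(condition):
    """List (first, second, X) triplets: pairs x tones x AB orders x X matches."""
    trials = []
    for (w1, w2), tone in product(PAIRS, TONES):
        other = 4 if tone == 2 else 2
        if condition == "forced-segment":      # A and B share a tone; X carries the other tone
            a, b = label(w1, tone), label(w2, tone)
            targets = [label(w1, other), label(w2, other)]
        elif condition == "forced-tone":       # A and B share segments; X has other segments
            a, b = label(w1, tone), label(w1, other)
            targets = [label(w2, tone), label(w2, other)]
        elif condition == "segment-and-tone":  # X is identical in label to A or to B
            a, b = label(w1, tone), label(w2, other)
            targets = [a, b]
        elif condition == "segment-or-tone":   # X takes segments from one item, tone from the other
            a, b = label(w1, tone), label(w2, other)
            targets = [label(w1, other), label(w2, tone)]
        else:
            raise ValueError(f"unknown condition: {condition}")
        for x in targets:
            trials.append((a, b, x))  # ABX order
            trials.append((b, a, x))  # BAX order
    return trials
```
</preformat>
<p>Under these assumptions, each condition yields 16 triplets. In the forced-tone condition, crossing the tone of A with the AB order produces label-identical triplets, so only 8 of the 16 entries are distinct at the label level; the sketch keeps all of them to mirror the factorial count given in the text.</p>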
<p>For the stimuli, recordings from the three Cantonese native speakers were mixed within each ABX combination instead of coming from a single speaker (e.g., speaker 1: A, speaker 2: B, speaker 3: X, and vice versa), so that listeners could not process the stimuli purely acoustically without referring to linguistic features (<xref ref-type="bibr" rid="ref84">Zou et al., 2017</xref>). The non-words were provided in Roman script with Cantonese tone marks (see <xref rid="tab2" ref-type="table">Table 2</xref>). The native speakers, who had been trained beforehand, were asked to produce the disyllabic pairs at a natural speaking rate with an interval of around 1 s. The recordings were saved as 16-bit .wav files with a sampling rate of 44,100&#x2009;Hz. <xref rid="fig2" ref-type="fig">Figure 2</xref> depicts the averaged pitch contours across stimuli for each speaker. In the first syllable, T2 rose from a low pitch to a higher pitch for each native speaker (female 1: 188&#x2009;Hz to 250&#x2009;Hz; female 2: 158&#x2009;Hz to 225&#x2009;Hz; male: 88&#x2009;Hz to 155&#x2009;Hz). T4 in the first syllable showed a falling contour for each speaker (female 1: 143&#x2009;Hz to 100&#x2009;Hz; female 2: 155&#x2009;Hz to 132&#x2009;Hz; male: 93&#x2009;Hz to 63&#x2009;Hz). T1 in the second syllable showed stable high pitch contours in all syllables. The pitch features obtained in the current study were in line with the description of Cantonese tones in <xref ref-type="bibr" rid="ref17">Chen et al. (2017)</xref>.</p>
<fig position="float" id="fig2">
<label>Figure 2</label>
<caption><p>The pitch contours of disyllabic non-words produced by one male native speaker and two female native speakers in Cantonese. The pitch frequencies are averaged across /kasu/&#x2212;/tafu/ and /biso/&#x2212;/diso/.</p></caption>
<graphic xlink:href="fpsyg-13-918737-g002.tif"/>
</fig>
</sec>
<sec id="sec10">
<title>Procedure</title>
<p>The participants were tested individually, seated in front of a computer (Lenovo ThinkCentre desktop, i5 core, USB interface: 3.0) in a quiet classroom in Hong Kong secondary schools and wearing high-quality headphones (Philips Fidelio X2). The participants were adequately briefed in Cantonese and English about the task procedure by the professional experimenters, who were proficient Cantonese (L1)-English bilingual speakers. The students understood the task and were required to concentrate on the overall similarity between sounds. The experiment was conducted through the ExperimentMFC script in Praat software (<xref ref-type="bibr" rid="ref10">Boersma and Weenink, 2014</xref>).</p>
<p>With a quasi-random approach, the four listening conditions (segment-and-tone, forced-segment, forced-tone, and segment-or-tone) were tested in separate blocks, and the stimuli were played in random order within each block. The quasi-random design was adopted out of concern for the operability of the experiment. A total of 14 local year-one secondary school students (not included in the CN group) were invited to take part in a pilot study designed with different stimulus orders before the formal implementation of the experiment. In addition, a mini-interview was conducted with each student after the pilot test to collect their opinions. The preliminary assessment showed that if the experiment was completely randomized across conditions and stimuli, the proportion of missing trials in most conditions would bias the subsequent analyses. Only with a quasi-random design was the proportion of missing trials within an acceptable range (under 10%) according to statistical guidance based on large data sets (<xref ref-type="bibr" rid="ref25">Dong and Peng, 2013</xref>; <xref ref-type="bibr" rid="ref40">Jakobsen et al., 2017</xref>). In addition, students in the mini-interviews reported that they might feel confused across trials when the stimuli were completely randomized, so they mostly made random judgments (guesses) instead of focusing on the similarity of sounds. Therefore, compared with a fully randomized design, a quasi-random approach appeared more suitable and feasible for the current task, specifically for middle-school-aged students.</p>
<p>In the experiment, the participants focused on a &#x201C;fixation&#x201D; shown on the screen (20&#x2009;ms), then listened to the three sounds, A, B, and X, and indicated whether X sounded more similar to the first or the second sound by clicking on a &#x201C;1&#x201D; or &#x201C;2&#x201D; shown on the computer screen. The onscreen buttons (&#x201C;1&#x201D; and &#x201C;2&#x201D;) appeared directly after the three sounds had finished playing. In each trial, there was a 600&#x2009;ms interval between standard A and standard B, and X was played after a 900&#x2009;ms pause (<xref ref-type="bibr" rid="ref12">Braun and Johnson, 2011</xref>). We applied a comparatively shorter response window than <xref ref-type="bibr" rid="ref84">Zou et al. (2017)</xref> and <xref ref-type="bibr" rid="ref47">Liu and Ning (2021)</xref>. A participant who did not respond within 1,000&#x2009;ms after the end of &#x201C;X&#x201D; received an onscreen reminder to hurry up. If the participant failed to respond within 2,200&#x2009;ms, a &#x201C;failed&#x201D; hint was shown and the experiment moved quickly on to the next stimulus. Three-minute familiarization exercises in the segment-and-tone task were given to the listeners before the formal experiment started. Only listeners who had successfully passed the exercises could proceed to the subsequent formal task. In the formal experiment, the ABX stimuli were repeated three times, generating 192 ABX trials (16 ABX stimuli &#x00D7; four conditions &#x00D7; three repetitions) for each participant. The whole experiment lasted about 30&#x2013;40&#x2009;min for each participant. RT and response type were recorded for each individual, with RT measured from the end of &#x201C;X.&#x201D;</p>
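<p>The response window described above can be summarized in a short Python sketch. This is an illustration rather than the ExperimentMFC script itself: the function name <italic>classify_latency</italic>, the category labels, and the treatment of latencies exactly at the 1,000&#x2009;ms and 2,200&#x2009;ms boundaries are assumptions.</p>
<preformat><![CDATA[
```python
REMINDER_MS = 1000  # onscreen reminder once this much time has passed after X
TIMEOUT_MS = 2200   # trial is counted as failed (missing) after this


def classify_latency(latency_ms):
    """Classify one trial by its response latency, measured from the end of X.

    `latency_ms` is None when no response was given at all. Boundary
    handling (inclusive upper bounds) is an assumption of this sketch.
    """
    if latency_ms is None or latency_ms > TIMEOUT_MS:
        return "missing"
    if latency_ms > REMINDER_MS:
        return "answered_after_reminder"
    return "answered"


for latency in (640, 1500, 2500, None):
    print(latency, classify_latency(latency))
```
]]></preformat>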
</sec>
</sec>
<sec id="sec11" sec-type="results">
<title>Results</title>
<p>Segment- or tone-based responses were collected for each individual across conditions. Accuracy was then calculated for the forced-segment, forced-tone, and segment-and-tone conditions based on the frequencies of the segment, tone, and segment/tone responses, respectively. Since there is no correct answer for the segment-or-tone condition, the segment-based response rate was calculated instead. The RT values of the missing trials were entered into the data pool as 2,200&#x2009;ms. RT values more than 3 SD below or above the mean were excluded for each subject group and experimental condition. In total, we obtained, for the CN group, 5,156 responses (192 trials &#x00D7; 27 participants &#x2212; 28 missing) and 5,156 RT values (192 trials &#x00D7; 27 participants &#x2212; 28 outliers); for the SB group, 4,925 responses (192 trials &#x00D7; 26 participants &#x2212; 67 missing) and 4,914 RT values (192 trials &#x00D7; 26 participants &#x2212; 78 outliers); and for the LL group, 4,942 responses (192 trials &#x00D7; 26 participants &#x2212; 50 missing) and 4,922 RT values (192 trials &#x00D7; 26 participants &#x2212; 70 outliers).</p>
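<p>The data counts and the outlier rule above can be checked with a minimal Python sketch. The function names <italic>retained</italic> and <italic>trim_3sd</italic> and the example RT values are hypothetical; the sketch trims a single group-by-condition cell and is not the analysis code used in the study.</p>
<preformat><![CDATA[
```python
from statistics import mean, stdev

TRIALS_PER_PARTICIPANT = 192
MISSING_RT_MS = 2200.0  # value entered for missing trials, as stated in the text


def retained(n_participants, n_dropped):
    """Number of data points left after dropping missing trials or outliers."""
    return TRIALS_PER_PARTICIPANT * n_participants - n_dropped


def trim_3sd(rts):
    """Fill missing RTs (None) with 2,200 ms, then drop values beyond 3 SD.

    Operates on one cell (one subject group in one condition).
    """
    filled = [MISSING_RT_MS if rt is None else rt for rt in rts]
    centre, spread = mean(filled), stdev(filled)
    return [rt for rt in filled if abs(rt - centre) <= 3 * spread]


# Counts reported in the text: (participants, missing responses, RT outliers).
for group, (n, missing, outliers) in {
    "CN": (27, 28, 28), "SB": (26, 67, 78), "LL": (26, 50, 70)
}.items():
    print(group, retained(n, missing), retained(n, outliers))

# Hypothetical cell: twenty fast responses and one very slow one.
print(len(trim_3sd([500.0] * 20 + [5000.0])))
```
]]></preformat>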
<p>For RT data, the absolute values of skewness ranged between 0.008 and 2.17, and the absolute values of kurtosis ranged between 0.095 and 6.96 across sub-groups. Considering the large sample size (<italic>N</italic>&#x2009;&#x003E;&#x2009;1,000 per sub-group) in the current study, the RT data were treated as approximately Gaussian based on <xref ref-type="bibr" rid="ref42">Kim&#x2019;s (2013)</xref> criterion (absolute skewness &#x003C;2 and absolute kurtosis &#x003C;7). Following <xref ref-type="bibr" rid="ref8">Bates et al. (2015)</xref>, a linear mixed-effect model was fitted in R (<xref ref-type="bibr" rid="ref61">R Development Core Team, 2008</xref>) using the lme4 package for the statistical analysis of RT, and a logistic mixed-effect model was fitted to the binary responses using the same package. According to <xref ref-type="bibr" rid="ref6">Baayen et al. (2008)</xref>, the mixed-effect model is advantageous in processing nested hierarchical data. The <italic>p</italic>-values of the models were obtained with the lmerTest package (<xref ref-type="bibr" rid="ref45">Kuznetsova et al., 2017</xref>). The intercepts and slopes for the random effects were selected by comparing candidate models with the likelihood ratio test proposed in <xref ref-type="bibr" rid="ref6">Baayen et al. (2008)</xref>. Marginal <italic>R</italic><sup>2</sup> and conditional <italic>R</italic><sup>2</sup>, computed with the MuMIn package (<xref ref-type="bibr" rid="ref7">Barto&#x0144;, 2015</xref>), were used to evaluate the models; they measure the variance explained by the fixed effects alone and by the fixed and random effects together, respectively (<xref ref-type="bibr" rid="ref001">Nakagawa and Schielzeth, 2013</xref>; <xref ref-type="bibr" rid="ref84">Zou et al., 2017</xref>). <italic>Post-hoc</italic> multiple comparisons across levels were tested with the multcomp package (<xref ref-type="bibr" rid="ref37">Hothorn et al., 2008</xref>), and <italic>p</italic>-values were corrected with Bonferroni adjustment (if necessary). For the statistical analysis in the &#x201C;Attention Distribution in Segments and Tones&#x201D; section, two-proportions <italic>z</italic>-tests were conducted with the stats package in base R (see more details in <xref ref-type="bibr" rid="ref81">Wooditch et al., 2021</xref>, pp. 180&#x2013;192).</p>
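<p>For readers unfamiliar with the two-proportions <italic>z</italic>-test, the following Python sketch shows the textbook pooled-variance statistic with a two-sided <italic>p</italic>-value. It is not a reimplementation of the R routine used in the study, which may apply a continuity correction; the function name <italic>two_proportion_z</italic> and the example counts are hypothetical and are not data from this experiment.</p>
<preformat><![CDATA[
```python
from math import erfc, sqrt


def two_proportion_z(successes_1, n_1, successes_2, n_2):
    """Pooled two-proportion z statistic and two-sided p-value.

    No continuity correction is applied in this sketch.
    """
    p_1 = successes_1 / n_1
    p_2 = successes_2 / n_2
    pooled = (successes_1 + successes_2) / (n_1 + n_2)
    std_error = sqrt(pooled * (1 - pooled) * (1 / n_1 + 1 / n_2))
    z = (p_1 - p_2) / std_error
    # Two-sided tail area of the standard normal: 2 * (1 - Phi(|z|)).
    p_value = erfc(abs(z) / sqrt(2))
    return z, p_value


# Hypothetical counts: 60 of 100 versus 40 of 100 segment-based responses.
z, p = two_proportion_z(60, 100, 40, 100)
print(round(z, 3), round(p, 4))
```
]]></preformat>
<p>When several such tests are run, a Bonferroni adjustment multiplies each <italic>p</italic>-value by the number of comparisons and caps the result at 1.</p>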
<p>Since our research questions required multiple comparisons across different experimental conditions, different levels of the experimental condition were included in separate models according to the research question. In addition, for each model of response and RT, tone type (T2 and T4), consonant type (/b, d, k, t/), and vowel type (/a, i/) were initially included as fixed effects. However, they were removed from each model because they were not significant as main effects and had no significant interactions with other fixed factors. <xref rid="fig3" ref-type="fig">Figure 3</xref> depicts the average accuracy for the first three conditions, the segment-based response rate for the segment-or-tone condition, and RTs across four experimental conditions for each subject group.</p>
<fig position="float" id="fig3">
<label>Figure 3</label>
<caption><p>Responses and reaction time for the CN <bold>(A)</bold>, SB <bold>(B)</bold>, and LL <bold>(C)</bold> groups in the forced-segment, forced-tone, segment-and-tone, and segment-or-tone conditions. For the first three conditions, mean accuracy is shown; for the last condition, the mean segment-based response rate is shown. For reaction time, the error bars show 5 times the standard deviation.</p></caption>
<graphic xlink:href="fpsyg-13-918737-g003.tif"/>
</fig>
<sec id="sec12">
<title>The effectiveness of the experimental design</title>
<p>In the segment-and-tone condition, the three subject groups achieved high accuracy in syllable classification, with mean accuracy above 87% in every group. This ensures that all subject groups could robustly rely on at least one acoustic dimension (segment/tone) to respond reliably to the stimuli. To examine the accuracy in the first three conditions (see <xref rid="tab3" ref-type="table">Table 3</xref>, model 1), a logistic mixed-effect model was fitted with the fixed effects of subject group (CN, SB, and LL), experimental condition (segment-and-tone, forced-segment, forced-tone), and their interaction. The subject and stimuli intercepts were included as random effects. To examine RT values across the four conditions (see <xref rid="tab3" ref-type="table">Table 3</xref>, model 2), a linear mixed-effect model was fitted, with fixed effects of subject group (CN, SB, and LL), experimental condition (segment-and-tone, forced-segment, forced-tone, segment-or-tone), and their interaction.</p>
<table-wrap position="float" id="tab3">
<label>Table 3</label>
<caption><p>The results of logistic and linear mixed-effect models for responses and RT.</p></caption>
<table frame="hsides" rules="groups">
<tbody>
<tr>
<td align="char" valign="top" char=".">Model-1</td>
<td align="char" valign="top" char="&#x00B1;" colspan="7"><italic>glmer(response&#x2009;~&#x2009;condition&#x2009;&#x00D7;&#x2009;participate&#x2009;+&#x2009;(1|subject)&#x2009;+&#x2009;(1|stimulus), family&#x2009;=&#x2009;&#x201C;binomial&#x201D;)</italic></td>
</tr>
<tr>
<td/>
<td align="char" valign="bottom" char="&#x00B1;" colspan="7"><bold>Test field&#x2009;=&#x2009;response</bold></td>
</tr>
<tr>
<td/>
<td align="char" valign="bottom" char="&#x00B1;"><bold>Fixed effects</bold></td>
<td align="char" valign="bottom" char="&#x00B1;"><bold>&#x03B2;</bold></td>
<td align="char" valign="bottom" char="&#x00B1;"><bold>SE</bold></td>
<td align="char" valign="bottom" char="&#x00B1;"><bold>z value</bold></td>
<td align="char" valign="bottom" char="&#x00B1;"><bold>Pr(&#x003E;|z|)</bold></td>
<td align="char" valign="bottom" char="&#x00B1;"><bold>Mar R</bold><sup><bold>2</bold>
</sup></td>
<td align="char" valign="bottom" char="&#x00B1;"><bold>Con R</bold><sup><bold>2</bold>
</sup></td>
</tr>
<tr>
<td/>
<td align="char" valign="top" char="&#x00B1;">(Intercept)</td>
<td align="char" valign="bottom" char=".">0.64771</td>
<td align="char" valign="bottom" char=".">0.234</td>
<td align="char" valign="bottom" char=".">2.767</td>
<td align="char" valign="bottom" char=".">0.00566&#x002A;&#x002A;</td>
<td align="char" valign="bottom" char=".">0.551</td>
<td align="char" valign="bottom" char=".">0.605</td>
</tr>
<tr>
<td/>
<td align="char" valign="top" char="&#x00B1;">Condition</td>
<td align="char" valign="top" char=".">0.64051</td>
<td align="char" valign="top" char=".">0.091</td>
<td align="char" valign="top" char=".">7.014</td>
<td align="char" valign="top" char=".">&#x003C;0.001&#x002A;&#x002A;&#x002A;</td>
<td/>
<td/>
</tr>
<tr>
<td/>
<td align="char" valign="top" char="&#x00B1;">Participant group</td>
<td align="char" valign="top" char=".">1.313</td>
<td align="char" valign="top" char=".">0.116</td>
<td align="char" valign="top" char=".">11.236</td>
<td align="char" valign="top" char=".">&#x003C;0.001&#x002A;&#x002A;&#x002A;</td>
<td/>
<td/>
</tr>
<tr>
<td/>
<td align="char" valign="top" char="&#x00B1;">Condition: Participant</td>
<td align="char" valign="top" char=".">0.72841</td>
<td align="char" valign="top" char=".">0.044</td>
<td align="char" valign="top" char=".">16.511</td>
<td align="char" valign="top" char=".">&#x003C;0.001&#x002A;&#x002A;&#x002A;</td>
<td/>
<td/>
</tr>
<tr>
<td/>
<td align="char" valign="top" char="&#x00B1;"><bold>Random effects</bold></td>
<td align="char" valign="top" char="."><bold>Var.</bold></td>
<td align="char" valign="top" char="."><bold>SD</bold></td>
<td/>
<td/>
<td/>
<td/>
</tr>
<tr>
<td/>
<td align="char" valign="top" char="&#x00B1;">1|subject</td>
<td align="char" valign="top" char=".">0.1261</td>
<td align="char" valign="top" char=".">0.3552</td>
<td/>
<td/>
<td/>
<td/>
</tr>
<tr>
<td/>
<td align="char" valign="top" char="&#x00B1;">1|stimulus</td>
<td align="char" valign="top" char=".">0.1778</td>
<td align="char" valign="top" char=".">0.1333</td>
<td/>
<td/>
<td/>
<td/>
</tr>
<tr>
<td align="char" valign="top" char=".">Model-2</td>
<td align="char" valign="top" char="." colspan="7"><italic>lmer(RT&#x2009;~&#x2009;L1&#x2009;&#x00D7;&#x2009;task&#x2009;+&#x2009;(1 | subject)&#x2009;+&#x2009;(1 | stimuli)&#x2009;+&#x2009;(1 | task:stimuli))</italic></td>
</tr>
<tr>
<td/>
<td align="char" valign="top" char="&#x00B1;"><bold>Test field&#x2009;=&#x2009;RT</bold></td>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
</tr>
<tr>
<td/>
<td align="char" valign="bottom" char="&#x00B1;"><bold>Fixed effects</bold></td>
<td align="char" valign="bottom" char="."><bold>&#x03B2;</bold></td>
<td align="char" valign="bottom" char="."><bold>SE</bold></td>
<td align="char" valign="bottom" char="."><bold>t value</bold></td>
<td align="char" valign="bottom" char="."><bold>Pr(&#x003E;|t|)</bold></td>
<td align="char" valign="bottom" char="."><bold>Mar R</bold><sup><bold>2</bold>
</sup></td>
<td align="char" valign="bottom" char="."><bold>Con R</bold><sup><bold>2</bold>
</sup></td>
</tr>
<tr>
<td/>
<td align="char" valign="top" char="&#x00B1;">(Intercept)</td>
<td align="char" valign="top" char=".">10.0376</td>
<td align="char" valign="top" char=".">23.6716</td>
<td align="char" valign="top" char=".">23.788</td>
<td align="char" valign="top" char=".">&#x003C;0.001&#x002A;&#x002A;&#x002A;</td>
<td align="char" valign="top" char=".">0.811</td>
<td align="char" valign="top" char=".">&#x003E;0.999</td>
</tr>
<tr>
<td/>
<td align="char" valign="top" char="&#x00B1;">Condition</td>
<td align="char" valign="top" char=".">10.2312</td>
<td align="char" valign="top" char=".">9.8925</td>
<td align="char" valign="top" char=".">8.095</td>
<td align="char" valign="top" char=".">&#x003C;0.001&#x002A;&#x002A;&#x002A;</td>
<td/>
<td/>
</tr>
<tr>
<td/>
<td align="char" valign="top" char="&#x00B1;">Participant group</td>
<td align="char" valign="top" char=".">3.7547</td>
<td align="char" valign="top" char=".">6.6941</td>
<td align="char" valign="top" char=".">19.711</td>
<td align="char" valign="top" char=".">&#x003C;0.001&#x002A;&#x002A;&#x002A;</td>
<td/>
<td/>
</tr>
<tr>
<td/>
<td align="char" valign="top" char="&#x00B1;">Condition: Participant</td>
<td align="char" valign="top" char=".">12.1318</td>
<td align="char" valign="top" char=".">3.3234</td>
<td align="char" valign="top" char=".">8.0311</td>
<td align="char" valign="top" char=".">&#x003C;0.001&#x002A;&#x002A;&#x002A;</td>
<td/>
<td/>
</tr>
<tr>
<td/>
<td align="char" valign="top" char="&#x00B1;"><bold>Random effects</bold></td>
<td align="char" valign="top" char="."><bold>Var.</bold></td>
<td align="char" valign="top" char="."><bold>SD</bold></td>
<td/>
<td/>
<td/>
<td/>
</tr>
<tr>
<td/>
<td align="char" valign="top" char="&#x00B1;">1|task:stimuli</td>
<td align="char" valign="top" char=".">19.9969</td>
<td align="char" valign="top" char=".">6.705</td>
<td/>
<td/>
<td/>
<td/>
</tr>
<tr>
<td/>
<td align="char" valign="top" char="&#x00B1;">1|subject</td>
<td align="char" valign="top" char=".">1.7405</td>
<td align="char" valign="top" char=".">7.602</td>
<td/>
<td/>
<td/>
<td/>
</tr>
<tr>
<td/>
<td align="char" valign="top" char="&#x00B1;">1|stimulus</td>
<td align="char" valign="top" char=".">13.9362</td>
<td align="char" valign="top" char=".">2.3673</td>
<td/>
<td/>
<td/>
<td/>
</tr>
<tr>
<td align="char" valign="top" char=".">Model-3</td>
<td align="char" valign="top" char="." colspan="7"><italic>glmer(response&#x2009;~&#x2009;condition&#x2009;&#x00D7;&#x2009;participate&#x2009;+&#x2009;(1|subject)&#x2009;+&#x2009;(1|stimulus),family&#x2009;=&#x2009;&#x201C;binomial&#x201D;)</italic></td>
</tr>
<tr>
<td/>
<td align="char" valign="top" char="." colspan="7"><bold>Test field&#x2009;=&#x2009;response</bold></td>
</tr>
<tr>
<td/>
<td align="char" valign="top" char="&#x00B1;"><bold>Fixed effects</bold></td>
<td align="char" valign="top" char="."><bold>&#x03B2;</bold></td>
<td align="char" valign="top" char="."><bold>SE</bold></td>
<td align="char" valign="top" char="."><bold>z value</bold></td>
<td align="char" valign="top" char="."><bold>Pr(&#x003E;|z|)</bold></td>
<td align="char" valign="top" char="."><bold>Mar R</bold><sup><bold>2</bold>
</sup></td>
<td align="char" valign="top" char="."><bold>Con R</bold><sup><bold>2</bold>
</sup></td>
</tr>
<tr>
<td/>
<td align="char" valign="top" char="&#x00B1;">(Intercept)</td>
<td align="char" valign="top" char=".">1.54093</td>
<td align="char" valign="top" char=".">0.41672</td>
<td align="char" valign="top" char=".">3.698</td>
<td align="char" valign="top" char=".">0.00021&#x002A;&#x002A;&#x002A;</td>
<td align="char" valign="top" char=".">0.515</td>
<td align="char" valign="top" char=".">0.673</td>
</tr>
<tr>
<td/>
<td align="char" valign="top" char="&#x00B1;">Condition</td>
<td align="char" valign="top" char=".">1.38765</td>
<td align="char" valign="top" char=".">0.16057</td>
<td align="char" valign="top" char=".">8.642</td>
<td align="char" valign="top" char=".">&#x003C;0.001&#x002A;&#x002A;&#x002A;</td>
<td/>
<td/>
</tr>
<tr>
<td/>
<td align="char" valign="top" char="&#x00B1;">Participant group</td>
<td align="char" valign="top" char=".">2.83008</td>
<td align="char" valign="top" char=".">0.20015</td>
<td align="char" valign="top" char=".">14.139</td>
<td align="char" valign="top" char=".">&#x003C;0.001&#x002A;&#x002A;&#x002A;</td>
<td/>
<td/>
</tr>
<tr>
<td/>
<td align="char" valign="top" char="&#x00B1;">Condition: Participant</td>
<td align="char" valign="top" char=".">1.25794</td>
<td align="char" valign="top" char=".">0.07542</td>
<td align="char" valign="top" char=".">16.68</td>
<td align="char" valign="top" char=".">&#x003C;0.001&#x002A;&#x002A;&#x002A;</td>
<td/>
<td/>
</tr>
<tr>
<td/>
<td align="char" valign="top" char="&#x00B1;"><bold>Random effects</bold></td>
<td align="char" valign="top" char="."><bold>Var.</bold></td>
<td align="char" valign="top" char="."><bold>SD</bold></td>
<td/>
<td/>
<td/>
<td/>
</tr>
<tr>
<td/>
<td align="char" valign="top" char="&#x00B1;">1|subject</td>
<td align="char" valign="top" char=".">5.344</td>
<td align="char" valign="top" char=".">3.468</td>
<td/>
<td/>
<td/>
<td/>
</tr>
<tr>
<td/>
<td align="char" valign="top" char="&#x00B1;">1|stimulus</td>
<td align="char" valign="top" char=".">3.998</td>
<td align="char" valign="top" char=".">3.267</td>
<td/>
<td/>
<td/>
<td/>
</tr>
<tr>
<td align="char" valign="top" char=".">Model-4</td>
<td align="char" valign="top" char="." colspan="7"><italic>lmer(RT&#x2009;~&#x2009;L1&#x2009;&#x00D7;&#x2009;task&#x2009;+&#x2009;(1 | subject)&#x2009;+&#x2009;(1 | stimuli)&#x2009;+&#x2009;(1 | task:subject))</italic></td>
</tr>
<tr>
<td/>
<td align="char" valign="top" char="&#x00B1;"><bold>Test field&#x2009;=&#x2009;RT</bold></td>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
</tr>
<tr>
<td/>
<td align="char" valign="top" char="&#x00B1;"><bold>Fixed effects</bold></td>
<td align="char" valign="top" char="."><bold>&#x03B2;</bold></td>
<td align="char" valign="top" char="."><bold>SE</bold></td>
<td align="char" valign="top" char="."><bold>t value</bold></td>
<td align="char" valign="top" char="."><bold>Pr(&#x003E;|t|)</bold></td>
<td align="char" valign="top" char="."><bold>Mar R</bold><sup><bold>2</bold>
</sup></td>
<td align="char" valign="top" char="."><bold>Con R</bold><sup><bold>2</bold>
</sup></td>
</tr>
<tr>
<td/>
<td align="char" valign="top" char="&#x00B1;">(Intercept)</td>
<td align="char" valign="top" char=".">2.69193</td>
<td align="char" valign="top" char=".">0.13575</td>
<td align="char" valign="top" char=".">134.3001</td>
<td align="char" valign="top" char=".">&#x003C;0.001&#x002A;&#x002A;&#x002A;</td>
<td align="char" valign="top" char=".">0.978</td>
<td align="char" valign="top" char=".">&#x003E;0.999</td>
</tr>
<tr>
<td/>
<td align="char" valign="top" char="&#x00B1;">Condition</td>
<td align="char" valign="top" char=".">&#x2212;1.1873</td>
<td align="char" valign="top" char=".">0.06316</td>
<td align="char" valign="top" char=".">134.2994</td>
<td align="char" valign="top" char=".">&#x003C;0.001&#x002A;&#x002A;&#x002A;</td>
<td/>
<td/>
</tr>
<tr>
<td/>
<td align="char" valign="top" char="&#x00B1;">Participant group</td>
<td align="char" valign="top" char=".">&#x2212;0.5336</td>
<td align="char" valign="top" char=".">0.04516</td>
<td align="char" valign="top" char=".">77.00673</td>
<td align="char" valign="top" char=".">&#x003C;0.001&#x002A;&#x002A;&#x002A;</td>
<td/>
<td/>
</tr>
<tr>
<td/>
<td align="char" valign="top" char="&#x00B1;">Condition: Participant</td>
<td align="char" valign="top" char=".">0.49843</td>
<td align="char" valign="top" char=".">0.02101</td>
<td align="char" valign="top" char=".">77.01085</td>
<td align="char" valign="top" char=".">&#x003C;0.001&#x002A;&#x002A;&#x002A;</td>
<td/>
<td/>
</tr>
<tr>
<td/>
<td align="char" valign="top" char="&#x00B1;"><bold>Random effects</bold></td>
<td align="char" valign="top" char="."><bold>Var.</bold></td>
<td align="char" valign="top" char="."><bold>SD</bold></td>
<td/>
<td/>
<td/>
<td/>
</tr>
<tr>
<td/>
<td align="char" valign="top" char="&#x00B1;">1|task:subject</td>
<td align="char" valign="top" char=".">1.1401</td>
<td align="char" valign="top" char=".">0.10782</td>
<td/>
<td/>
<td/>
<td/>
</tr>
<tr>
<td/>
<td align="char" valign="top" char="&#x00B1;">1|subject</td>
<td align="char" valign="top" char=".">15.0245</td>
<td align="char" valign="top" char=".">0.24361</td>
<td/>
<td/>
<td/>
<td/>
</tr>
<tr>
<td/>
<td align="char" valign="top" char="&#x00B1;">1|stimulus</td>
<td align="char" valign="top" char="&#x00B1;">6.1554</td>
<td align="char" valign="top" char="&#x00B1;">0.00212</td>
<td/>
<td/>
<td/>
<td/>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>Significance codes: 0 &#x2018;&#x002A;&#x002A;&#x002A;&#x2019; 0.001 &#x2018;&#x002A;&#x002A;&#x2019; 0.01.</p>
</table-wrap-foot>
</table-wrap>
<p>In the RT model, the random effects included the subject intercept and the stimuli intercept, as well as the by-stimuli slope for condition. The results indicated that subject group and experimental condition had significant main effects and interacted with each other for both response and RT. This suggests that attentional resources were consumed differently across conditions and subject groups. In line with our predictions, <italic>post-hoc</italic> comparisons were run between the segment-and-tone condition and the other conditions within each subject group.</p>
<p>CN group (<xref rid="fig3" ref-type="fig">Figure 3A</xref>): the CN listeners achieved high accuracy on average (all mean accuracies&#x2009;&#x003E;&#x2009;86%) across the conditions, with no statistical differences between the segment-and-tone and forced-segment conditions (<italic>z</italic>&#x2009;=&#x2009;1.63; <italic>p</italic>&#x2009;=&#x2009;0.303), or between the segment-and-tone and forced-tone conditions (<italic>z</italic>&#x2009;=&#x2009;2.32; <italic>p</italic>&#x2009;=&#x2009;0.093). This suggests that Cantonese native listeners were able to make a correct identification by relying on the segments or tones provided in the non-optimal conditions. For RT, the CN group responded faster in the segment-and-tone condition than in the other three conditions (segment-and-tone and forced-segment: <italic>z</italic>&#x2009;=&#x2009;31.12, <italic>p</italic>&#x2009;&#x003C;&#x2009;0.01; segment-and-tone and forced-tone: <italic>z</italic>&#x2009;=&#x2009;31.14, <italic>p</italic>&#x2009;&#x003C;&#x2009;0.01; segment-and-tone and segment-or-tone: <italic>z</italic>&#x2009;=&#x2009;31.11, <italic>p</italic>&#x2009;&#x003C;&#x2009;0.01). This suggests that the phonetically conflicting conditions demanded more attentional resources from Cantonese native listeners than the conflict-free condition did.</p>
<p>LL group (<xref rid="fig3" ref-type="fig">Figure 3C</xref>): the LL listeners achieved comparable accuracy in the segment-and-tone and forced-segment conditions (<italic>z</italic>&#x2009;=&#x2009;1.03, <italic>p</italic>&#x2009;=&#x2009;0.324). They also processed stimuli equally rapidly across conditions except for the forced-tone condition (RT: segment-and-tone and forced-segment: <italic>z</italic>&#x2009;=&#x2009;1.09, <italic>p</italic>&#x2009;=&#x2009;0.237; segment-and-tone and segment-or-tone: <italic>z</italic>&#x2009;=&#x2009;1.49, <italic>p</italic>&#x2009;=&#x2009;0.136; segment-or-tone and forced-segment: <italic>z</italic>&#x2009;=&#x2009;0.13, <italic>p</italic>&#x2009;=&#x2009;0.891). This may be because the segment-and-tone, forced-segment, and segment-or-tone conditions allowed the LL listeners to identify stimuli on the basis of segmental information. By contrast, the LL listeners performed much more poorly in the forced-tone condition than in the segment-and-tone condition, with much lower accuracy (<italic>z</italic>&#x2009;=&#x2009;21.83, <italic>p</italic>&#x2009;&#x003C;&#x2009;0.001) and a much higher RT (<italic>z</italic>&#x2009;=&#x2009;30.39, <italic>p</italic>&#x2009;&#x003C;&#x2009;0.001). This indicates that although the LL listeners could respond quickly and accurately in a conflict-free condition with the help of segmental information, their insufficient learning experience hindered them when only tonal information was available in speech.</p>
<p>SB group (<xref rid="fig3" ref-type="fig">Figure 3B</xref>): the SB listeners achieved much lower accuracy in the forced-segment and forced-tone conditions than in the segment-and-tone condition (forced-segment and segment-and-tone: <italic>z</italic>&#x2009;=&#x2009;7.09; <italic>p</italic>&#x2009;&#x003C;&#x2009;0.001; forced-tone and segment-and-tone: <italic>z</italic>&#x2009;=&#x2009;13.54, <italic>p</italic>&#x2009;&#x003C;&#x2009;0.001). In terms of RT, the SB listeners responded faster in the segment-and-tone condition than in the other three conditions (segment-and-tone and forced-segment: <italic>z</italic>&#x2009;=&#x2009;30.42, <italic>p</italic>&#x2009;&#x003C;&#x2009;0.001; segment-and-tone and forced-tone: <italic>z</italic>&#x2009;=&#x2009;30.17, <italic>p</italic>&#x2009;&#x003C;&#x2009;0.001; segment-and-tone and segment-or-tone: <italic>z</italic>&#x2009;=&#x2009;30.46, <italic>p</italic>&#x2009;&#x003C;&#x2009;0.001). This indicates that incongruence between the tonal and segmental dimensions substantially reduced accuracy and slowed responses for the simultaneous bilinguals.</p>
<p>The CN and SB groups showed a delayed reaction (long RT) in coping with phonetic conflicts (i.e., forced-tone, forced-segment, segment-or-tone), whereas they responded quickly in the conflict-free condition (i.e., segment-and-tone). Thus, the conflicting conditions required more attentional resources from the CN and SB listeners. In contrast, the LL listeners, who were much less sensitive to lexical tones, did not need extra effort to ignore the tone-induced conflicts. They therefore did not spend additional attentional resources in the tonally conflicting conditions, such as the forced-segment and segment-or-tone conditions. In addition, the high accuracy obtained in the segment-and-tone condition across subject groups further supported the effectiveness of the experimental design.</p>
</sec>
<sec id="sec13">
<title>Attention integration of segments and tones</title>
<p>To examine participants&#x2019; performance in attention integration, we compared the forced-tone and forced-segment conditions across the subject groups. A logistic mixed-effect model and a linear mixed-effect model were fitted separately for the response type and RT, with fixed effects of subject group (CN, SB, and LL), experimental condition (forced-segment and forced-tone), and their interaction. The results of the linear and logistic models are shown in <xref rid="tab3" ref-type="table">Table 3</xref> (Model 3 and Model 4). The subject and stimuli intercepts were included as random effects in both models. In addition, the by-subject slope for condition was included in the RT model. The results showed significant main effects and a significant interaction of subject group and condition, suggesting that the CN, SB, and LL listeners might have different biases toward the segmentally and tonally induced conflicts. <italic>Post-hoc</italic> comparisons were conducted between the forced-segment and forced-tone conditions and between the subject groups.</p>
<p>CN group (<xref rid="fig3" ref-type="fig">Figure 3A</xref>): there was no statistical difference in response or RT between the forced-segment and forced-tone conditions (response: <italic>z</italic>&#x2009;=&#x2009;0.749; <italic>p</italic>&#x2009;=&#x2009;0.454; RT: <italic>z</italic>&#x2009;=&#x2009;1.68; <italic>p</italic>&#x2009;=&#x2009;0.091). However, there was a slight, non-significant tendency for CN listeners to achieve higher accuracy and shorter RT in the forced-segment condition than in the forced-tone condition.</p>
<p>LL group (<xref rid="fig3" ref-type="fig">Figure 3C</xref>): the LL listeners achieved much higher accuracy and a lower RT in the forced-segment condition than in the forced-tone condition (response: <italic>z</italic>&#x2009;=&#x2009;22.95; <italic>p</italic>&#x2009;&#x003C;&#x2009;0.001; RT: <italic>z</italic>&#x2009;=&#x2009;30.08; <italic>p</italic>&#x2009;&#x003C;&#x2009;0.001). This highlights that a segmental mismatch caused far greater perceptual difficulty for the LL listeners when processing Cantonese tones than a tonal mismatch did when processing segments.</p>
<p>SB group (<xref rid="fig3" ref-type="fig">Figure 3B</xref>): the SB listeners responded more accurately and quickly in the forced-segment condition than in the forced-tone condition (response: <italic>z</italic>&#x2009;=&#x2009;13.54, <italic>p</italic>&#x2009;&#x003C;&#x2009;0.001; RT: <italic>z</italic>&#x2009;=&#x2009;30.16; <italic>p</italic>&#x2009;&#x003C;&#x2009;0.001). This reveals that the simultaneous bilinguals encountered great difficulty in dealing with segmental conflicts during tone perception.</p>
<p>CN, SB, and LL groups: in the forced-tone condition, the SB group showed far lower accuracy and longer RTs than the CN group (response: <italic>z</italic>&#x2009;=&#x2009;12.62, <italic>p</italic>&#x2009;&#x003C;&#x2009;0.001; RT: <italic>z</italic>&#x2009;=&#x2009;30.38, <italic>p</italic>&#x2009;&#x003C;&#x2009;0.001). However, the SB group outperformed the LL group in the forced-tone condition (response: <italic>z</italic>&#x2009;=&#x2009;7.78, <italic>p</italic>&#x2009;&#x003C;&#x2009;0.001; RT: <italic>z</italic>&#x2009;=&#x2009;21.37, <italic>p</italic>&#x2009;&#x003C;&#x2009;0.001). In the forced-segment condition, the SB group performed relatively poorly compared with both the CN group (response: <italic>z</italic>&#x2009;=&#x2009;7.33, <italic>p</italic>&#x2009;&#x003C;&#x2009;0.001; RT: <italic>z</italic>&#x2009;=&#x2009;30.43, <italic>p</italic>&#x2009;&#x003C;&#x2009;0.001) and the LL group (response: <italic>z</italic>&#x2009;=&#x2009;13.00, <italic>p</italic>&#x2009;&#x003C;&#x2009;0.001; RT: <italic>z</italic>&#x2009;=&#x2009;30.40, <italic>p</italic>&#x2009;&#x003C;&#x2009;0.001).</p>
<p>Combining the results of the &#x201C;The Effectiveness of the Experimental Design&#x201D; and &#x201C;Attention Integration of Segments and Tones&#x201D; sections, we find that both the CN and SB listeners expended more attentional resources when dealing with a conflicting condition than in a conflict-free environment. The difference is that the CN listeners could still maintain high accuracy when processing conflicting stimuli, whereas the SB listeners were disturbed by the phonetic conflicts and their performance declined. For the SB listeners, the results also show that either a tonal or a segmental mismatch hindered their perception of Cantonese speech, with segmental conflicts affecting them more than tonal conflicts. With insufficient Cantonese proficiency, the LL listeners failed in the forced-tone condition, where they could not depend on the segmental information.</p>
</sec>
<sec id="sec14">
<title>Attention distribution in segments and tones</title>
<p>We compared responses across subject groups in the segment-or-tone condition to investigate how listeners redistribute attention across segments and tones. The CN listeners attached importance to both the segmental and tonal dimensions. The LL listeners relied predominantly on the segmental dimension, with a high rate of segment-based responses (above 90%). The distribution pattern of the SB listeners was intermediate between those of the LL and CN listeners, with 74% of their responses based on segments. Repeated pairwise two-proportions <italic>z</italic>-tests revealed significant differences in attention distribution among the LL, CN, and SB listeners: CN and SB: <italic>&#x03C7;</italic><sup>2</sup>&#x2009;=&#x2009;68.26, <italic>p</italic>&#x2009;&#x003C;&#x2009;0.001; CN and LL: <italic>&#x03C7;</italic><sup>2</sup>&#x2009;=&#x2009;354.71, <italic>p</italic>&#x2009;&#x003C;&#x2009;0.001; LL and SB: <italic>&#x03C7;</italic><sup>2</sup>&#x2009;=&#x2009;125.68, <italic>p</italic>&#x2009;&#x003C;&#x2009;0.001.</p>
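<p>To illustrate how a two-proportions test can yield the <italic>&#x03C7;</italic><sup>2</sup> values reported above, the following is a minimal Python sketch and not the analysis script used in this study. It assumes a pooled test without continuity correction, in which the <italic>&#x03C7;</italic><sup>2</sup> statistic (one degree of freedom) equals the square of <italic>z</italic>. The function name two_proportion_test and the trial counts are hypothetical and are not the data of the present experiment.</p>
<preformat>
```python
import math


def two_proportion_test(x1, n1, x2, n2):
    """Pooled two-proportion z-test without continuity correction.

    x1, x2: number of segment-based responses in each group (hypothetical).
    n1, n2: total number of responses in each group (hypothetical).
    Returns (z, chi2, p), where chi2 is z squared (1 degree of freedom)
    and p is the two-sided p value.
    """
    p1 = x1 / n1
    p2 = x2 / n2
    # Pooled proportion under the null hypothesis of equal proportions
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    chi2 = z * z
    # Two-sided p value from the standard normal distribution
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, chi2, p


# Hypothetical counts chosen only to mimic 58% versus 74% segment selection
z, chi2, p = two_proportion_test(58, 100, 74, 100)
print(f"z = {z:.3f}, chi2 = {chi2:.3f}, p = {p:.4f}")
```
</preformat>
<p>With the hypothetical counts used here (58 of 100 versus 74 of 100 segment-based responses), the script prints the <italic>z</italic> value, its square as the <italic>&#x03C7;</italic><sup>2</sup> statistic, and the two-sided <italic>p</italic> value. A test with continuity correction would give a slightly smaller statistic.</p>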
</sec>
</sec>
<sec id="sec15" sec-type="discussions">
<title>Discussion</title>
<p>By examining the attention integration and distribution of segments and tones, we investigated the tone-specific SATP of simultaneous Urdu-Cantonese bilinguals. The results confirmed the effectiveness of the experimental design, with the CN and SB listeners responding considerably faster in the segment-and-tone condition than in the other conditions. It is therefore reasonable to define the &#x201C;segment-and-tone&#x201D; condition as a low-attention-demanding environment for Cantonese users. The remaining three conditions can be regarded as non-optimal conditions with phonetic conflicts. Combining the findings on the integration and distribution of segments and tones, the following paragraphs discuss whether the bilinguals&#x2019; tone-specific SATP was automatic enough to help them adapt to the phonetic conflicts. We also consider the factors, in addition to simultaneous language exposure, that might alter bilinguals&#x2019; attentional performance in the current results.</p>
<sec id="sec16">
<title>The SATP developed by the Cantonese native listeners and late learners</title>
<p>Generally speaking, the CN listeners showed a high degree of cognitive adaptation to the non-optimal environments by maintaining consistently high accuracy across the first three conditions. This is in line with the ASP hypothesis that native listeners can adapt to a degraded and conflicting listening scenario (<xref ref-type="bibr" rid="ref72">Strange and Shafer, 2008</xref>; <xref ref-type="bibr" rid="ref71">Strange, 2011</xref>). Regarding attention integration, the CN listeners could integrate tonal and segmental information in the conflicting speech, since we detected similar performance in the forced-segment and forced-tone conditions for this group. The performance of the CN group showed that Cantonese native listeners ignored neither the tonally nor the segmentally induced phonetic conflicts. This result supports previous studies on Cantonese tone-segment integration (e.g., <xref ref-type="bibr" rid="ref63">Repp and Lin, 1990</xref>; <xref ref-type="bibr" rid="ref65">Schirmer et al., 2005</xref>; <xref ref-type="bibr" rid="ref46">Lin and Francis, 2014</xref>). The result is also in line with those of <xref ref-type="bibr" rid="ref12">Braun and Johnson (2011)</xref> and <xref ref-type="bibr" rid="ref84">Zou et al. (2017)</xref>, demonstrating that tonal language listeners distribute their attention across tonal and segmental dimensions when processing native speech.</p>
<p>In contrast, the LL listeners showed poor accuracy and long RTs only in the forced-tone condition and responded quickly in the other conditions. That is to say, segment-induced conflicts created a perceptual barrier for the LL listeners, while tonal conflicts exerted only a limited influence on them. This is mainly because low-proficiency learners have limited exposure to lexical tones and may be relatively insensitive to tonal conflicts. They tend to carry over the attention pattern of Urdu by relying heavily on segmental information. In addition, the segment-dependent distribution pattern detected for the LL listeners echoes that observed for the non-tonal L1 listeners by <xref ref-type="bibr" rid="ref84">Zou et al. (2017)</xref> and <xref ref-type="bibr" rid="ref12">Braun and Johnson (2011)</xref>.</p>
<p>The CN listeners might have developed a robust and mature SATP specific to Cantonese tones. The LL listeners were sensitive to the segmental dimension, while the CN listeners integrated both segments and tones. This is in line with the claim of the ASP model that native listeners can select language-specific cues robustly. The finding also supports the view that the mother tongue shapes the &#x201C;highly over-learned&#x201D; attention system (<xref ref-type="bibr" rid="ref72">Strange and Shafer, 2008</xref>; <xref ref-type="bibr" rid="ref71">Strange, 2011</xref>).</p>
</sec>
<sec id="sec17">
<title>The SATP developed by simultaneous bilinguals</title>
<p>The SB listeners could make correct and rapid responses in the segment-and-tone condition. However, a noticeable difference was found between the SB and CN listeners when processing in a conflicting environment. In the forced-tone and forced-segment conditions, the SB listeners made far more errors and consumed more attentional resources than the CN listeners. Thus, the SB listeners expended considerable attentional effort but still struggled to adapt to the phonetic conflicts. This supports the ASP model: when an optimal condition is provided, non-native listeners are likely to perform well because they have ample time to refer to their L2 knowledge and extract sufficient information to make a correct decision (<xref ref-type="bibr" rid="ref71">Strange, 2011</xref>).</p>
<p>Notably, the segment-induced and tone-induced conflicts did not impair the SB listeners&#x2019; performance to the same extent. The SB listeners showed much lower accuracy and far longer RTs in the forced-tone condition than in the forced-segment condition. This shows that segmental information was more important for the bilinguals when judging tones than tonal information was when judging segments. Hence, we hypothesize that simultaneous bilinguals might establish a weaker link between tones and segments than Cantonese native listeners do.</p>
<p>Regarding attention distribution, the SB listeners classified around 74% of stimuli on the basis of the segmental dimension, implying that they were more sensitive to segmental information than to tonal information. Thus, the SB listeners showed an intermediate pattern of attention distribution between the CN (segment-selection rate&#x2009;=&#x2009;58%) and LL (segment-selection rate&#x2009;=&#x2009;91%) listeners.</p>
<p>Generally, we observed a noticeable reduction in accuracy and a high cost in attentional resources for the SB group in conflict environments. This implies a clear distinction between the SB and CN groups in integrating and redistributing segments and tones. As previously demonstrated, the experimental design was effective. The Cantonese tones employed (T2 and T4) are distinct enough to be easily distinguished by non-native listeners. Hence, the differences in performance between the SB and CN groups most likely stem from the SB listeners&#x2019; immature development of a tone-specific SATP mechanism. Simultaneous bilinguals cannot be considered &#x201C;native&#x201D; in the same way as Cantonese native speakers are. Their non-automatic SATP may in turn influence the integration and selection of tonal information to interpret words and sentences in daily conversations. The results corroborate the statements in <xref ref-type="bibr" rid="ref5">Antoniou et al. (2012)</xref>, illustrating that bilingual speakers should be treated as a speaker group distinct from monolinguals since they have to accommodate their perceptual system to more than one language.</p>
<p>In <xref ref-type="bibr" rid="ref47">Liu and Ning (2021)</xref>, the SATP of sequential Urdu-Cantonese bilingual teenagers lagged behind that of the Cantonese native listeners. The current study confirms this attentional divergence between early bilinguals and native listeners and extends the claim to simultaneous bilinguals. This is possibly because, with reduced experience in the language (<xref ref-type="bibr" rid="ref13">Byers-Heinlein and Fennell, 2014</xref>), the attention systems of even early simultaneous and sequential bilinguals may slow down in order to navigate two phonological systems (<xref ref-type="bibr" rid="ref67">Shafer et al., 2011</xref>; <xref ref-type="bibr" rid="ref36">Hisagi et al., 2015</xref>; <xref ref-type="bibr" rid="ref41">Kalashnikova et al., 2021</xref>), especially in a non-optimal listening condition (<xref ref-type="bibr" rid="ref72">Strange and Shafer, 2008</xref>; <xref ref-type="bibr" rid="ref71">Strange, 2011</xref>).</p>
</sec>
<sec id="sec18">
<title>Contribution to the ASP model: Evidence from an endpoint of age in tone acquisition</title>
<p>The current study contributes to the ASP model in two ways. First, it highlights that SATP is language-specific. Native listeners extract L1-specific information automatically in complex speech contexts. In contrast, non-native listeners may find it effortful to redistribute and integrate attention to non-native cues, especially those that are not used linguistically in their L1 (<xref ref-type="bibr" rid="ref72">Strange and Shafer, 2008</xref>; <xref ref-type="bibr" rid="ref19">Costa et al., 2009</xref>; <xref ref-type="bibr" rid="ref71">Strange, 2011</xref>; <xref ref-type="bibr" rid="ref41">Kalashnikova et al., 2021</xref>). Second, it depicts a tone-specific attention mechanism from the angle of an endpoint in the age of tone acquisition. The simultaneous bilingual students in the current study were exposed to Urdu and Cantonese from the start of their lives through their multilingual parents. Even after years of regular Cantonese instruction from kindergarten to secondary school in Hong Kong, they could not adapt to a phonetically conflicting condition when perceiving Cantonese tones. In other words, exposure from the first year of life is insufficient to guarantee that simultaneous bilinguals establish a mature attentional mechanism for tone acquisition. This suggests that, in addition to simultaneous exposure, further factors may jointly contribute to the development of a tone-specific SATP system in simultaneous bilinguals.</p>
<sec id="sec19">
<title>Interference from an Urdu-specific SATP</title>
<p>The ASP model predicts that bilinguals may use L1-specific SATP as a lens for weighting L2 cues (<xref ref-type="bibr" rid="ref28">Ellis, 2006</xref>; <xref ref-type="bibr" rid="ref74">Trofimovich, 2008</xref>; <xref ref-type="bibr" rid="ref75">Verbeek et al., 2022</xref>). Due to the initial attentional transfer from Urdu, bilinguals may be constrained to rely overly on segments. In the current results, the SB listeners showed an intermediate performance between the CN and LL groups regarding attention distribution and integration. In addition, segmental violations interfered with their performance more than tonal ones did. Presumably, the SB listeners had difficulty inhibiting an Urdu attentional pattern when perceiving Cantonese tones. We hypothesize that the initial SATP transfer is likely to continue to affect simultaneous bilinguals as they enter late childhood or pre-adolescence.</p>
</sec>
<sec id="sec20">
<title>Individual differences in tone-related experience</title>
<p>According to <xref ref-type="bibr" rid="ref47">Liu and Ning&#x2019;s (2021)</xref> findings, there was a greater than 70% chance that the early sequential bilinguals who were dominant in Urdu (L1) used segments to distinguish Cantonese (L2) speech, whereas the corresponding figure was more than 60% for the Cantonese-dominant bilinguals. As shown previously, the simultaneous bilinguals in the current study were estimated to be balanced users of Urdu and Cantonese. Nevertheless, they allocated a similar amount of attention to segments (segment-selection rate of around 74%) as the Urdu-dominant bilinguals. Thus, in addition to simultaneous exposure, language use and proficiency are of great importance to the development of tone-specific SATP (<xref ref-type="bibr" rid="ref70">Steinhauer et al., 2009</xref>; <xref ref-type="bibr" rid="ref69">Steinhauer, 2014</xref>; <xref ref-type="bibr" rid="ref77">White et al., 2017</xref>). In the current study, the bilingual dominance of the simultaneous bilinguals was calculated based on a relative comparison of Urdu and Cantonese rating scores with the BLP method. However, according to participants&#x2019; self-reports, the bilinguals might also use English across social contexts, owing to the multilingualism of Hong Kong society. As previously highlighted, the simultaneous bilingual participants rated their use of English (Likert score&#x2009;=&#x2009;6.23) even higher than their use of Cantonese (Likert score&#x2009;=&#x2009;4.81) and Urdu (Likert score&#x2009;=&#x2009;5.14) (all <italic>p</italic>&#x2009;&#x003C;&#x2009;0.001). Extensive use of English, a non-tonal language, is not directly helpful in enhancing bilinguals&#x2019; sensitivity to tonal features. Insufficient tone use may prevent simultaneous bilinguals from establishing the automaticity of the SATP system.</p>
</sec>
<sec id="sec21">
<title>Differences between the Cantonese and Mandarin-specific SATP</title>
<p>According to <xref ref-type="bibr" rid="ref84">Zou et al. (2017)</xref>, the Dutch-speaking advanced late learners of Mandarin obtained an accuracy comparable to that of Mandarin native listeners when integrating and redistributing segments and tones. However, there was a slight trend for both beginners and advanced learners to respond more slowly than native Mandarin speakers. Compared to the advanced late learners in <xref ref-type="bibr" rid="ref84">Zou et al. (2017)</xref>, the simultaneous bilinguals in the current study demonstrated weaker attention integration and distribution patterns. Hence, the development of tone-specific SATP appears to be delayed in Cantonese acquisition. According to studies on tone learning by South-Asian students in Hong Kong (<xref ref-type="bibr" rid="ref79">Wong and Leung, 2018</xref>; <xref ref-type="bibr" rid="ref83">Yao et al., 2020</xref>), early Cantonese learners persistently made errors in perceiving and producing tones even in optimal conditions, where little attentional effort was required. Thus, the delayed acquisition of Cantonese tones (also mentioned in <xref ref-type="bibr" rid="ref83">Yao et al., 2020</xref>) will increase the likelihood that the initial SATP transfer affects tone learning in later stages of life for simultaneous bilinguals.</p>
</sec>
<sec id="sec22">
<title>Differences between tone and segment-focused SATP studies</title>
<p>This study used the falling and rising tones, which are documented to be easily discriminated by learners under optimal listening conditions (<xref ref-type="bibr" rid="ref30">Francis et al., 2003</xref>; <xref ref-type="bibr" rid="ref60">Qin and Mok, 2011</xref>; <xref ref-type="bibr" rid="ref53">Mok et al., 2018</xref>). Like <xref ref-type="bibr" rid="ref47">Liu and Ning (2021)</xref> and <xref ref-type="bibr" rid="ref84">Zou et al. (2017)</xref>, the current study suggests that a phonetically conflicting environment can cause specific perceptual barriers for bilinguals in perceiving lexical tones, even though the target tones are theoretically easier for language learners to acquire. In contrast, as introduced previously, segment-focused SATP research has reported that early bilinguals fail to adapt to a phonetically conflicting condition only when they are distinguishing difficult-to-detect segmental contrasts (<xref ref-type="bibr" rid="ref57">Navarra et al., 2005</xref>; <xref ref-type="bibr" rid="ref36">Hisagi et al., 2015</xref>; <xref ref-type="bibr" rid="ref82">Yan et al., 2019</xref>), and not when they are distinguishing easy-to-detect ones (<xref ref-type="bibr" rid="ref55">Molnar et al., 2014</xref>; <xref ref-type="bibr" rid="ref22">Datta et al., 2020</xref>). Hence, there is an apparent discrepancy between the results of the above tone studies and those of the segment-based SATP studies. One possible reason is that the different experimental designs (e.g., monosyllabic or disyllabic stimuli, words or non-words, task types, etc.) in tone-focused and segment-focused studies may directly lead to different results. The other is that there may be a lag in language learners&#x2019; acquisition of tones. For example, by investigating tone acquisition under an optimal condition by ethnic minority children in Hong Kong, <xref ref-type="bibr" rid="ref83">Yao et al. (2020)</xref> found that tonal errors are more severe than segmental errors in bilinguals&#x2019; production and perception. Similarly, in <xref ref-type="bibr" rid="ref16">Chen et al.&#x2019;s (2016)</xref> large corpus study, adult learners made many more tone errors than segment errors.</p>
<p>Future studies could incorporate more pairs of tonal contrasts to examine whether there is an interplay between attentional inhibition and tone typology when bilinguals accommodate to a non-optimal listening condition. For example, in Cantonese, T2&#x2013;T5 and T3&#x2013;T6 are difficult-to-detect tone contrasts (e.g., <xref ref-type="bibr" rid="ref60">Qin and Mok, 2011</xref>; <xref ref-type="bibr" rid="ref53">Mok et al., 2018</xref>). It would thus be interesting to examine whether and how different learners adapt to a non-optimal listening condition when perceiving T2&#x2013;T4, T2&#x2013;T5, or T3&#x2013;T6 contrasts.</p>
</sec>
</sec>
</sec>
<sec id="sec23" sec-type="conclusions">
<title>Conclusion</title>
<p>Optimal and non-optimal listening conditions were presented to listeners to investigate how simultaneous exposure influences bilinguals&#x2019; distribution and integration of selective attention when processing Cantonese tones. The results showed that the simultaneous bilinguals could process Cantonese speech accurately and quickly when both segmental and tonal dimensions were provided (i.e., segment-and-tone). However, they were more likely to retain an Urdu-like attentional strategy in processing Cantonese tones, especially when the segmental dimension of the stimuli was mismatched. The current study provides evidence for the ASP model through tone acquisition by simultaneous bilinguals. It also hypothesizes that the development of simultaneous bilinguals&#x2019; tone-specific attention system could result from various factors, including individual differences in tone-related experience, language-specific differences, and L1 inhibition, in addition to an early learning age.</p>
</sec>
<sec id="sec24" sec-type="data-availability">
<title>Data availability statement</title>
<p>The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.</p>
</sec>
<sec id="sec25">
<title>Ethics statement</title>
<p>The studies involving human participants were reviewed and approved by the Human Subjects Ethics Subcommittee of the Hong Kong Polytechnic University. Written informed consent to participate in this study was provided by the participants&#x2019; legal guardian/next of kin.</p>
</sec>
<sec id="sec26">
<title>Author contributions</title>
<p>JN, GP, and YL contributed equally to the experimental design, the conduct of the experiment, data analysis, and manuscript drafting. YLN also made a significant contribution to the manuscript drafting and revision. All authors contributed to the article and approved the submitted version.</p>
</sec>
<sec id="sec27" sec-type="funding-information">
<title>Funding</title>
<p>This work was funded by the Language Fund under Research and Development Projects 2018&#x2013;19 of the Standing Committee on Language Education and Research (SCOLAR), Hong Kong SAR, and the Hong Kong Polytechnic University Projects of 4-88F3 and G-UALY. The work described in this paper was partially supported by a fellowship award from the Research Grants Council of the Hong Kong Special Administrative Region, China (Project No. PolyU/RFS2122-5H01).</p>
</sec>
<sec id="conf1" sec-type="COI-statement">
<title>Conflict of interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec id="sec100" sec-type="disclaimer">
<title>Publisher&#x2019;s note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
</body>
<back>
<ref-list>
<title>References</title>
<ref id="ref1"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Abbasi</surname> <given-names>A. M.</given-names></name> <name><surname>Channa</surname> <given-names>M. A.</given-names></name> <name><surname>Kakepoto</surname> <given-names>I.</given-names></name> <name><surname>Ali</surname> <given-names>R.</given-names></name> <name><surname>Mehmood</surname> <given-names>M.</given-names></name></person-group> (<year>2017</year>). <article-title>A perceptual study of phonological variations in Pakistani English</article-title>. <source>Int. J. Eng. Linguist.</source> <volume>8</volume>, <fpage>92</fpage>&#x2013;<lpage>100</lpage>. doi: <pub-id pub-id-type="doi">10.5539/ijel.v8n2p92</pub-id></citation></ref>
<ref id="ref2"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Akkharasena</surname> <given-names>K.</given-names></name></person-group> (<year>2018</year>). <article-title>Production of Bangkok Thai tones by native speakers of Burmese and Urdu</article-title>. <source>Vacana</source> <volume>3</volume>, <fpage>1</fpage>&#x2013;<lpage>23</lpage>. Available at: <ext-link xlink:href="http://rs.mfu.ac.th/ojs/index.php/vacana/article/view/155/97" ext-link-type="uri">http://rs.mfu.ac.th/ojs/index.php/vacana/article/view/155/97</ext-link> (Accessed August 23, 2022).</citation></ref>
<ref id="ref3"><citation citation-type="other"><person-group person-group-type="author"><name><surname>Aldrich</surname> <given-names>A.</given-names></name></person-group> (<year>2020</year>). Adult early-bilingual speech rhythm: evidence from Spanish and English. In <italic>Proceedings of the 10th International Conference on Speech Prosody 2020</italic> (pp. 528&#x2013;532). Tokyo, Japan. doi:<pub-id pub-id-type="doi">10.21437/SpeechProsody.2020-108</pub-id></citation></ref>
<ref id="ref4"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Amengual</surname> <given-names>M.</given-names></name></person-group> (<year>2016</year>). <article-title>The perception of language-specific phonetic categories does not guarantee accurate phonological representations in the lexicon of early bilinguals</article-title>. <source>Appl. Psycholinguist.</source> <volume>37</volume>, <fpage>1221</fpage>&#x2013;<lpage>1251</lpage>. doi: <pub-id pub-id-type="doi">10.1017/S0142716415000557</pub-id></citation></ref>
<ref id="ref5"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Antoniou</surname> <given-names>M.</given-names></name> <name><surname>Tyler</surname> <given-names>M. D.</given-names></name> <name><surname>Best</surname> <given-names>C. T.</given-names></name></person-group> (<year>2012</year>). <article-title>Two ways to listen: do L2-dominant bilinguals perceive stop voicing according to language mode?</article-title> <source>J. Phon.</source> <volume>40</volume>, <fpage>582</fpage>&#x2013;<lpage>594</lpage>. doi: <pub-id pub-id-type="doi">10.1016/j.wocn.2012.05.005</pub-id>, PMID: <pub-id pub-id-type="pmid">22844163</pub-id></citation></ref>
<ref id="ref6"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Baayen</surname> <given-names>R. H.</given-names></name> <name><surname>Davidson</surname> <given-names>D. J.</given-names></name> <name><surname>Bates</surname> <given-names>D. M.</given-names></name></person-group> (<year>2008</year>). <article-title>Mixed-effects modeling with crossed random effects for participants and items</article-title>. <source>J. Mem. Lang.</source> <volume>59</volume>, <fpage>390</fpage>&#x2013;<lpage>412</lpage>. doi: <pub-id pub-id-type="doi">10.1016/j.jml.2007.12.005</pub-id></citation></ref>
<ref id="ref7"><citation citation-type="other"><person-group person-group-type="author"><name><surname>Barto&#x0144;</surname> <given-names>K.</given-names></name></person-group> (<year>2015</year>). MuMIn: multi-model inference. R package version 1.15.1. Available at: <ext-link xlink:href="http://CRAN.R-project.org/package=MuMIn" ext-link-type="uri">http://CRAN.R-project.org/package=MuMIn</ext-link> (Accessed August 20, 2022).</citation></ref>
<ref id="ref8"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bates</surname> <given-names>D.</given-names></name> <name><surname>Maechler</surname> <given-names>M.</given-names></name> <name><surname>Bolker</surname> <given-names>B.</given-names></name> <name><surname>Walker</surname> <given-names>S.</given-names></name></person-group> (<year>2015</year>). <article-title>Fitting linear mixed-effects models using lme4</article-title>. <source>J. Stat. Softw.</source> <volume>67</volume>, <fpage>1</fpage>&#x2013;<lpage>48</lpage>. doi: <pub-id pub-id-type="doi">10.18637/jss.v067.i01</pub-id></citation></ref>
<ref id="ref9"><citation citation-type="other"><person-group person-group-type="author"><name><surname>Birdsong</surname> <given-names>D.</given-names></name> <name><surname>Gertken</surname> <given-names>L. M.</given-names></name> <name><surname>Amengual</surname> <given-names>M.</given-names></name></person-group> (<year>2012</year>). Bilingual language profile: an easy-to-use instrument to assess bilingualism. COERLL, <italic>University of Texas at Austin</italic>. Web. January 20, 2012. Available at: <ext-link xlink:href="https://sites.la.utexas.edu/bilingual/" ext-link-type="uri">https://sites.la.utexas.edu/bilingual/</ext-link> (Accessed August 20, 2022).</citation></ref>
<ref id="ref10"><citation citation-type="other"><person-group person-group-type="author"><name><surname>Boersma</surname> <given-names>P.</given-names></name> <name><surname>Weenink</surname> <given-names>D.</given-names></name></person-group> (<year>2014</year>). Praat: doing phonetics by computer. Available at: <ext-link xlink:href="http://www.praat.org" ext-link-type="uri">http://www.praat.org</ext-link> (Accessed September 14, 2020).</citation></ref>
<ref id="ref11"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bradlow</surname> <given-names>A. R.</given-names></name> <name><surname>Alexander</surname> <given-names>J. A.</given-names></name></person-group> (<year>2007</year>). <article-title>Semantic and phonetic enhancements for speech-in-noise recognition by native and non-native listeners</article-title>. <source>J. Acoust. Soc. Am.</source> <volume>121</volume>, <fpage>2339</fpage>&#x2013;<lpage>2349</lpage>. doi: <pub-id pub-id-type="doi">10.1121/1.2642103</pub-id>, PMID: <pub-id pub-id-type="pmid">17471746</pub-id></citation></ref>
<ref id="ref12"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Braun</surname> <given-names>B.</given-names></name> <name><surname>Johnson</surname> <given-names>E. K.</given-names></name></person-group> (<year>2011</year>). <article-title>Question or tone 2? How language experience and linguistic function guide pitch processing</article-title>. <source>J. Phon.</source> <volume>39</volume>, <fpage>585</fpage>&#x2013;<lpage>594</lpage>. doi: <pub-id pub-id-type="doi">10.1016/j.wocn.2011.06.002</pub-id></citation></ref>
<ref id="ref13"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Byers-Heinlein</surname> <given-names>K.</given-names></name> <name><surname>Fennell</surname> <given-names>C. T.</given-names></name></person-group> (<year>2014</year>). <article-title>Perceptual narrowing in the context of increased variation: insights from bilingual infants</article-title>. <source>Dev. Psychobiol.</source> <volume>56</volume>, <fpage>274</fpage>&#x2013;<lpage>291</lpage>. doi: <pub-id pub-id-type="doi">10.1002/dev.21167</pub-id>, PMID: <pub-id pub-id-type="pmid">24114364</pub-id></citation></ref>
<ref id="ref14"><citation citation-type="other"><person-group person-group-type="author"><collab id="coll1">Census and Statistics Department of HKSAR</collab></person-group> (<year>2017</year>). 2016 population census thematic report: ethnic minorities. Retrieved from website of the census and statistics department. Available at: <ext-link xlink:href="https://www.statistics.gov.hk/pub/B11201002016XXXXB0100.pdf" ext-link-type="uri">https://www.statistics.gov.hk/pub/B11201002016XXXXB0100.pdf</ext-link> (Accessed August 20, 2022).</citation></ref>
<ref id="ref15"><citation citation-type="other"><person-group person-group-type="author"><collab id="coll2">Census and Statistics Department of HKSAR</collab></person-group> (<year>2021</year>). Population census report&#x2013;summary results. Retrieved from website of the census and statistics department. Available at: <ext-link xlink:href="https://www.census2021.gov.hk/tc/main_tables.html" ext-link-type="uri">https://www.census2021.gov.hk/tc/main_tables.html</ext-link> (Accessed August 20, 2022).</citation></ref>
<ref id="ref16"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chen</surname> <given-names>N. F.</given-names></name> <name><surname>Wee</surname> <given-names>D.</given-names></name> <name><surname>Tong</surname> <given-names>R.</given-names></name> <name><surname>Ma</surname> <given-names>B.</given-names></name> <name><surname>Li</surname> <given-names>H.</given-names></name></person-group> (<year>2016</year>). <article-title>Large-scale characterization of non-native Mandarin Chinese spoken by speakers of European origin: analysis on iCALL</article-title>. <source>Speech Comm.</source> <volume>84</volume>, <fpage>46</fpage>&#x2013;<lpage>56</lpage>. doi: <pub-id pub-id-type="doi">10.1016/j.specom.2016.07.005</pub-id></citation></ref>
<ref id="ref17"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chen</surname> <given-names>S.</given-names></name> <name><surname>Zhu</surname> <given-names>Y.</given-names></name> <name><surname>Wayland</surname> <given-names>R.</given-names></name></person-group> (<year>2017</year>). <article-title>Effects of stimulus duration and vowel quality in cross-linguistic categorical perception of pitch directions</article-title>. <source>PLoS One</source> <volume>12</volume>:<fpage>e0180656</fpage>. doi: <pub-id pub-id-type="doi">10.1371/journal.pone.0180656</pub-id>, PMID: <pub-id pub-id-type="pmid">28671991</pub-id></citation></ref>
<ref id="ref18"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cheung</surname> <given-names>K. C. K.</given-names></name> <name><surname>Chou</surname> <given-names>K. L.</given-names></name></person-group> (<year>2018</year>). <article-title>Child poverty among Hong Kong ethnic minorities</article-title>. <source>Soc. Indic. Res.</source> <volume>137</volume>, <fpage>93</fpage>&#x2013;<lpage>112</lpage>. doi: <pub-id pub-id-type="doi">10.1007/s11205-017-1599-z</pub-id></citation></ref>
<ref id="ref19"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Costa</surname> <given-names>A.</given-names></name> <name><surname>Hern&#x00E1;ndez</surname> <given-names>M.</given-names></name> <name><surname>Costa-Faidella</surname> <given-names>J.</given-names></name> <name><surname>Sebasti&#x00E1;n-Gall&#x00E9;s</surname> <given-names>N.</given-names></name></person-group> (<year>2009</year>). <article-title>On the bilingual advantage in conflict processing: now you see it, now you don&#x2019;t</article-title>. <source>Cognition</source> <volume>113</volume>, <fpage>135</fpage>&#x2013;<lpage>149</lpage>. doi: <pub-id pub-id-type="doi">10.1016/j.cognition.2009.08.001</pub-id>, PMID: <pub-id pub-id-type="pmid">19729156</pub-id></citation></ref>
<ref id="ref20"><citation citation-type="other"><person-group person-group-type="author"><name><surname>Courtney</surname> <given-names>L. M.</given-names></name></person-group> (<year>2014</year>). Moving from primary to secondary education: an investigation into the effect of primary to secondary transition on motivation for language learning and foreign language proficiency (Doctoral dissertation, University of Southampton).</citation></ref>
<ref id="ref21"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Crandell</surname> <given-names>C. C.</given-names></name> <name><surname>Smaldino</surname> <given-names>J. J.</given-names></name></person-group> (<year>2000</year>). <article-title>Classroom acoustics for children with normal hearing and with hearing impairment</article-title>. <source>Lang. Speech Hear. Serv. Sch.</source> <volume>31</volume>, <fpage>362</fpage>&#x2013;<lpage>370</lpage>. doi: <pub-id pub-id-type="doi">10.1044/0161-1461.3104.362</pub-id>, PMID: <pub-id pub-id-type="pmid">27764475</pub-id></citation></ref>
<ref id="ref22"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Datta</surname> <given-names>H.</given-names></name> <name><surname>Hestvik</surname> <given-names>A.</given-names></name> <name><surname>Vidal</surname> <given-names>N.</given-names></name> <name><surname>Tessel</surname> <given-names>C.</given-names></name> <name><surname>Hisagi</surname> <given-names>M.</given-names></name> <name><surname>Wr&#x00F3;blewski</surname> <given-names>M.</given-names></name> <etal/></person-group>. (<year>2020</year>). <article-title>Automaticity of speech processing in early bilingual adults and children</article-title>. <source>Biling. Lang. Cogn.</source> <volume>23</volume>, <fpage>429</fpage>&#x2013;<lpage>445</lpage>. doi: <pub-id pub-id-type="doi">10.1017/S1366728919000099</pub-id>, PMID: <pub-id pub-id-type="pmid">32905492</pub-id></citation></ref>
<ref id="ref23"><citation citation-type="book"><person-group person-group-type="author"><name><surname>De Houwer</surname> <given-names>A.</given-names></name></person-group> (<year>1990</year>). <source>The Acquisition of Two Languages From Birth: A Case Study</source>. <publisher-loc>Cambridge</publisher-loc>: <publisher-name>Cambridge University Press</publisher-name>.</citation></ref>
<ref id="ref24"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Dollmann</surname> <given-names>J.</given-names></name> <name><surname>Kogan</surname> <given-names>I.</given-names></name> <name><surname>Wei&#x00DF;mann</surname> <given-names>M.</given-names></name></person-group> (<year>2020</year>). <article-title>Speaking accent-free in L2 beyond the critical period: the compensatory role of individual abilities and opportunity structures</article-title>. <source>Appl. Linguis.</source> <volume>41</volume>, <fpage>787</fpage>&#x2013;<lpage>809</lpage>. doi: <pub-id pub-id-type="doi">10.1093/applin/amz029</pub-id></citation></ref>
<ref id="ref25"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Dong</surname> <given-names>Y.</given-names></name> <name><surname>Peng</surname> <given-names>C. Y. J.</given-names></name></person-group> (<year>2013</year>). <article-title>Principled missing data methods for researchers</article-title>. <source>Springerplus</source> <volume>2</volume>, <fpage>1</fpage>&#x2013;<lpage>17</lpage>. doi: <pub-id pub-id-type="doi">10.1186/2193-1801-2-222</pub-id>, PMID: <pub-id pub-id-type="pmid">23853744</pub-id></citation></ref>
<ref id="ref26"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Dupoux</surname> <given-names>E.</given-names></name> <name><surname>Peperkamp</surname> <given-names>S.</given-names></name> <name><surname>Sebasti&#x00E1;n-Gall&#x00E9;s</surname> <given-names>N.</given-names></name></person-group> (<year>2010</year>). <article-title>Limits on bilingualism revisited: stress &#x2018;deafness&#x2019; in simultaneous French&#x2013;Spanish bilinguals</article-title>. <source>Cognition</source> <volume>114</volume>, <fpage>266</fpage>&#x2013;<lpage>275</lpage>. doi: <pub-id pub-id-type="doi">10.1016/j.cognition.2009.10.001</pub-id>, PMID: <pub-id pub-id-type="pmid">19896647</pub-id></citation></ref>
<ref id="ref27"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Dupoux</surname> <given-names>E.</given-names></name> <name><surname>Sebasti&#x00E1;n-Gall&#x00E9;s</surname> <given-names>N.</given-names></name> <name><surname>Navarrete</surname> <given-names>E.</given-names></name> <name><surname>Peperkamp</surname> <given-names>S.</given-names></name></person-group> (<year>2008</year>). <article-title>Persistent stress &#x2018;deafness&#x2019;: the case of French learners of Spanish</article-title>. <source>Cognition</source> <volume>106</volume>, <fpage>682</fpage>&#x2013;<lpage>706</lpage>. doi: <pub-id pub-id-type="doi">10.1016/j.cognition.2007.04.001</pub-id>, PMID: <pub-id pub-id-type="pmid">17592731</pub-id></citation></ref>
<ref id="ref28"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ellis</surname> <given-names>R.</given-names></name></person-group> (<year>2006</year>). <article-title>Current issues in the teaching of grammar: an SLA perspective</article-title>. <source>TESOL Q.</source> <volume>40</volume>, <fpage>83</fpage>&#x2013;<lpage>107</lpage>. doi: <pub-id pub-id-type="doi">10.2307/40264512</pub-id></citation></ref>
<ref id="ref29"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Figueiredo</surname> <given-names>S. A. D. B.</given-names></name> <name><surname>Da Silva</surname> <given-names>C. F.</given-names></name></person-group> (<year>2009</year>). <article-title>Cognitive differences in second language learners and the critical period effects</article-title>. <source>L1-Educ. Stud. Lang. Lit.</source> <volume>9</volume>, <fpage>157</fpage>&#x2013;<lpage>178</lpage>. doi: <pub-id pub-id-type="doi">10.17239/L1ESLL-2009.09.04.05</pub-id></citation></ref>
<ref id="ref30"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Francis</surname> <given-names>A. L.</given-names></name> <name><surname>Ciocca</surname> <given-names>V.</given-names></name> <name><surname>Chit Ng</surname> <given-names>B. K.</given-names></name></person-group> (<year>2003</year>). <article-title>On the (non) categorical perception of lexical tones</article-title>. <source>Percept. Psychophys.</source> <volume>65</volume>, <fpage>1029</fpage>&#x2013;<lpage>1044</lpage>. doi: <pub-id pub-id-type="doi">10.3758/bf03194832</pub-id></citation></ref>
<ref id="ref31"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Francis</surname> <given-names>A. L.</given-names></name> <name><surname>Nusbaum</surname> <given-names>H. C.</given-names></name></person-group> (<year>2002</year>). <article-title>Selective attention and the acquisition of new phonetic categories</article-title>. <source>J. Exp. Psychol. Hum. Percept. Perform.</source> <volume>28</volume>, <fpage>349</fpage>&#x2013;<lpage>366</lpage>. doi: <pub-id pub-id-type="doi">10.1037/0096-1523.28.2.349</pub-id>, PMID: <pub-id pub-id-type="pmid">11999859</pub-id></citation></ref>
<ref id="ref32"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Garc&#x00ED;a</surname> <given-names>P. B.</given-names></name> <name><surname>Froud</surname> <given-names>K.</given-names></name></person-group> (<year>2018</year>). <article-title>Perception of American English vowels by sequential Spanish&#x2013;English bilinguals</article-title>. <source>Biling. Lang. Cogn.</source> <volume>21</volume>, <fpage>80</fpage>&#x2013;<lpage>103</lpage>. doi: <pub-id pub-id-type="doi">10.1017/S1366728916000808</pub-id>, PMID: <pub-id pub-id-type="pmid">29449782</pub-id></citation></ref>
<ref id="ref33"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hall&#x00E9;</surname> <given-names>P. A.</given-names></name> <name><surname>Chang</surname> <given-names>Y. C.</given-names></name> <name><surname>Best</surname> <given-names>C. T.</given-names></name></person-group> (<year>2004</year>). <article-title>Identification and discrimination of Mandarin Chinese tones by Mandarin Chinese vs. French listeners</article-title>. <source>J. Phon.</source> <volume>32</volume>, <fpage>395</fpage>&#x2013;<lpage>421</lpage>. doi: <pub-id pub-id-type="doi">10.1016/S0095-4470(03)00016-0</pub-id></citation></ref>
<ref id="ref34"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hazan</surname> <given-names>V.</given-names></name> <name><surname>Barrett</surname> <given-names>S.</given-names></name></person-group> (<year>2000</year>). <article-title>The development of phonemic categorization in children aged 6&#x2013;12</article-title>. <source>J. Phon.</source> <volume>28</volume>, <fpage>377</fpage>&#x2013;<lpage>396</lpage>. doi: <pub-id pub-id-type="doi">10.1006/jpho.2000.0121</pub-id></citation></ref>
<ref id="ref35"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hendry</surname> <given-names>A.</given-names></name> <name><surname>Jones</surname> <given-names>E. J.</given-names></name> <name><surname>Charman</surname> <given-names>T.</given-names></name></person-group> (<year>2016</year>). <article-title>Executive function in the first three years of life: precursors, predictors and patterns</article-title>. <source>Dev. Rev.</source> <volume>42</volume>, <fpage>1</fpage>&#x2013;<lpage>33</lpage>. doi: <pub-id pub-id-type="doi">10.1016/j.dr.2016.06.005</pub-id></citation></ref>
<ref id="ref36"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hisagi</surname> <given-names>M.</given-names></name> <name><surname>Garrido-Nag</surname> <given-names>K.</given-names></name> <name><surname>Datta</surname> <given-names>H.</given-names></name> <name><surname>Shafer</surname> <given-names>V. L.</given-names></name></person-group> (<year>2015</year>). <article-title>ERP indices of vowel processing in Spanish&#x2013;English bilinguals</article-title>. <source>Biling. Lang. Cogn.</source> <volume>18</volume>, <fpage>271</fpage>&#x2013;<lpage>289</lpage>. doi: <pub-id pub-id-type="doi">10.1017/S1366728914000170</pub-id></citation></ref>
<ref id="ref37"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hothorn</surname> <given-names>T.</given-names></name> <name><surname>Bretz</surname> <given-names>F.</given-names></name> <name><surname>Westfall</surname> <given-names>P.</given-names></name></person-group> (<year>2008</year>). <article-title>Simultaneous inference in general parametric models</article-title>. <source>Biom. J.</source> <volume>50</volume>, <fpage>346</fpage>&#x2013;<lpage>363</lpage>. doi: <pub-id pub-id-type="doi">10.1002/bimj.200810425</pub-id>, PMID: <pub-id pub-id-type="pmid">18481363</pub-id></citation></ref>
<ref id="ref38"><citation citation-type="other"><person-group person-group-type="author"><name><surname>Jabeen</surname> <given-names>F.</given-names></name></person-group> (<year>2019</year>). Interpretation of LH intonation contour in Urdu/Hindi. In <italic>Proceedings of the International Congress of Phonetic Sciences</italic>, Melbourne.</citation></ref>
<ref id="ref39"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Jabeen</surname> <given-names>F.</given-names></name> <name><surname>Hussain</surname> <given-names>S.</given-names></name></person-group> (<year>2012</year>). <article-title>The pitch contour of declarative sentences in Urdu language</article-title>. <source>Lang. Technol.</source> <volume>17</volume>, <fpage>17</fpage>&#x2013;<lpage>27</lpage>. Available at: <ext-link xlink:href="http://www.assta.org/proceedings/ICPhS2019/papers/ICPhS_3871.pdf" ext-link-type="uri">http://www.assta.org/proceedings/ICPhS2019/papers/ICPhS_3871.pdf</ext-link> (Accessed August 23, 2022).</citation></ref>
<ref id="ref40"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Jakobsen</surname> <given-names>J. C.</given-names></name> <name><surname>Gluud</surname> <given-names>C.</given-names></name> <name><surname>Wetterslev</surname> <given-names>J.</given-names></name> <name><surname>Winkel</surname> <given-names>P.</given-names></name></person-group> (<year>2017</year>). <article-title>When and how should multiple imputation be used for handling missing data in randomised clinical trials &#x2013; a practical guide with flowcharts</article-title>. <source>BMC Med. Res. Methodol.</source> <volume>17</volume>:<fpage>162</fpage>. doi: <pub-id pub-id-type="doi">10.1186/s12874-017-0442-1</pub-id>, PMID: <pub-id pub-id-type="pmid">29207961</pub-id></citation></ref>
<ref id="ref41"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kalashnikova</surname> <given-names>M.</given-names></name> <name><surname>Pejovic</surname> <given-names>J.</given-names></name> <name><surname>Carreiras</surname> <given-names>M.</given-names></name></person-group> (<year>2021</year>). <article-title>The effects of bilingualism on attentional processes in the first year of life</article-title>. <source>Dev. Sci.</source> <volume>25</volume>:<fpage>e13011</fpage>. doi: <pub-id pub-id-type="doi">10.1111/desc.13139</pub-id>, PMID: <pub-id pub-id-type="pmid">34235805</pub-id></citation></ref>
<ref id="ref42"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kim</surname> <given-names>H. Y.</given-names></name></person-group> (<year>2013</year>). <article-title>Statistical notes for clinical researchers: assessing normal distribution (2) using skewness and kurtosis</article-title>. <source>Restor. Dent. Endod.</source> <volume>38</volume>, <fpage>52</fpage>&#x2013;<lpage>54</lpage>. doi: <pub-id pub-id-type="doi">10.5395/rde.2013.38.1.52</pub-id>, PMID: <pub-id pub-id-type="pmid">23495371</pub-id></citation></ref>
<ref id="ref43"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kuhl</surname> <given-names>P. K.</given-names></name></person-group> (<year>1993</year>). <article-title>Early linguistic experience and phonetic perception: implications for theories of developmental speech perception</article-title>. <source>J. Phon.</source> <volume>21</volume>, <fpage>125</fpage>&#x2013;<lpage>139</lpage>. doi: <pub-id pub-id-type="doi">10.1016/S0095-4470(19)31326-9</pub-id></citation></ref>
<ref id="ref44"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kuhl</surname> <given-names>P. K.</given-names></name> <name><surname>Williams</surname> <given-names>K. A.</given-names></name> <name><surname>Lacerda</surname> <given-names>F.</given-names></name> <name><surname>Stevens</surname> <given-names>K. N.</given-names></name> <name><surname>Lindblom</surname> <given-names>B.</given-names></name></person-group> (<year>1992</year>). <article-title>Linguistic experience alters phonetic perception in infants by 6 months of age</article-title>. <source>Science</source> <volume>255</volume>, <fpage>606</fpage>&#x2013;<lpage>608</lpage>. doi: <pub-id pub-id-type="doi">10.1126/science.1736364</pub-id>, PMID: <pub-id pub-id-type="pmid">1736364</pub-id></citation></ref>
<ref id="ref45"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kuznetsova</surname> <given-names>A.</given-names></name> <name><surname>Brockhoff</surname> <given-names>P. B.</given-names></name> <name><surname>Christensen</surname> <given-names>R. H. B.</given-names></name></person-group> (<year>2017</year>). <article-title>lmerTest package: tests in linear mixed effects models</article-title>. <source>J. Stat. Softw.</source> <volume>82</volume>, <fpage>1</fpage>&#x2013;<lpage>26</lpage>. doi: <pub-id pub-id-type="doi">10.18637/jss.v082.i13</pub-id></citation></ref>
<ref id="ref46"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lin</surname> <given-names>M.</given-names></name> <name><surname>Francis</surname> <given-names>A. L.</given-names></name></person-group> (<year>2014</year>). <article-title>Effects of language experience and expectations on attention to consonants and tones in English and Mandarin Chinese</article-title>. <source>J. Acoust. Soc. Am.</source> <volume>136</volume>, <fpage>2827</fpage>&#x2013;<lpage>2838</lpage>. doi: <pub-id pub-id-type="doi">10.1121/1.4898047</pub-id>, PMID: <pub-id pub-id-type="pmid">25373982</pub-id></citation></ref>
<ref id="ref47"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Liu</surname> <given-names>Y.</given-names></name> <name><surname>Ning</surname> <given-names>J. H.</given-names></name></person-group> (<year>2021</year>). <article-title>The effect of language dominance on the selective attention of segments and tones in Urdu-Cantonese speakers</article-title>. <source>Front. Psychol.</source> <volume>12</volume>:<fpage>710713</fpage>. doi: <pub-id pub-id-type="doi">10.3389/fpsyg.2021.710713</pub-id></citation></ref>
<ref id="ref48"><citation citation-type="book"><person-group person-group-type="author"><name><surname>MacWhinney</surname> <given-names>B.</given-names></name></person-group> (<year>2018</year>). &#x201C;<article-title>A unified model of first and second language learning</article-title>,&#x201D; in <source>Sources of Variation in First Language Acquisition: Languages, Contexts and Learners</source>, <volume>Vol. 22.</volume> eds. <person-group person-group-type="editor"><name><surname>Hickmann</surname> <given-names>M.</given-names></name> <name><surname>Veneziano</surname> <given-names>E.</given-names></name> <name><surname>Jisa</surname> <given-names>H.</given-names></name></person-group> (<publisher-loc>Amsterdam</publisher-loc>: <publisher-name>John Benjamins</publisher-name>), <fpage>287</fpage>&#x2013;<lpage>312</lpage>.</citation></ref>
<ref id="ref49"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Marslen-Wilson</surname> <given-names>W.</given-names></name> <name><surname>Warren</surname> <given-names>P.</given-names></name></person-group> (<year>1994</year>). <article-title>Levels of perceptual representation and process in lexical access: words, phonemes, and features</article-title>. <source>Psychol. Rev.</source> <volume>101</volume>, <fpage>653</fpage>&#x2013;<lpage>675</lpage>. doi: <pub-id pub-id-type="doi">10.1037/0033-295X.101.4.653</pub-id>, PMID: <pub-id pub-id-type="pmid">7984710</pub-id></citation></ref>
<ref id="ref50"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>McCandliss</surname> <given-names>B. D.</given-names></name> <name><surname>Yoncheva</surname> <given-names>Y.</given-names></name></person-group> (<year>2011</year>). <article-title>Integration of left-lateralized neural systems supporting skilled reading. Developmental dyslexia: early precursors</article-title>. <source>Neurobehav. Mark. Biol. Subs.</source> <volume>8</volume>, <fpage>241</fpage>&#x2013;<lpage>256</lpage>. doi: <pub-id pub-id-type="doi">10.1207/s1532799xssr0803_4</pub-id></citation></ref>
<ref id="ref51"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Meisel</surname> <given-names>J. M.</given-names></name></person-group> (<year>2004</year>). &#x201C;<article-title>The bilingual child</article-title>,&#x201D; in <source>The Handbook of Bilingualism</source>. eds. <person-group person-group-type="editor"><name><surname>Bhatia</surname> <given-names>T. K.</given-names></name> <name><surname>Ritchie</surname> <given-names>W. C.</given-names></name></person-group> (<publisher-name>Blackwell Press</publisher-name>), <fpage>91</fpage>&#x2013;<lpage>113</lpage>.</citation></ref>
<ref id="ref52"><citation citation-type="book"><person-group person-group-type="author"><name><surname>Meng</surname> <given-names>N.</given-names></name></person-group> (<year>2021</year>). &#x201C;<article-title>&#x201C;Repeat after me&#x201D;: is there a better way to correct tone errors in teaching Mandarin Chinese as a second language?</article-title>,&#x201D; in <source>The Acquisition of Chinese as a Second Language Pronunciation</source>. ed. <person-group person-group-type="editor"><name><surname>Yang</surname> <given-names>C. S.</given-names></name></person-group> (<publisher-loc>Singapore</publisher-loc>: <publisher-name>Springer</publisher-name>), <fpage>163</fpage>&#x2013;<lpage>173</lpage>.</citation></ref>
<ref id="ref53"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mok</surname> <given-names>P.</given-names></name> <name><surname>Lee</surname> <given-names>C. W.</given-names></name> <name><surname>Yu</surname> <given-names>A. C.</given-names></name></person-group> (<year>2018</year>). <article-title>Perception and production of Cantonese tones by South Asians in Hong Kong</article-title>. In <source>Proceedings of Speech Prosody</source> (pp. <fpage>458</fpage>&#x2013;<lpage>462</lpage>). Pozna&#x0144;, Poland. doi: <pub-id pub-id-type="doi">10.21437/SpeechProsody.2018-93</pub-id></citation></ref>
<ref id="ref54"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mok</surname> <given-names>P. P.</given-names></name> <name><surname>Zuo</surname> <given-names>D.</given-names></name> <name><surname>Wong</surname> <given-names>P. W.</given-names></name></person-group> (<year>2013</year>). <article-title>Production and perception of a sound change in progress: tone merging in Hong Kong Cantonese</article-title>. <source>Lang. Var. Chang.</source> <volume>25</volume>, <fpage>341</fpage>&#x2013;<lpage>370</lpage>. doi: <pub-id pub-id-type="doi">10.1017/S0954394513000161</pub-id></citation></ref>
<ref id="ref55"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Molnar</surname> <given-names>M.</given-names></name> <name><surname>Polka</surname> <given-names>L.</given-names></name> <name><surname>Baum</surname> <given-names>S.</given-names></name> <name><surname>Menard</surname> <given-names>L.</given-names></name> <name><surname>Steinhauer</surname> <given-names>K.</given-names></name></person-group> (<year>2014</year>). <article-title>Vowel categorization of monolingual and simultaneous bilingual speakers of English and French: effects of language experience and language mode</article-title>. <source>Biling. Lang. Cogn.</source> <volume>17</volume>, <fpage>526</fpage>&#x2013;<lpage>541</lpage>. doi: <pub-id pub-id-type="doi">10.1121/1.4784785</pub-id></citation></ref>
<ref id="ref56"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Munro</surname> <given-names>M. J.</given-names></name></person-group> (<year>2008</year>). <article-title>Foreign accent and speech intelligibility</article-title>. <source>Phonology Second Lang. Acquis.</source> <volume>36</volume>, <fpage>193</fpage>&#x2013;<lpage>218</lpage>. doi: <pub-id pub-id-type="doi">10.1075/sibil.36.10mun</pub-id></citation></ref>
<ref id="ref001"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Nakagawa</surname> <given-names>S.</given-names></name> <name><surname>Schielzeth</surname> <given-names>H.</given-names></name></person-group> (<year>2013</year>). <article-title>A general and simple method for obtaining R<sup>2</sup> from generalized linear mixed-effects models</article-title>. <source>Methods Ecol. Evol.</source> <volume>4</volume>, <fpage>133</fpage>&#x2013;<lpage>142</lpage>. doi: <pub-id pub-id-type="doi">10.1111/j.2041-210x.2012.00261.x</pub-id></citation></ref>
<ref id="ref57"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Navarra</surname> <given-names>J.</given-names></name> <name><surname>Sebasti&#x00E1;n-Gall&#x00E9;s</surname> <given-names>N.</given-names></name> <name><surname>Soto-Faraco</surname> <given-names>S.</given-names></name></person-group> (<year>2005</year>). <article-title>The perception of second language sounds in early bilinguals: new evidence from an implicit measure</article-title>. <source>J. Exp. Psychol. Hum. Percept. Perform.</source> <volume>31</volume>, <fpage>912</fpage>&#x2013;<lpage>918</lpage>. doi: <pub-id pub-id-type="doi">10.1037/0096-1523.31.5.912</pub-id>, PMID: <pub-id pub-id-type="pmid">16262488</pub-id></citation></ref>
<ref id="ref58"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Pallier</surname> <given-names>C.</given-names></name> <name><surname>Colom&#x00E9;</surname> <given-names>A.</given-names></name> <name><surname>Sebasti&#x00E1;n-Gall&#x00E9;s</surname> <given-names>N.</given-names></name></person-group> (<year>2001</year>). <article-title>The influence of native language phonology on lexical access: exemplar-based versus abstract lexical entries</article-title>. <source>Psychol. Sci.</source> <volume>12</volume>, <fpage>445</fpage>&#x2013;<lpage>449</lpage>. doi: <pub-id pub-id-type="doi">10.1111/1467-9280.00383</pub-id>, PMID: <pub-id pub-id-type="pmid">11760129</pub-id></citation></ref>
<ref id="ref59"><citation citation-type="book"><person-group person-group-type="author"><name><surname>Pelzl</surname> <given-names>E.</given-names></name></person-group> (<year>2021</year>). &#x201C;<article-title>Foreign accent in second language Mandarin Chinese</article-title>,&#x201D; in <source>The Acquisition of Chinese as a Second Language Pronunciation</source>. ed. <person-group person-group-type="editor"><name><surname>Yang</surname> <given-names>C. S.</given-names></name></person-group> (<publisher-loc>Singapore</publisher-loc>: <publisher-name>Springer</publisher-name>), <fpage>257</fpage>&#x2013;<lpage>279</lpage>.</citation></ref>
<ref id="ref60"><citation citation-type="other"><person-group person-group-type="author"><name><surname>Qin</surname> <given-names>Z.</given-names></name> <name><surname>Mok</surname> <given-names>P.</given-names></name></person-group> (<year>2011</year>). Perception of Cantonese tones by Mandarin, English and French speakers. Paper Presented at the 17th International Congress of Phonetic Sciences (pp. 122&#x2013;1657). Hong Kong. doi: <pub-id pub-id-type="doi">10.5539/elt.v9n1p122</pub-id></citation></ref>
<ref id="ref61"><citation citation-type="web"><person-group person-group-type="author"><collab id="coll3">R Development Core Team</collab></person-group> (<year>2008</year>). <source>R: A Language and Environment for Statistical Computing</source>. <publisher-name>R Foundation for Statistical Computing</publisher-name>, <publisher-loc>Vienna, Austria</publisher-loc>. Available at: <ext-link xlink:href="http://www.R-project.org" ext-link-type="uri">http://www.R-project.org</ext-link> (Accessed August 20, 2022).</citation></ref>
<ref id="ref62"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Renwick</surname> <given-names>M. E.</given-names></name> <name><surname>Nadeu</surname> <given-names>M.</given-names></name></person-group> (<year>2019</year>). <article-title>A survey of phonological mid vowel intuitions in central Catalan</article-title>. <source>Lang. Speech</source> <volume>62</volume>, <fpage>164</fpage>&#x2013;<lpage>204</lpage>. doi: <pub-id pub-id-type="doi">10.1177/0023830917749275</pub-id>, PMID: <pub-id pub-id-type="pmid">29313414</pub-id></citation></ref>
<ref id="ref63"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Repp</surname> <given-names>B. H.</given-names></name> <name><surname>Lin</surname> <given-names>H. B.</given-names></name></person-group> (<year>1990</year>). <article-title>Integration of segmental and tonal information in speech perception: a cross-linguistic study</article-title>. <source>J. Phon.</source> <volume>18</volume>, <fpage>481</fpage>&#x2013;<lpage>495</lpage>. doi: <pub-id pub-id-type="doi">10.1016/S0095-4470(19)30410-3</pub-id></citation></ref>
<ref id="ref64"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Saiegh-Haddad</surname> <given-names>E.</given-names></name> <name><surname>Geva</surname> <given-names>E.</given-names></name></person-group> (<year>2008</year>). <article-title>Morphological awareness, phonological awareness, and reading in English&#x2013;Arabic bilingual children</article-title>. <source>Read. Writ.</source> <volume>21</volume>, <fpage>481</fpage>&#x2013;<lpage>504</lpage>. doi: <pub-id pub-id-type="doi">10.1007/s11145-007-9074-x</pub-id></citation></ref>
<ref id="ref65"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Schirmer</surname> <given-names>A.</given-names></name> <name><surname>Tang</surname> <given-names>S. L.</given-names></name> <name><surname>Penney</surname> <given-names>T. B.</given-names></name> <name><surname>Gunter</surname> <given-names>T. C.</given-names></name> <name><surname>Chen</surname> <given-names>H. C.</given-names></name></person-group> (<year>2005</year>). <article-title>Brain responses to segmentally and tonally induced semantic violations in Cantonese</article-title>. <source>J. Cogn. Neurosci.</source> <volume>17</volume>, <fpage>1</fpage>&#x2013;<lpage>12</lpage>. doi: <pub-id pub-id-type="doi">10.1162/0898929052880057</pub-id>, PMID: <pub-id pub-id-type="pmid">15701235</pub-id></citation></ref>
<ref id="ref66"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sebasti&#x00E1;n-Gall&#x00E9;s</surname> <given-names>N.</given-names></name> <name><surname>Echeverr&#x00ED;a</surname> <given-names>S.</given-names></name> <name><surname>Bosch</surname> <given-names>L.</given-names></name></person-group> (<year>2005</year>). <article-title>The influence of initial exposure on lexical representation: comparing early and simultaneous bilinguals</article-title>. <source>J. Mem. Lang.</source> <volume>52</volume>, <fpage>240</fpage>&#x2013;<lpage>255</lpage>. doi: <pub-id pub-id-type="doi">10.1016/j.jml.2004.11.001</pub-id></citation></ref>
<ref id="ref67"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Shafer</surname> <given-names>V. L.</given-names></name> <name><surname>Yan</surname> <given-names>H. Y.</given-names></name> <name><surname>Datta</surname> <given-names>H.</given-names></name></person-group> (<year>2011</year>). <article-title>The development of English vowel perception in monolingual and bilingual infants: neurophysiological correlates</article-title>. <source>J. Phon.</source> <volume>39</volume>, <fpage>527</fpage>&#x2013;<lpage>545</lpage>. doi: <pub-id pub-id-type="doi">10.1016/j.wocn.2010.11.010</pub-id>, PMID: <pub-id pub-id-type="pmid">22046059</pub-id></citation></ref>
<ref id="ref68"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>So</surname> <given-names>C. K.</given-names></name> <name><surname>Best</surname> <given-names>C. T.</given-names></name></person-group> (<year>2011</year>). <article-title>Categorizing Mandarin tones into listeners&#x2019; native prosodic categories: the role of phonetic properties</article-title>. <source>Pozna&#x0144; Stud. Contemp. Linguist.</source> <volume>47</volume>, <fpage>133</fpage>&#x2013;<lpage>145</lpage>. doi: <pub-id pub-id-type="doi">10.2478/psicl-2011-0011</pub-id></citation></ref>
<ref id="ref69"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Steinhauer</surname> <given-names>K.</given-names></name></person-group> (<year>2014</year>). <article-title>Event-related potentials (ERPs) in second language research: a brief introduction to the technique, a selected review, and an invitation to reconsider critical periods in L2</article-title>. <source>Appl. Linguis.</source> <volume>35</volume>, <fpage>393</fpage>&#x2013;<lpage>417</lpage>. doi: <pub-id pub-id-type="doi">10.1093/applin/amu028</pub-id></citation></ref>
<ref id="ref70"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Steinhauer</surname> <given-names>K.</given-names></name> <name><surname>White</surname> <given-names>E. J.</given-names></name> <name><surname>Drury</surname> <given-names>J. E.</given-names></name></person-group> (<year>2009</year>). <article-title>Temporal dynamics of late second language acquisition: evidence from event-related brain potentials</article-title>. <source>Second Lang. Res.</source> <volume>25</volume>, <fpage>13</fpage>&#x2013;<lpage>41</lpage>. doi: <pub-id pub-id-type="doi">10.1177/0267658308098995</pub-id></citation></ref>
<ref id="ref71"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Strange</surname> <given-names>W.</given-names></name></person-group> (<year>2011</year>). <article-title>Automatic selective perception (ASP) of first and second language speech: a working model</article-title>. <source>J. Phon.</source> <volume>39</volume>, <fpage>456</fpage>&#x2013;<lpage>466</lpage>. doi: <pub-id pub-id-type="doi">10.1016/j.wocn.2010.09.001</pub-id></citation></ref>
<ref id="ref72"><citation citation-type="book"><person-group person-group-type="author"><name><surname>Strange</surname> <given-names>W.</given-names></name> <name><surname>Shafer</surname> <given-names>V. L.</given-names></name></person-group> (<year>2008</year>). &#x201C;<article-title>Speech perception in second language learners: the re-education of selective perception</article-title>,&#x201D; in <source>Phonology and Second Language Acquisition</source>. eds. <person-group person-group-type="editor"><name><surname>Hansen Edwards</surname> <given-names>J. G.</given-names></name> <name><surname>Zampini</surname> <given-names>M. L.</given-names></name></person-group> (<publisher-loc>Philadelphia</publisher-loc>: <publisher-name>John Benjamins</publisher-name>), <fpage>153</fpage>&#x2013;<lpage>191</lpage>.</citation></ref>
<ref id="ref73"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Tong</surname> <given-names>Y.</given-names></name> <name><surname>Francis</surname> <given-names>A. L.</given-names></name> <name><surname>Gandour</surname> <given-names>J. T.</given-names></name></person-group> (<year>2008</year>). <article-title>Processing dependencies between segmental and suprasegmental features in Mandarin Chinese</article-title>. <source>Lang. Cognit. Process.</source> <volume>23</volume>, <fpage>689</fpage>&#x2013;<lpage>708</lpage>. doi: <pub-id pub-id-type="doi">10.1080/01690960701728261</pub-id></citation></ref>
<ref id="ref74"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Trofimovich</surname> <given-names>P.</given-names></name></person-group> (<year>2008</year>). <article-title>What do second language listeners know about spoken words? Effects of experience and attention in spoken word processing</article-title>. <source>J. Psycholinguist. Res.</source> <volume>37</volume>, <fpage>309</fpage>&#x2013;<lpage>329</lpage>. doi: <pub-id pub-id-type="doi">10.1007/s10936-008-9069-z</pub-id>, PMID: <pub-id pub-id-type="pmid">18330706</pub-id></citation></ref>
<ref id="ref75"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Verbeek</surname> <given-names>L.</given-names></name> <name><surname>Vissers</surname> <given-names>C.</given-names></name> <name><surname>Blumenthal</surname> <given-names>M.</given-names></name> <name><surname>Verhoeven</surname> <given-names>L.</given-names></name></person-group> (<year>2022</year>). <article-title>Cross-language transfer and attentional control in early bilingual speech</article-title>. <source>J. Speech Lang. Hear. Res.</source> <volume>65</volume>, <fpage>450</fpage>&#x2013;<lpage>468</lpage>. doi: <pub-id pub-id-type="doi">10.1044/2021_JSLHR-21-00104</pub-id>, PMID: <pub-id pub-id-type="pmid">35021020</pub-id></citation></ref>
<ref id="ref76"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Whalen</surname> <given-names>D. H.</given-names></name> <name><surname>Xu</surname> <given-names>Y.</given-names></name></person-group> (<year>1992</year>). <article-title>Information for Mandarin tones in the amplitude contour and in brief segments</article-title>. <source>Phonetica</source> <volume>49</volume>, <fpage>25</fpage>&#x2013;<lpage>47</lpage>. doi: <pub-id pub-id-type="doi">10.1159/000261901</pub-id>, PMID: <pub-id pub-id-type="pmid">1603839</pub-id></citation></ref>
<ref id="ref77"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>White</surname> <given-names>E. J.</given-names></name> <name><surname>Titone</surname> <given-names>D.</given-names></name> <name><surname>Genesee</surname> <given-names>F.</given-names></name> <name><surname>Steinhauer</surname> <given-names>K.</given-names></name></person-group> (<year>2017</year>). <article-title>Phonological processing in late second language learners: the effects of proficiency and task</article-title>. <source>Biling. Lang. Cogn.</source> <volume>20</volume>, <fpage>162</fpage>&#x2013;<lpage>183</lpage>. doi: <pub-id pub-id-type="doi">10.1017/S1366728915000620</pub-id></citation></ref>
<ref id="ref78"><citation citation-type="other"><person-group person-group-type="author"><name><surname>Wong</surname> <given-names>J. W. S.</given-names></name> <name><surname>Arai</surname> <given-names>T.</given-names></name></person-group> (<year>2020</year>). The effects of tonal experience on the categorization of Cantonese lexical tones into Japanese native pitch accent categories. In <italic>Proceedings of 10th International Conference on Speech Prosody 2020</italic> (pp. 484&#x2013;488). doi: <pub-id pub-id-type="doi">10.21437/SpeechProsody.2020-99</pub-id></citation></ref>
<ref id="ref79"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wong</surname> <given-names>P.</given-names></name> <name><surname>Leung</surname> <given-names>C. T. T.</given-names></name></person-group> (<year>2018</year>). <article-title>Suprasegmental features are not acquired early: perception and production of monosyllabic Cantonese lexical tones in 4- to 6-year-old preschool children</article-title>. <source>J. Speech Lang. Hear. Res.</source> <volume>61</volume>, <fpage>1070</fpage>&#x2013;<lpage>1085</lpage>. doi: <pub-id pub-id-type="doi">10.1044/2018_JSLHR-S-17-0288</pub-id></citation></ref>
<ref id="ref80"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wood</surname> <given-names>C. C.</given-names></name></person-group> (<year>1974</year>). <article-title>Parallel processing of auditory and phonetic information in speech discrimination</article-title>. <source>Percept. Psychophys.</source> <volume>15</volume>, <fpage>501</fpage>&#x2013;<lpage>508</lpage>. doi: <pub-id pub-id-type="doi">10.3758/BF03199292</pub-id></citation></ref>
<ref id="ref81"><citation citation-type="book"><person-group person-group-type="author"><name><surname>Wooditch</surname> <given-names>A.</given-names></name> <name><surname>Johnson</surname> <given-names>N. J.</given-names></name> <name><surname>Solymosi</surname> <given-names>R.</given-names></name> <name><surname>Medina</surname> <given-names>J. J.</given-names></name> <name><surname>Langton</surname> <given-names>S.</given-names></name></person-group> (<year>2021</year>). <source>A Beginner&#x2019;s Guide to Statistics for Criminology and Criminal Justice Using R</source>. <publisher-loc>New York</publisher-loc>: <publisher-name>Springer International Publishing</publisher-name>. doi: <pub-id pub-id-type="doi">10.1007/978-3-030-50625-4_6</pub-id></citation></ref>
<ref id="ref82"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yan</surname> <given-names>H. Y.</given-names></name> <name><surname>Tessel</surname> <given-names>C.</given-names></name> <name><surname>Han</surname> <given-names>X.</given-names></name> <name><surname>Campanelli</surname> <given-names>L.</given-names></name> <name><surname>Vidal</surname> <given-names>N.</given-names></name> <name><surname>Gerometta</surname> <given-names>J.</given-names></name> <etal/></person-group>. (<year>2019</year>). <article-title>Neural indices of vowel discrimination in monolingual and bilingual infants and children</article-title>. <source>Ear Hear.</source> <volume>40</volume>, <fpage>1376</fpage>&#x2013;<lpage>1390</lpage>. doi: <pub-id pub-id-type="doi">10.1097/AUD.0000000000000726</pub-id>, PMID: <pub-id pub-id-type="pmid">31033699</pub-id></citation></ref>
<ref id="ref83"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yao</surname> <given-names>Y.</given-names></name> <name><surname>Chan</surname> <given-names>A.</given-names></name> <name><surname>Fung</surname> <given-names>R.</given-names></name> <name><surname>Wu</surname> <given-names>W. L.</given-names></name> <name><surname>Leung</surname> <given-names>N.</given-names></name> <name><surname>Lee</surname> <given-names>S.</given-names></name> <etal/></person-group>. (<year>2020</year>). <article-title>Cantonese tone production in pre-school Urdu&#x2013;Cantonese bilingual minority children</article-title>. <source>Int. J. Biling.</source> <volume>24</volume>, <fpage>767</fpage>&#x2013;<lpage>782</lpage>. doi: <pub-id pub-id-type="doi">10.1177/1367006919884659</pub-id></citation></ref>
<ref id="ref84"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zou</surname> <given-names>T.</given-names></name> <name><surname>Chen</surname> <given-names>Y.</given-names></name> <name><surname>Caspers</surname> <given-names>J.</given-names></name></person-group> (<year>2017</year>). <article-title>The developmental trajectories of attention distribution and segment-tone integration in Dutch learners of Mandarin tones</article-title>. <source>Biling. Lang. Cogn.</source> <volume>20</volume>, <fpage>1017</fpage>&#x2013;<lpage>1029</lpage>. doi: <pub-id pub-id-type="doi">10.1017/S1366728916000791</pub-id></citation></ref>
</ref-list>
</back>
</article>