<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xml:lang="EN" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Psychol.</journal-id>
<journal-title>Frontiers in Psychology</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Psychol.</abbrev-journal-title>
<issn pub-type="epub">1664-1078</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fpsyg.2022.945973</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Psychology</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Phonetic Realizations of Metrical Structure in Tone Languages: Evidence From Chinese Dialects</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name><surname>Guo</surname> <given-names>Chengyu</given-names></name>
<uri xlink:href="http://loop.frontiersin.org/people/1777432/overview"/>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name><surname>Chen</surname> <given-names>Fei</given-names></name>
<xref ref-type="corresp" rid="c001"><sup>&#x0002A;</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/1026740/overview"/>
</contrib>
</contrib-group>
<aff><institution>School of Foreign Languages, Hunan University</institution>, <addr-line>Changsha</addr-line>, <country>China</country></aff>
<author-notes>
<fn fn-type="edited-by"><p>Edited by: John Archibald, University of Victoria, Canada</p></fn>
<fn fn-type="edited-by"><p>Reviewed by: Jie Deng, Shandong University, China; Chao Zhou, University of Minho, Portugal</p></fn>
<corresp id="c001">&#x0002A;Correspondence: Fei Chen <email>chenfeianthony&#x00040;gmail.com</email></corresp>
<fn fn-type="other" id="fn001"><p>This article was submitted to Language Sciences, a section of the journal Frontiers in Psychology</p></fn></author-notes>
<pub-date pub-type="epub">
<day>13</day>
<month>07</month>
<year>2022</year>
</pub-date>
<pub-date pub-type="collection">
<year>2022</year>
</pub-date>
<volume>13</volume>
<elocation-id>945973</elocation-id>
<history>
<date date-type="received">
<day>17</day>
<month>05</month>
<year>2022</year>
</date>
<date date-type="accepted">
<day>22</day>
<month>06</month>
<year>2022</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x000A9; 2022 Guo and Chen.</copyright-statement>
<copyright-year>2022</copyright-year>
<copyright-holder>Guo and Chen</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/"><p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p></license> </permissions>
<abstract>
<p>In tone languages, some case studies showed that the word-level tonal representation was closely related to the underlying metrical pattern. Based on different tonal patterns in prosodic units, the metrical structures could generally be divided into the left- and right-dominant types in Chinese dialects. Yet the cross-dialectal phonetic realizations (e.g., duration and pitch) between or within these two metrical structures were still unrevealed. The current study investigated the duration and pitch realizations of disyllabic prosodic words in Changsha and Chengdu dialects (the left-dominant structure), and in Fuzhou and Xiamen dialects (the right-dominant structure). Results showed that not all the duration patterns across four Chinese dialects were sensitive to different metrical structures, indicating that the duration might not be the universal cue for metrical prominence in Chinese dialects. In terms of pitch realization across all the four Chinese dialects, level tones (sometimes falling tones) generally appeared in the metrically weak unit, while underlying pitch forms appeared in the metrically strong unit. Compared with duration, pitch might be more robust for prosodic realizations of metrical structures in Chinese dialects. Furthermore, there was an interaction between duration and pitch patterns in Chinese dialects, which could shed new light on the phenomenon of &#x0201C;metrical tone sandhi&#x0201D;. Meanwhile, this study also provides some references for the judgment of the metrical stress and prosodic realizations in other Chinese dialects.</p></abstract>
<kwd-group>
<kwd>metrical structure</kwd>
<kwd>Chinese dialects</kwd>
<kwd>pitch</kwd>
<kwd>duration</kwd>
<kwd>metrical tone sandhi</kwd>
</kwd-group>
<contract-sponsor id="cn001">Humanities and Social Sciences Youth Foundation, Ministry of Education of the People&#x0027;s Republic of China<named-content content-type="fundref-id">10.13039/501100017630</named-content></contract-sponsor>
<counts>
<fig-count count="7"/>
<table-count count="7"/>
<equation-count count="1"/>
<ref-count count="74"/>
<page-count count="17"/>
<word-count count="11653"/>
</counts>
</article-meta>
</front>
<body>
<sec sec-type="intro" id="s1">
<title>Introduction</title>
<p>According to the function of prosodic elements at the word level, it is proposed that the world languages can be divided into different types such as tone languages and stress accent languages (Hyman, <xref ref-type="bibr" rid="B29">2006</xref>, <xref ref-type="bibr" rid="B30">2009</xref>). In tone languages like Chinese, the issues concerning tonal inventory and word stress, especially in Mandarin, have been extensively discussed in previous studies (Chao, <xref ref-type="bibr" rid="B8">1968</xref>; Cheng, <xref ref-type="bibr" rid="B13">1973</xref>; Duanmu, <xref ref-type="bibr" rid="B20">2007</xref>; Zhang, <xref ref-type="bibr" rid="B69">2016</xref>; Feng, <xref ref-type="bibr" rid="B21">2017</xref>). However, the metrical structure (Liberman and Prince, <xref ref-type="bibr" rid="B40">1977</xref>) and its phonetic realizations in different Chinese dialects were not fully understood. The current study presents phonetic realizations (i.e., duration and pitch) of left- and right-dominant metrical structures in four Chinese dialects with a cross-dialectal perspective.</p>
<p>It is not uncommon that the metrical structure is closely related to the tonal manifestation at the word level. To be more specific, the high tone (H tone) is more often linked to the metrically strong unit in tone languages such as Ayutla Mixtec (de Lacy, <xref ref-type="bibr" rid="B14">2002</xref>), Kera (Pearce, <xref ref-type="bibr" rid="B47">2006</xref>), and Moro (Jenks and Rose, <xref ref-type="bibr" rid="B32">2011</xref>), as well as in pitch-accent languages such as Nguni (Downing, <xref ref-type="bibr" rid="B16">1990</xref>) and Serbo-Croatian (Inkelas and Zec, <xref ref-type="bibr" rid="B31">1988</xref>). Likewise, in Chinese dialects, several case studies have consistently reported that the surface representation of lexical tones might be sensitive to prosodic prominence. Specifically, the surface tone in stressed syllables could be fully realized as its underlying pitch form (Lee, <xref ref-type="bibr" rid="B36">1997</xref>; Kochanski et al., <xref ref-type="bibr" rid="B34">2003</xref>; Sui, <xref ref-type="bibr" rid="B55">2016</xref>), while the tone in the prosodically weak unit undergoes pitch lowering or leveling in Chinese dialects such as Changsha dialect (Zhong, <xref ref-type="bibr" rid="B72">2003</xref>), Chongming dialect (Chen, <xref ref-type="bibr" rid="B11">2000</xref>), Fuzhou dialect (Wright, <xref ref-type="bibr" rid="B59">1983</xref>), and Suzhou dialect (Zhu, <xref ref-type="bibr" rid="B74">2021</xref>). More importantly, this pitch change might result in the reduction of the underlying tonal features, i.e., register and contour (Yip, <xref ref-type="bibr" rid="B65">2002</xref>), and even trigger tonal merger in the metrically weak mora or syllable. This tonal alternation could be named &#x0201C;metrical tone sandhi&#x0201D; (Zeng and Niu, <xref ref-type="bibr" rid="B68">2006</xref>). Accordingly, the metrical structure of Chinese dialects could be categorized into the left- and right-dominant types based on the different pitch forms (citation/sandhi form) in metrically weak and strong positions (Yue-Hashimoto, <xref ref-type="bibr" rid="B67">1987</xref>; Zhang, <xref ref-type="bibr" rid="B70">2007</xref>).</p>
<p>Overall, the consensus reached in previous studies showed that the surface pitch realization was the key correlate of metrical structure in Chinese dialects. Besides, the duration might also act as a phonetic parameter indicating the prosodic strength in Mandarin (Chen and Xu, <xref ref-type="bibr" rid="B12">2006</xref>; Xu, <xref ref-type="bibr" rid="B62">2009</xref>). Although there was a dispute about the existence of prosodic contrast in Mandarin disyllabic words (Hoa, <xref ref-type="bibr" rid="B26">1983</xref>; Duanmu, <xref ref-type="bibr" rid="B18">1993</xref>; Xu and Wang, <xref ref-type="bibr" rid="B63">2009</xref>; Zhang, <xref ref-type="bibr" rid="B69">2016</xref>, <xref ref-type="bibr" rid="B71">2021</xref>; Feng, <xref ref-type="bibr" rid="B21">2017</xref>), the typical left-dominant (strong-weak) pattern was found in neutral-toned words with a long-short duration pattern (see 1a). The acoustic cue of the neutral-toned syllable in Mandarin is comparable to that of the unstressed syllable in English (Chen and Xu, <xref ref-type="bibr" rid="B12">2006</xref>; Xu, <xref ref-type="bibr" rid="B62">2009</xref>). In contrast, under the right-dominant structure, the duration might exhibit a short-long pattern (see 1b), since this duration pattern has been mentioned in the right-dominant dialects with impressionistic descriptions (Wright, <xref ref-type="bibr" rid="B59">1983</xref>; Chan, <xref ref-type="bibr" rid="B6">1985</xref>). Additionally, a short-long duration pattern is the canonical type for the iambic foot, according to the Iambic/Trochaic Law (Hayes, <xref ref-type="bibr" rid="B25">1995</xref>). The question then arises whether or not the duration pattern in disyllabic words is also sensitive to the metrical structures across other Chinese dialects. To answer this question, more comprehensive research is needed to validate whether different duration patterns in Chinese dialects correspond with different metrical structures.</p>
<p><inline-graphic xlink:href="fpsyg-13-945973-i0001.tif"/></p>
<p><italic>Note</italic>. &#x0201C;&#x02014;&#x0201D; stands for a relatively longer duration, and &#x0201C;-&#x0201D; stands for a relatively shorter duration; &#x0201C;T&#x0201D; stands for the underlying pitch form, and &#x0201C;t&#x0201D; stands for the sandhi form.</p>
<p>Theoretically, the phonetic realization we speculated in (1) is seemingly symmetrical between the two types of metrical structures. In terms of pitch realization, the underlying pitch form and sandhi form generally appear at &#x003C3;(1) and &#x003C3;(2), respectively, under the left-dominant structure, while occurring at &#x003C3;(2) and &#x003C3;(1) under the right-dominant structure. Likewise, in terms of duration realization, a long-short pattern and a short-long pattern symmetrically occur in the left- and right-dominant structures, respectively, according to our prediction (1). However, the actual phonetic realizations across Chinese dialects might be complex and diverse. It was reported that the underlying tone in the initial metrically strong syllable might spread rightwards to the weak syllable in the left-dominant structure [e.g., /k&#x003B1; s&#x000E6;/ (artificial mountain) [51 33] &#x02192; [53 31] in Tangsic dialect; Kennedy, <xref ref-type="bibr" rid="B33">1953</xref>]. In this case, the surface tonal representation of both syllables in the left-dominant structure could also be the sandhi form. In other words, the pitch realizations between the two metrical structures could be asymmetrical in Chinese dialects (Duanmu, <xref ref-type="bibr" rid="B19">1995</xref>; Zhang, <xref ref-type="bibr" rid="B70">2007</xref>). Furthermore, some cross-linguistic research has detected a durational asymmetry (Hayes, <xref ref-type="bibr" rid="B25">1995</xref>; Gordon et al., <xref ref-type="bibr" rid="B23">2018</xref>), that is, an equally matched duration pattern (i.e., long-long or short-short pattern) occurs under the trochaic foot (left-dominant structure), whereas a short-long duration pattern occurs under the iambic foot (right-dominant structure). Overall, (1a) and (1b) are the ideally symmetrical realizations between the two metrical types based on our assumption. Nevertheless, given various phonetic realizations for the left-dominant structure as reported in the literature, we wonder whether there are other diverse phonetic realizations for the right-dominant structure. Thus, to reveal the diversity of phonetic realizations for two types of metrical structures in Chinese dialects, a cross-dialectal investigation was conducted in the current study.</p>
<p>As mentioned above, different surface tonal representations between metrically weak and strong units have been seen as the key indicator of the metrical structure in Chinese dialects. The term &#x0201C;metrical tone sandhi&#x0201D; (Zeng and Niu, <xref ref-type="bibr" rid="B68">2006</xref>) seems to be suitable to depict the phonological phenomenon that tone sandhi usually occurs in the prosodically weak unit. Still, some related issues remained understudied. For instance, it is unknown how exactly the underlying tone interacts with different prosodic units. Besides, the analyses of metrical tone sandhi in Chinese dialects were generally based on perceptual judgment and phonological description (Yue-Hashimoto, <xref ref-type="bibr" rid="B67">1987</xref>; Zeng and Niu, <xref ref-type="bibr" rid="B68">2006</xref>; Zhang, <xref ref-type="bibr" rid="B70">2007</xref>). To tackle these problems, more empirical research and phonetic analyses should be carried out. Recently, a fine-grained method [Growth Curve Analysis (hereafter &#x0201C;GCA&#x0201D;); Mirman, <xref ref-type="bibr" rid="B45">2014</xref>] of analyzing pitch contour has been introduced (Shi et al., <xref ref-type="bibr" rid="B53">2020</xref>). The GCA could be used to compare the fine-grained differences over time in terms of pitch height, pitch slope, and pitch curvature. Therefore, GCA offers us a valuable chance to validate pitch realizations of metrical tone sandhi at the phonetic level.</p>
<p>The current study aimed at illustrating diverse phonetic realizations (i.e., duration and pitch) of disyllabic words under two metrical structures in Chinese by cross-dialectal comparisons. Two dialects under each left- and right-dominant structure (four dialects in total) were chosen since previous studies have proposed their metrical structures according to the tonal representation. Specifically, the representatives of the left-dominant structure were Changsha dialect (Zhong, <xref ref-type="bibr" rid="B72">2003</xref>; Lin, <xref ref-type="bibr" rid="B42">2011</xref>) and Chengdu dialect (Lin, <xref ref-type="bibr" rid="B41">2006</xref>; Qin, <xref ref-type="bibr" rid="B49">2012</xref>). In these two dialects, the underlying tone usually undergoes tone sandhi in the final position of disyllabic words. It should be noted that Changsha dialect also seems to show the right-dominant structure, with the tonal process occurring at the first syllable. However, this pattern is only limited to a few grammatical categories and beyond the scope of our study. Moreover, the chosen dialects under the right-dominant structure were Fuzhou dialect (Wright, <xref ref-type="bibr" rid="B59">1983</xref>; Chan, <xref ref-type="bibr" rid="B6">1985</xref>) and Xiamen dialect (Yue-Hashimoto, <xref ref-type="bibr" rid="B66">1986</xref>; Hsieh, <xref ref-type="bibr" rid="B27">2005</xref>), in which tone sandhi occurs at the initial position of disyllabic words.</p>
<p>It should be noted that the genetic classification is different among the four Chinese dialects. According to Li et al. (<xref ref-type="bibr" rid="B38">1987</xref>) and Kurpaska (<xref ref-type="bibr" rid="B35">2010</xref>), Chengdu dialect belongs to the Southwestern Mandarin group, while the other three are classified into the southern Chinese dialects. To be more specific, Changsha dialect belongs to the Xiang dialect group; Fuzhou and Xiamen dialects belong to the Min dialect group. The classification and geographic distribution of the four dialects are shown in <xref ref-type="fig" rid="F1">Figure 1</xref>.</p>
<fig id="F1" position="float">
<label>Figure 1</label>
<caption><p>The classification <bold>(A)</bold> and geographic distribution <bold>(B)</bold> of four Chinese dialects in this study.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpsyg-13-945973-g0001.tif"/>
</fig>
<p>Furthermore, the tone inventories of the four dialects are quite different (see <xref ref-type="table" rid="T1">Table 1</xref>). Chengdu dialect has four lexical tones (Qin, <xref ref-type="bibr" rid="B49">2012</xref>); Changsha dialect has six (Zhong, <xref ref-type="bibr" rid="B72">2003</xref>); both Fuzhou (Donohue, <xref ref-type="bibr" rid="B15">2013</xref>) and Xiamen dialects (Chen, <xref ref-type="bibr" rid="B10">1987</xref>) have seven. According to previous studies of phonological description (Chen, <xref ref-type="bibr" rid="B10">1987</xref>; Zhong, <xref ref-type="bibr" rid="B72">2003</xref>; Qin, <xref ref-type="bibr" rid="B49">2012</xref>; Donohue, <xref ref-type="bibr" rid="B15">2013</xref>), the relative tone values of the four dialects can be seen in <xref ref-type="table" rid="T1">Table 1</xref>. The tonal inventories of Changsha and Chengdu dialects include only contour tones (i.e., rising or falling tones). Tone 6 and Tone 7 in Fuzhou and Xiamen dialects are checked tones, which are perceptually shorter than the other lexical tones. In addition to contour tones, there is only one level tone in Fuzhou dialect, but two in Xiamen dialect. Apart from the checked tones, the syllable structures of Changsha, Fuzhou, and Xiamen dialects are similar, and tone-bearing units in these dialects are generally bimoraic rimes (Duanmu, <xref ref-type="bibr" rid="B17">1990</xref>). The digits in <xref ref-type="table" rid="T1">Table 1</xref> refer to lexical tone transcriptions in Chao&#x00027;s five-scale tone letters (Chao, <xref ref-type="bibr" rid="B7">1930</xref>), with 5 being the highest and 1 being the lowest &#x0201C;relative&#x0201D; pitch level of a speaker&#x00027;s normalized pitch range.</p>
<table-wrap position="float" id="T1">
<label>Table 1</label>
<caption><p>The five-scale tone letters and tone features of lexical tone inventories among four Chinese dialects.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th valign="top" align="left"><bold>Dialects</bold></th>
<th valign="top" align="center"><bold>Tone 1</bold></th>
<th valign="top" align="center"><bold>Tone 2</bold></th>
<th valign="top" align="center"><bold>Tone 3</bold></th>
<th valign="top" align="center"><bold>Tone 4</bold></th>
<th valign="top" align="center"><bold>Tone 5</bold></th>
<th valign="top" align="center"><bold>Tone 6</bold></th>
<th valign="top" align="center"><bold>Tone 7</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Chengdu</td>
<td valign="top" align="center">35</td>
<td valign="top" align="center">31</td>
<td valign="top" align="center">53</td>
<td valign="top" align="center">13</td>
<td/>
<td/>
<td/>
</tr>
<tr>
<td/>
<td valign="top" align="center">High-rising</td>
<td valign="top" align="center">Low-falling</td>
<td valign="top" align="center">High-falling</td>
<td valign="top" align="center">Low-rising</td>
<td/>
<td/>
<td/>
</tr>
<tr>
<td valign="top" align="left">Changsha</td>
<td valign="top" align="center">34</td>
<td valign="top" align="center">13</td>
<td valign="top" align="center">42</td>
<td valign="top" align="center">45</td>
<td valign="top" align="center">21</td>
<td valign="top" align="center"><underline>14</underline></td>
<td/>
</tr>
<tr>
<td/>
<td valign="top" align="center">Mid-rising</td>
<td valign="top" align="center">Low-rising</td>
<td valign="top" align="center">High-falling</td>
<td valign="top" align="center">High-rising</td>
<td valign="top" align="center">Low-falling</td>
<td valign="top" align="center">Low-rising</td>
<td/>
</tr>
<tr>
<td valign="top" align="left">Fuzhou</td>
<td valign="top" align="center">44</td>
<td valign="top" align="center">51</td>
<td valign="top" align="center">32</td>
<td valign="top" align="center">21</td>
<td valign="top" align="center">231</td>
<td valign="top" align="center"><underline>23</underline></td>
<td valign="top" align="center"><underline>55</underline></td>
</tr>
<tr>
<td/>
<td valign="top" align="center">High-level</td>
<td valign="top" align="center">High-falling</td>
<td valign="top" align="center">Mid-falling</td>
<td valign="top" align="center">Low-falling</td>
<td valign="top" align="center">Low-peaking</td>
<td valign="top" align="center">Mid-short</td>
<td valign="top" align="center">High-short</td>
</tr>
<tr>
<td valign="top" align="left">Xiamen</td>
<td valign="top" align="center">44</td>
<td valign="top" align="center">24</td>
<td valign="top" align="center">53</td>
<td valign="top" align="center">21</td>
<td valign="top" align="center">22</td>
<td valign="top" align="center"><underline>32</underline></td>
<td valign="top" align="center"><underline>44</underline></td>
</tr>
<tr>
<td/>
<td valign="top" align="center">High-level</td>
<td valign="top" align="center">Mid-rising</td>
<td valign="top" align="center">High-falling</td>
<td valign="top" align="center">Low-falling</td>
<td valign="top" align="center">Low-level</td>
<td valign="top" align="center">Mid-short</td>
<td valign="top" align="center">High-short</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p><italic>The underline indicates the short duration of the checked tone</italic>.</p>
</table-wrap-foot>
</table-wrap>
<p>In the current study, disyllabic prosodic words in four Chinese dialects were investigated to manifest the binary contrast of metrically weak and strong units, since two syllables could constitute the most natural and standard foot in Chinese (Feng, <xref ref-type="bibr" rid="B21">2017</xref>). Overall, the present study aims to answer the following research questions: (a) Beyond the correlate of pitch, is the duration of disyllabic words sensitive to metrical prominence in Chinese dialects? (b) Are there cross-dialectal differences in duration and pitch realizations under the same metrical structure (Left-dominant: Changsha dialect vs. Chengdu dialect; Right-dominant: Fuzhou dialect vs. Xiamen dialect)? (c) Could the previously proposed metrical tone sandhi among these four Chinese dialects be validated by the fine-grained analysis of GCA?</p>
<p>Accordingly, we proposed three predictions based on the previous studies as follows: Hypothesis 1 (H1): In the left-dominant structure, a long-short duration pattern may be found, similar to the pattern of neutral tones in Mandarin. Besides, the right-dominant structure might exhibit a short-long duration pattern. Hypothesis 2 (H2): Given the diversity of Chinese dialects, the pitch and duration patterns within the same metrical structure might be generally similar, but not identical. Hypothesis 3 (H3): The statistical result of GCA might corroborate the previous impressionistic description of the metrical tone sandhi.</p></sec>
<sec sec-type="methods" id="s2">
<title>Methods</title>
<sec>
<title>Participants</title>
<p>Five local participants in each dialect were recruited as the representative speakers of Changsha dialect (<italic>M</italic><sub><italic>age</italic></sub> = 62.00 yrs., <italic>SD</italic> = 6.36 yrs.; 2 females, 3 males), Chengdu dialect (<italic>M</italic><sub><italic>age</italic></sub> = 58.60 yrs., <italic>SD</italic> = 7.33 yrs.; 1 female, 4 males), Fuzhou dialect (<italic>M</italic><sub><italic>age</italic></sub> = 62.40 yrs., <italic>SD</italic> = 6.50 yrs.; 2 females, 3 males), and Xiamen dialect (<italic>M</italic><sub><italic>age</italic></sub> = 61.40 yrs., <italic>SD</italic> = 8.73 yrs.; 2 females, 3 males). In total, 20 participants took part in this experiment. Consistent with the traditional manner of in-depth field investigation, we only chose participants aged 50 and older as representatives of each dialect, because the phonology of younger speakers today is often greatly influenced by Beijing Mandarin (Yao and Chang, <xref ref-type="bibr" rid="B64">2016</xref>).</p>
<p>All participants were born and raised in the downtown areas of their respective cities and had never spent more than 6 months away. According to their self-reports, they had acquired only their native dialect and had no experience of other Chinese dialects or foreign languages. This effectively avoided potential prosodic influences from other dialects/languages (Archibald, <xref ref-type="bibr" rid="B1">2009</xref>). Besides, they did not self-report any speech disorder or hearing impairment. Before the elicitation task <italic>via</italic> text prompts, we confirmed that all participants could read Chinese characters normally. After the experiment, each participant was paid the equivalent of 15 USD in local currency for their travel and time.</p></sec>
<sec>
<title>Stimuli</title>
<p>The experimental stimuli, listed in <xref ref-type="supplementary-material" rid="SM1">Supplementary Table 1</xref> (due to its overlength), were disyllabic prosodic words partially chosen from the word list (Guo, <xref ref-type="bibr" rid="B24">2020</xref>) designed to investigate tone realizations from a cross-dialectal perspective. The lexical items were compound words including nouns, verbs, and adjectives, which were frequently spoken words [such as &#x0201C;&#x05DE5;&#x04EBA;&#x0201D; (worker)] across all the four dialects. Besides, we also selected some colloquial words in each dialect from previous studies as supplements, such as /ts<sup>h</sup>u53 ting53/ (rooftop) used in Xiamen dialect only (Chen, <xref ref-type="bibr" rid="B10">1987</xref>). Therefore, in the current study, we used a mixed word list with both dialect-universal and dialect-specific words.</p>
<p>To make sure that the tone sandhi could be comprehensively analyzed, the disyllabic lexical items contained all the possible tonal combinations in each dialect, and five lexical items were chosen under each tonal combination as tokens (<xref ref-type="supplementary-material" rid="SM1">Supplementary Table 1</xref>). Since the number of lexical tones is different in the four dialects, the number of tonal combinations is also different. For instance, in the four-toned Chengdu dialect, the number of tonal combinations is 4 &#x000D7; 4 = 16, with a total of 16 &#x000D7; 5 = 80 lexical items. As for Changsha dialect, the total experimental items were 6 &#x000D7; 6 &#x000D7; 5 = 180 words. In Fuzhou and Xiamen dialects, however, Tones 6 and 7 are checked tones (or &#x0201C;Ru Tones&#x0201D;). First, they are naturally shorter than the other lexical tones (Tones 1&#x02013;5) in terms of duration. Moreover, the checked tones also differ from the other lexical tones in having a unique coda such as the glottal stop /-<inline-graphic xlink:href="fpsyg-13-945973-i0008.tif"/>/ or a consonant stop (/-p/, /-t/, and /-k/). When they precede other lexical tones, their coda may drop, which causes compensatory lengthening of the duration (Chen and Norman, <xref ref-type="bibr" rid="B9">1965</xref>). To control variables and make the duration analysis more comparable, the checked tones (Tone 6 and Tone 7) in Fuzhou and Xiamen dialects were not included in the current study. Thus, there were 5 &#x000D7; 5 &#x000D7; 5 = 125 lexical items for both Fuzhou and Xiamen dialects.</p></sec>
<sec>
<title>Recording Procedures</title>
<p>The recordings were conducted in quiet rooms located in Changsha, Chengdu, Fuzhou, and Xiamen city, respectively. To record high-quality audio samples, we used a cardioid microphone (AKG-C554L) connected to a USB audio interface (iCON4 nano VST). The recording for each dialect was conducted with relatively low environmental noise (under 30 dB SPL). Before the formal recording, all the lexical items were shown to participants to familiarize them with the recording materials. Besides, a pilot investigation was carried out to confirm that the five disyllabic prosodic words under each tonal combination showed the same tonal representation.</p>
<p>To control the pronunciation variables, a carrier sentence &#x0201C;&#x08FD9;&#x0662F;__ (This is __)&#x0201D; was used before the target word. Participants were asked to produce the pre-target carrier sentence and the target word in a natural manner. All the target words were presented in a random order among participants. Specifically, participants could see the Chinese characters and their related lexical meanings of the target word on a laptop screen. Then, they spontaneously produced both the carrier and target word three times based on these prompts. All the participants of each dialect correctly uttered the target words, and the signals were saved in a WAV format with a sampling rate of 44.1 kHz.</p>
<p>After the raw data were collected, all the sound files were processed in Praat (v.6.0.26) (Boersma and Weenink, <xref ref-type="bibr" rid="B5">2021</xref>) on a PC laptop. Because some participants occasionally produced poor voice quality (i.e., creaky voice), the pitch tracking sometimes dropped out within the syllable. To obtain more reliable results, only the best sound file with a continuous pitch contour (as displayed in Praat) for each lexical item was included in the statistical analysis. In the cases where all three recorded samples of a target word showed pitch-tracking failure, we manually corrected the pitch tracking with the pitch tier function in Praat (Styler, <xref ref-type="bibr" rid="B54">2011</xref>). In sum, there were 900 analyzed tokens in Changsha dialect (5 speakers &#x000D7; 180 words), 400 tokens in Chengdu dialect (5 speakers &#x000D7; 80 words), and 625 tokens in both Fuzhou dialect (5 speakers &#x000D7; 125 words) and Xiamen dialect (5 speakers &#x000D7; 125 words), with 2,550 analyzed tokens in total.</p></sec>
<sec>
<title>Measurement and Data Analysis</title>
<p>The duration of each syllable was measured from the finals such as vowel (V), nasal rhyme (VN), and rhyme with glide (GV), since these different finals in Mandarin would not cause a significant duration difference (Wu and Kenstowicz, <xref ref-type="bibr" rid="B60">2015</xref>). These finals were manually identified based on the spectrogram information in Praat, i.e., the onset and offset of the second formant (F2) within finals (Turk et al., <xref ref-type="bibr" rid="B57">2006</xref>). Then, the raw duration was normalized for each participant with the z-score method according to Rose (<xref ref-type="bibr" rid="B51">1987</xref>). Since the duration is often skewed in distribution, the normalized duration was log-transformed as the dependent variable when entering the statistical models. In addition, to compare duration contrast among four dialects, we also calculated the &#x003C3;(1) to &#x003C3;(2) mean ratio of the absolute duration.</p>
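<p>As an illustration only, the per-speaker z-score normalization and the &#x003C3;(1)-to-&#x003C3;(2) duration ratio described above could be sketched in Python as follows. The function names, the input layout, and the use of the sample standard deviation are assumptions made for this sketch and are not taken from the analysis scripts of the study; the ratio is computed as a ratio of means, which is only one possible reading of the description, and the subsequent log transformation is not covered.</p>
<preformat>
```python
from statistics import mean, stdev


def zscore_by_speaker(durations):
    """Z-score each speaker's durations against that speaker's own mean and SD.

    `durations` maps a hypothetical speaker ID to a list of raw rime
    durations (e.g., in milliseconds).
    """
    normalized = {}
    for speaker, values in durations.items():
        mu = mean(values)
        sd = stdev(values)  # assumption: sample SD rather than population SD
        normalized[speaker] = [(v - mu) / sd for v in values]
    return normalized


def mean_duration_ratio(pairs):
    """Ratio of the mean sigma(1) duration to the mean sigma(2) duration.

    `pairs` is a list of (first_syllable, second_syllable) absolute durations.
    A value above 1 means the first syllable is longer on average.
    """
    first = mean([p[0] for p in pairs])
    second = mean([p[1] for p in pairs])
    return first / second
```
</preformat>
<p>Under this reading, a ratio above 1 would correspond to a long-short pattern and a ratio below 1 to a short-long pattern.</p>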
<p>The fundamental frequency (F0) was extracted from each manually labeled syllable, and 11 equidistant points along the pitch trajectory were output. These F0 points were further checked and manually corrected for any &#x0201C;pitch-halving&#x0201D; or &#x0201C;pitch-doubling&#x0201D; errors, which were flagged when the measured F0 value was 20% higher or lower than the reference F0 value (Sun, <xref ref-type="bibr" rid="B56">2002</xref>). Then, the raw F0 values (in Hz) of each participant were transformed into logarithmic z-scores to eliminate individual differences in pitch range (Zhu, <xref ref-type="bibr" rid="B73">2010</xref>).</p>
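<p>The sampling and normalization steps for F0 can be illustrated with a short Python sketch. It is not the procedure actually run in Praat; the function names, the base-10 logarithm, the sample standard deviation, and the example values are assumptions chosen for illustration.</p>
<preformat>
```python
import math
from statistics import mean, stdev


def equidistant_times(start, end, n_points=11):
    """Time stamps of n_points equally spaced samples from start to end."""
    step = (end - start) / (n_points - 1)
    return [start + i * step for i in range(n_points)]


def log_z_scores(f0_hz):
    """Logarithmic z-scores: z-score of log10(F0) within one speaker."""
    logs = [math.log10(f) for f in f0_hz]  # log base is an assumption
    m, s = mean(logs), stdev(logs)
    return [(x - m) / s for x in logs]


# Hypothetical syllable boundaries (s) and F0 values (Hz).
times = equidistant_times(0.10, 0.30)

low_voice = log_z_scores([100.0, 150.0, 200.0])
high_voice = log_z_scores([200.0, 300.0, 400.0])  # same contour, one octave up
```
</preformat>
<p>Because the two hypothetical speakers differ only by a constant factor in Hz, their logarithmic z-scores coincide, which is the sense in which the transformation removes individual differences in pitch range.</p>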
<p>All acoustic data were analyzed using R (R Core Team, <xref ref-type="bibr" rid="B50">2020</xref>). To compare the duration of the two consecutive syllables, a one-way ANOVA was conducted for each dialect. Moreover, linear mixed-effect models were fitted with the lme4 package (Bates et al., <xref ref-type="bibr" rid="B4">2015</xref>). The <italic>p</italic>-values of fixed factors and their interaction were obtained with a type-II ANOVA using Wald chi-square tests <italic>via</italic> the car package (Fox and Weisberg, <xref ref-type="bibr" rid="B22">2019</xref>). Furthermore, second-order orthogonal polynomials were built to compare three parameters of pitch contours (Mirman, <xref ref-type="bibr" rid="B45">2014</xref>): the intercept term (i.e., overall pitch height), the first-order linear term (i.e., pitch slope), and the second-order quadratic term (i.e., pitch curvature).</p>
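<p>To make the polynomial terms concrete, the following Python sketch builds linear and quadratic terms over 11 equally spaced time points that are orthogonal to each other and to the constant term. It uses a plain Gram-Schmidt procedure as a stand-in for the R routines used in the study; the function names and the unit-length scaling are assumptions of this example.</p>
<preformat>
```python
def dot(a, b):
    """Inner product of two equally long lists."""
    return sum(x * y for x, y in zip(a, b))


def orthogonal_polynomials(n_points=11, degree=2):
    """Orthonormal polynomial terms of order 1..degree over n_points samples."""
    x = [float(i) for i in range(n_points)]
    basis = [[1.0] * n_points]  # constant term, kept only for orthogonalization
    for d in range(1, degree + 1):
        col = [xi ** d for xi in x]
        # Remove the part of the raw power term explained by lower-order terms.
        for prev in basis:
            coef = dot(col, prev) / dot(prev, prev)
            col = [c - coef * p for c, p in zip(col, prev)]
        norm = dot(col, col) ** 0.5
        basis.append([c / norm for c in col])  # scale to unit length
    return basis[1:]


linear, quadratic = orthogonal_polynomials()
```
</preformat>
<p>Since the terms are uncorrelated, the estimate for pitch slope does not change when the curvature term is added, and the intercept reflects the mean pitch height over the whole contour rather than its starting value.</p>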
<p>Random intercepts and slopes were included in all models to maximize the generalizability of the results (Barr et al., <xref ref-type="bibr" rid="B2">2013</xref>). Then, model comparisons were conducted to identify the best-fitting model based on the Akaike information criterion (AIC) using the MuMIn package (Barto&#x00144;, <xref ref-type="bibr" rid="B3">2022</xref>). When a significant main effect of a multilevel factor or a significant interaction effect was found, <italic>post-hoc</italic> pairwise comparisons were performed using the lsmeans package (Lenth, <xref ref-type="bibr" rid="B37">2016</xref>) with Tukey adjustment.</p></sec></sec>
<sec sec-type="results" id="s3">
<title>Results</title>
<p><xref ref-type="fig" rid="F2">Figure 2</xref> shows the distribution of normalized duration of &#x003C3;(1) and &#x003C3;(2) in the four dialects, and <xref ref-type="fig" rid="F3">Figure 3</xref> further shows the normalized duration under different tonal categories classified by &#x003C3;(2) (Changsha and Chengdu dialects) and &#x003C3;(1) (Fuzhou and Xiamen dialects). The specific values from <xref ref-type="fig" rid="F3">Figure 3</xref> are listed in <xref ref-type="table" rid="T2">Table 2</xref>.</p>
<fig id="F2" position="float">
<label>Figure 2</label>
<caption><p>The normalized duration of &#x003C3;(1) and &#x003C3;(2) in disyllabic words of Changsha <bold>(A)</bold>, Chengdu <bold>(B)</bold>, Fuzhou <bold>(C)</bold>, and Xiamen dialect <bold>(D)</bold>.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpsyg-13-945973-g0002.tif"/>
</fig>
<fig id="F3" position="float">
<label>Figure 3</label>
<caption><p>The normalized duration of &#x003C3;(1) and &#x003C3;(2) with different tonal categories in Changsha <bold>(A)</bold>, Chengdu <bold>(B)</bold>, Fuzhou <bold>(C)</bold>, and Xiamen dialect <bold>(D)</bold>. Asterisks (&#x0002A;&#x0002A;&#x0002A;) stand for <italic>p</italic> &#x0003C; 0.001. The &#x0201C;n.s.&#x0201D; stands for a <italic>p</italic>-value higher than 0.05.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpsyg-13-945973-g0003.tif"/>
</fig>
<table-wrap position="float" id="T2">
<label>Table 2</label>
<caption><p>The mean normalized duration (in z-score) and standard deviation (in brackets) of &#x003C3;(1) and &#x003C3;(2), grouped by the tonal category of &#x003C3;(2) in Changsha and Chengdu dialects and by the tonal category of &#x003C3;(1) in Fuzhou and Xiamen dialects.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th valign="top" align="left"><bold>Tonal category</bold></th>
<th valign="top" align="center" colspan="2"><bold>Changsha</bold></th>
<th valign="top" align="center" colspan="2"><bold>Chengdu</bold></th>
<th valign="top" align="center"><bold>Tonal category</bold></th>
<th valign="top" align="center" colspan="2"><bold>Fuzhou</bold></th>
<th valign="top" align="center" colspan="2"><bold>Xiamen</bold></th>
</tr>
<tr>
<th/>
<th valign="top" align="center"><bold>&#x003C3;(1)</bold></th>
<th valign="top" align="center"><bold>&#x003C3;(2)</bold></th>
<th valign="top" align="center"><bold>&#x003C3;(1)</bold></th>
<th valign="top" align="center"><bold>&#x003C3;(2)</bold></th>
<th/>
<th valign="top" align="center"><bold>&#x003C3;(1)</bold></th>
<th valign="top" align="center"><bold>&#x003C3;(2)</bold></th>
<th valign="top" align="center"><bold>&#x003C3;(1)</bold></th>
<th valign="top" align="center"><bold>&#x003C3;(2)</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">TX-T1</td>
<td valign="top" align="center">0.68 (0.70)</td>
<td valign="top" align="center">&#x02212;0.57 (0.57)</td>
<td valign="top" align="center">0.10 (1.15)</td>
<td valign="top" align="center">0.03 (1.04)</td>
<td valign="top" align="center">T1-TX</td>
<td valign="top" align="center">&#x02212;0.92 (0.35)</td>
<td valign="top" align="center">0.73 (0.54)</td>
<td valign="top" align="center">&#x02212;0.75 (0.63)</td>
<td valign="top" align="center">0.43 (0.92)</td>
</tr>
<tr>
<td valign="top" align="left">TX-T2</td>
<td valign="top" align="center">0.85 (0.75)</td>
<td valign="top" align="center">&#x02212;0.41 (0.67)</td>
<td valign="top" align="center">0.13 (0.86)</td>
<td valign="top" align="center">&#x02212;0.12 (0.93)</td>
<td valign="top" align="center">T2-TX</td>
<td valign="top" align="center">&#x02212;0.70 (0.43)</td>
<td valign="top" align="center">1.05 (0.49)</td>
<td valign="top" align="center">&#x02212;0.51 (0.81)</td>
<td valign="top" align="center">0.67 (0.92)</td>
</tr>
<tr>
<td valign="top" align="left">TX-T3</td>
<td valign="top" align="center">0.62 (0.74)</td>
<td valign="top" align="center">&#x02212;0.67 (0.49)</td>
<td valign="top" align="center">&#x02212;0.09 (1.04)</td>
<td valign="top" align="center">&#x02212;0.20 (0.90)</td>
<td valign="top" align="center">T3-TX</td>
<td valign="top" align="center">&#x02212;0.82 (0.48)</td>
<td valign="top" align="center">1.03 (0.52)</td>
<td valign="top" align="center">&#x02212;0.56 (0.74)</td>
<td valign="top" align="center">0.46 (1.01)</td>
</tr>
<tr>
<td valign="top" align="left">TX-T4</td>
<td valign="top" align="center">0.80 (0.75)</td>
<td valign="top" align="center">&#x02212;1.05 (0.51)</td>
<td valign="top" align="center">0.13 (1.06)</td>
<td valign="top" align="center">0.04 (0.97)</td>
<td valign="top" align="center">T4-TX</td>
<td valign="top" align="center">&#x02212;0.92 (0.42)</td>
<td valign="top" align="center">0.80 (0.59)</td>
<td valign="top" align="center">&#x02212;0.55 (0.65)</td>
<td valign="top" align="center">0.47 (0.85)</td>
</tr>
<tr>
<td valign="top" align="left">TX-T5</td>
<td valign="top" align="center">0.72 (0.84)</td>
<td valign="top" align="center">&#x02212;0.88 (0.59)</td>
<td/>
<td/>
<td valign="top" align="center">T5-TX</td>
<td valign="top" align="center">&#x02212;1.01 (0.37)</td>
<td valign="top" align="center">0.76 (0.51)</td>
<td valign="top" align="center">&#x02212;0.39 (0.64)</td>
<td valign="top" align="center">0.73 (1.02)</td>
</tr>
<tr>
<td valign="top" align="left">TX-T6</td>
<td valign="top" align="center">0.71 (0.72)</td>
<td valign="top" align="center">&#x02212;0.78 (0.61)</td>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p><italic>&#x003C3;(1) = the initial syllable; &#x003C3;(2) = the final syllable</italic>.</p>
</table-wrap-foot>
</table-wrap>
<sec>
<title>Left-Dominant Structure: Changsha Dialect</title>
<sec>
<title>Duration Realization</title>
<p>The distribution of normalized duration of both syllables is shown in <xref ref-type="fig" rid="F2">Figure 2A</xref>, indicating that the initial syllable &#x003C3;(1) tended to be longer than the final syllable &#x003C3;(2) in Changsha dialect. One-way ANOVA showed a significant difference in duration between the two syllables [<italic>F</italic><sub>(1, 1,798)</sub> = 2,033, <italic>p</italic> &#x0003C; 0.001]. Specifically, the &#x003C3;(1) to &#x003C3;(2) mean ratio of the absolute duration was about 1.53 (<italic>SD</italic> = 0.48) in Changsha dialect.</p>
<p>Furthermore, <xref ref-type="table" rid="T2">Table 2</xref> shows the mean values and standard deviations of normalized durations in two syllables when the &#x003C3;(2) carries different tonal categories. For example, TX-T1 stands for 6 tonal combinations: T1-T1, T2-T1, T3-T1, T4-T1, T5-T1, and T6-T1 (&#x0201C;T1&#x0201D; stands for &#x0201C;Tone 1,&#x0201D; etc. Abbreviations will be used below). As can be seen, all the mean normalized durations of &#x003C3;(2) were negative values, while those of &#x003C3;(1) were positive. Thus, compared with &#x003C3;(1), the duration in &#x003C3;(2) was phonetically reduced.</p>
<p>A linear mixed-effect regression model was constructed to test the normalized duration (logarithmic scale) difference of two syllables among the 6 tonal categories in &#x003C3;(2). There were two fixed factors; one was <italic>syllable</italic> [&#x003C3;(1) and &#x003C3;(2)], and the other was <italic>tonal category</italic> (TX-T1, TX-T2, TX-T3, TX-T4, TX-T5, and TX-T6). The <italic>participant</italic> (5 individuals) and <italic>word</italic> (180 words) were included as the random factors. The model comparison only showed a significant main effect of <italic>syllable</italic> [&#x003C7;<sup>2</sup>(1) = 12.91, <italic>p</italic> &#x0003C; 0.001], while the effect of <italic>tonal category</italic> [&#x003C7;<sup>2</sup>(5) = 2.63, <italic>p</italic> = 0.756] and the interaction effect of <italic>syllable</italic> &#x000D7; <italic>tonal category</italic> [&#x003C7;<sup>2</sup>(5) = 4.46, <italic>p</italic> = 0.486] failed to reach significance. These results indicated that the duration contrast was significant regardless of tonal categories in Changsha dialect (see <xref ref-type="fig" rid="F3">Figure 3A</xref>).</p></sec>
<sec>
<title>Pitch Realization</title>
<p>The pitch realizations of all the tonal combinations in Changsha dialect are listed in <xref ref-type="table" rid="T3">Table 3</xref>. As can be seen, the sandhi form usually appeared in &#x003C3;(2), while &#x003C3;(1) generally maintained its underlying pitch values (except for T3 [42] &#x02192; [44]). In most cases, &#x003C3;(2) lost its original contour and became a level tone in Changsha dialect.</p>
<table-wrap position="float" id="T3">
<label>Table 3</label>
<caption><p>The relative pitch values of lexical tones in Changsha disyllabic words.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th valign="top" align="left"><bold>&#x003C3;(1)/&#x003C3;(2)</bold></th>
<th valign="top" align="center"><bold>Tone1[34]</bold></th>
<th valign="top" align="center"><bold>Tone2[13]</bold></th>
<th valign="top" align="center"><bold>Tone3[42]</bold></th>
<th valign="top" align="center"><bold>Tone4[45]</bold></th>
<th valign="top" align="center"><bold>Tone5[21]</bold></th>
<th valign="top" align="center"><bold>Tone6[14]</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Tone1[34]</td>
<td valign="top" align="center">[34.<bold>33</bold>]</td>
<td valign="top" align="center">[34.<bold>33</bold>]</td>
<td valign="top" align="center">[34.<bold>44</bold>]</td>
<td valign="top" align="center">[34.<bold>44</bold>]</td>
<td valign="top" align="center">[34.21]</td>
<td valign="top" align="center">[34.<bold>44</bold>]</td>
</tr>
<tr>
<td valign="top" align="left">Tone2[13]</td>
<td valign="top" align="center">[13.<bold>33</bold>]</td>
<td valign="top" align="center">[13.<bold>33</bold>]</td>
<td valign="top" align="center">[13.<bold>44</bold>]</td>
<td valign="top" align="center">[13.<bold>44</bold>]</td>
<td valign="top" align="center">[13.21]</td>
<td valign="top" align="center">[13.<bold>44</bold>]</td>
</tr>
<tr>
<td valign="top" align="left">Tone3[42]</td>
<td valign="top" align="center">[<bold>44</bold>.<bold>33</bold>]</td>
<td valign="top" align="center">[<bold>44</bold>.<bold>33</bold>]</td>
<td valign="top" align="center">[<bold>44</bold>.<bold>44</bold>]</td>
<td valign="top" align="center">[<bold>44</bold>.<bold>44</bold>]</td>
<td valign="top" align="center">[<bold>44</bold>.21]</td>
<td valign="top" align="center">[<bold>44</bold>.<bold>44</bold>]</td>
</tr>
<tr>
<td valign="top" align="left">Tone4[45]</td>
<td valign="top" align="center">[45.<bold>33</bold>]</td>
<td valign="top" align="center">[45.<bold>33</bold>]</td>
<td valign="top" align="center">[45.<bold>44</bold>]</td>
<td valign="top" align="center">[45.<bold>44</bold>]</td>
<td valign="top" align="center">[45.21]</td>
<td valign="top" align="center">[45.<bold>44</bold>]</td>
</tr>
<tr>
<td valign="top" align="left">Tone5[21]</td>
<td valign="top" align="center">[21.<bold>33</bold>]</td>
<td valign="top" align="center">[21.<bold>33</bold>]</td>
<td valign="top" align="center">[21.<bold>44</bold>]</td>
<td valign="top" align="center">[21.<bold>44</bold>]</td>
<td valign="top" align="center">[21.21]</td>
<td valign="top" align="center">[21.<bold>44</bold>]</td>
</tr>
<tr>
<td valign="top" align="left">Tone6[14]</td>
<td valign="top" align="center">[24.<bold>33</bold>]</td>
<td valign="top" align="center">[24.<bold>33</bold>]</td>
<td valign="top" align="center">[24.<bold>44</bold>]</td>
<td valign="top" align="center">[24.<bold>44</bold>]</td>
<td valign="top" align="center">[24.21]</td>
<td valign="top" align="center">[24.<bold>44</bold>]</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p><italic>The dot &#x0201C;.&#x0201D; stands for syllable break, and numbers in bold indicate sandhi forms</italic>.</p>
</table-wrap-foot>
</table-wrap>
<p>In Changsha dialect, the sandhi form mainly emerged in the &#x003C3;(2), such as T1[34] &#x02192; [33], T2[13] &#x02192; [33], T3[42] &#x02192; [44], T4[45] &#x02192; [44], and T6[14] &#x02192; [44]. Only T5 in &#x003C3;(2) was realized as an underlying form. The consistent trend of tonal processes was that the underlying tone in &#x003C3;(2) lost its original contour, and was realized as a level tone. Moreover, the pitch value of these surface (level) tones was well below the highest pitch value of underlying rising tones. The pitch realizations in Changsha dialect are presented in <xref ref-type="fig" rid="F4">Figure 4</xref>.</p>
<fig id="F4" position="float">
<label>Figure 4</label>
<caption><p>The pitch realizations of &#x003C3;(1) and &#x003C3;(2) in Changsha dialect with tonal combinations of TX-T1 <bold>(A)</bold>, TX-T2 <bold>(B)</bold>, TX-T3 <bold>(C)</bold>, TX-T4 <bold>(D)</bold>, TX-T5 <bold>(E)</bold>, and TX-T6 <bold>(F)</bold>.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpsyg-13-945973-g0004.tif"/>
</fig>
<p>From <xref ref-type="fig" rid="F4">Figure 4</xref>, we can see that the surface level tones in &#x003C3;(2) were largely blended together regardless of the preceding tonal contexts (T1 to T6). To test the dependability of tonal aggregations in &#x003C3;(2), we built 6 linear mixed regression models with second-order orthogonal polynomials. The fixed factor was <italic>tonal context</italic>, and the random factors were <italic>participant</italic> and <italic>word</italic>. We predicted that the different tonal contexts in &#x003C3;(1) would not exert a significant influence on the intercept, slope, and curvature of target pitch contours in the &#x003C3;(2).</p>
<p>For TX-T1 in Changsha dialect, <italic>tonal context</italic> showed no significant effect on either pitch intercept or curvature in &#x003C3;(2) (<italic>ps</italic> &#x0003E; 0.05), but exerted a significant effect on pitch slope [&#x003C7;<sup>2</sup>(5) = 12.43, <italic>p</italic> &#x0003C; 0.05]. However, <italic>post-hoc</italic> pairwise comparisons did not reveal any slope difference among the surface tones in &#x003C3;(2) (<italic>ps</italic> &#x0003E; 0.05; <xref ref-type="supplementary-material" rid="SM1">Supplementary Table 2a</xref>). In addition, the <italic>tonal context</italic> in &#x003C3;(1) exerted no significant effect on the pitch intercept, slope, or curvature in &#x003C3;(2) of either TX-T2 or TX-T3 in Changsha dialect (<italic>ps</italic> &#x0003E; 0.05).</p>
<p>For TX-T4 in Changsha dialect, the <italic>tonal context</italic> showed no significant influence on the pitch intercept and curvature in &#x003C3;(2) (<italic>ps</italic> &#x0003E; 0.05), but a significant effect on pitch slope [&#x003C7;<sup>2</sup>(5) = 29.28, <italic>p</italic> &#x0003C; 0.001]. Specifically, the pitch contour in &#x003C3;(2) of T5-T4 had a more rising trend than that of T1-T4, T3-T4, T4-T4, and T6-T4 (<italic>ps</italic> &#x0003C; 0.05; <xref ref-type="supplementary-material" rid="SM1">Supplementary Table 2b</xref>).</p>
<p>Moreover, for TX-T5 in Changsha dialect, the <italic>tonal context</italic> only showed a significant effect on the pitch slope in &#x003C3;(2) [&#x003C7;<sup>2</sup>(5) = 11.35, <italic>p</italic> &#x0003C; 0.05]. <italic>Post-hoc</italic> pairwise analysis suggested that the pitch contour in &#x003C3;(2) of T5-T5 had a more moderate falling trend than that of T4-T5 (&#x003B2; = 0.94, <italic>SE</italic> = 0.31, <italic>t</italic> = 3.00, <italic>p</italic> &#x0003C; 0.05). Similarly, for TX-T6 in Changsha dialect, the <italic>tonal context</italic> only showed a significant influence on pitch slope [&#x003C7;<sup>2</sup>(5) = 36.05, <italic>p</italic> &#x0003C; 0.001]. <italic>Post-hoc</italic> pairwise analysis indicated that the pitch contour in &#x003C3;(2) of T5-T6 had a more rising trend than that under any other tonal contexts, <italic>ps</italic> &#x0003C; 0.01 (<xref ref-type="supplementary-material" rid="SM1">Supplementary Table 2c</xref>).</p>
<p>To conclude, the fine-grained analyses revealed that, although the preceding tonal context of T5 ([21]) mainly caused the subtly different pitch slope in &#x003C3;(2), the majority of pitch height and curvature in &#x003C3;(2), as indicated by the intercept and quadratic term, showed no significant differences under different tonal contexts. These results indicated that surface tones in &#x003C3;(2) were largely overlapping level tones with similar pitch height in Changsha dialect.</p></sec></sec>
<sec>
<title>Left-Dominant Structure: Chengdu Dialect</title>
<sec>
<title>Duration Realization</title>
<p>The distribution of normalized duration between &#x003C3;(1) and &#x003C3;(2) is shown in <xref ref-type="fig" rid="F2">Figure 2B</xref>. The result of one-way ANOVA indicated that the normalized duration difference between &#x003C3;(1) and &#x003C3;(2) was non-significant [<italic>F</italic><sub>(1, 798)</sub> = 3.39, <italic>p</italic> = 0.066]. Moreover, the mean absolute duration ratio of &#x003C3;(1) to &#x003C3;(2) was 1.03 (<italic>SD</italic> = 0.15) in Chengdu dialect.</p>
<p>To show the duration difference between the two syllables when &#x003C3;(2) carries different tonal categories, the mean normalized durations and standard deviations are listed in <xref ref-type="table" rid="T2">Table 2</xref>. Because Chengdu dialect has a four-tone inventory, TX-T1 represents the tonal combinations T1-T1, T2-T1, T3-T1, and T4-T1.</p>
<p>A linear mixed-effect model was built to test the normalized duration (logarithmic scale) difference of two syllables among the 4 tonal categories in Chengdu dialect. The two fixed factors were <italic>syllable</italic> [&#x003C3;(1) and &#x003C3;(2)] and <italic>tonal category</italic> (TX-T1, TX-T2, TX-T3, and TX-T4), and the random factors were <italic>participant</italic> (5 individuals) and <italic>word</italic> (80 words). After model comparisons, neither the main effects of <italic>syllable</italic> [&#x003C7;<sup>2</sup>(1) = 3.76, <italic>p</italic> = 0.053] and <italic>tonal category</italic> [&#x003C7;<sup>2</sup>(3) = 4.91, <italic>p</italic> = 0.179], nor the interaction effect of <italic>syllable</italic> &#x000D7; <italic>tonal category</italic> reached significance [&#x003C7;<sup>2</sup>(3) = 1.61, <italic>p</italic> = 0.658]. Therefore, the duration between &#x003C3;(1) and &#x003C3;(2) in Chengdu dialect was generally comparable among different tonal categories (see <xref ref-type="fig" rid="F3">Figure 3B</xref>).</p></sec>
<sec>
<title>Pitch Realization</title>
<p>The pitch realizations of all the tonal combinations in Chengdu dialect are shown in <xref ref-type="table" rid="T4">Table 4</xref>. The sandhi forms appeared in both &#x003C3;(1) and &#x003C3;(2). To be specific, in &#x003C3;(1), the surface tonal representation of T2 ([31]) was [33], and the surface form of T3 ([53]) was [45]. Furthermore, for the sandhi forms in &#x003C3;(2), T1, T2 (in the tonal sequence of T2-T2), and T4 lost their underlying contours and became level tones on the surface. For example, T1 changed from [35] to [33], T2 (when preceded by another T2) changed from [31] to [33], and T4 changed from [23] to [22].</p>
<table-wrap position="float" id="T4">
<label>Table 4</label>
<caption><p>The relative pitch values of lexical tones in Chengdu disyllabic words.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th valign="top" align="left"><bold>&#x003C3;(1)/&#x003C3;(2)</bold></th>
<th valign="top" align="center"><bold>Tone1[35]</bold></th>
<th valign="top" align="center"><bold>Tone2[31]</bold></th>
<th valign="top" align="center"><bold>Tone3[53]</bold></th>
<th valign="top" align="center"><bold>Tone4[23]</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Tone1[35]</td>
<td valign="top" align="center">[35.<bold>33</bold>]</td>
<td valign="top" align="center">[35.31]</td>
<td valign="top" align="center">[35.53]</td>
<td valign="top" align="center">[35.<bold>22</bold>]</td>
</tr>
<tr>
<td valign="top" align="left">Tone2[31]</td>
<td valign="top" align="center">[<bold>33.33</bold>]</td>
<td valign="top" align="center">[<bold>33.33</bold>]</td>
<td valign="top" align="center">[<bold>33</bold>.<bold>31</bold>]</td>
<td valign="top" align="center">[<bold>33</bold>.<bold>22</bold>]</td>
</tr>
<tr>
<td valign="top" align="left">Tone3[53]</td>
<td valign="top" align="center">[<bold>45.33</bold>]</td>
<td valign="top" align="center">[<bold>45</bold>.31]</td>
<td valign="top" align="center">[<bold>45</bold>.53]</td>
<td valign="top" align="center">[<bold>45</bold>.<bold>22</bold>]</td>
</tr>
<tr>
<td valign="top" align="left">Tone4[23]</td>
<td valign="top" align="center">[23.<bold>33</bold>]</td>
<td valign="top" align="center">[23.31]</td>
<td valign="top" align="center">[23.53]</td>
<td valign="top" align="center">[23.<bold>22</bold>]</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p><italic>The dot &#x0201C;.&#x0201D; indicates the syllable break, and numbers in bold indicate sandhi forms</italic>.</p>
</table-wrap-foot>
</table-wrap>
<p>The pitch contours of different tonal combinations in Chengdu dialect are drawn in <xref ref-type="fig" rid="F5">Figure 5</xref>. To verify the reliability of pitch values in <xref ref-type="table" rid="T4">Table 4</xref>, we built four linear mixed regression models with second-order polynomials to compare all the pitch contours in &#x003C3;(2). The fixed factor of the models was the <italic>tonal context</italic> (T1-T4), and the random factors were <italic>participant</italic> and <italic>word</italic>.</p>
<fig id="F5" position="float">
<label>Figure 5</label>
<caption><p>The pitch realizations of &#x003C3;(1) and &#x003C3;(2) in Chengdu dialect with tonal combinations of TX-T1 <bold>(A)</bold>, TX-T2 <bold>(B)</bold>, TX-T3 <bold>(C)</bold>, and TX-T4 <bold>(D)</bold>.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpsyg-13-945973-g0005.tif"/>
</fig>
<p>For TX-T1 in Chengdu dialect, the model comparisons showed that the <italic>tonal context</italic> did not significantly affect the pitch intercept, slope, or curvature in &#x003C3;(2) (<italic>ps</italic> &#x0003E; 0.05). For TX-T2 in Chengdu dialect, there was a significant main effect of <italic>tonal context</italic> on both pitch intercept [&#x003C7;<sup>2</sup>(3) = 45.38, <italic>p</italic> &#x0003C; 0.001] and slope [&#x003C7;<sup>2</sup>(3) = 47.02, <italic>p</italic> &#x0003C; 0.001] in &#x003C3;(2). In terms of the pitch intercept, the <italic>post-hoc</italic> test indicated that the overall pitch height ([33]) in &#x003C3;(2) of T2-T2 was higher than that ([31]) of T4-T2 (&#x003B2; = 0.60, <italic>SE</italic> = 0.17, <italic>t</italic> = 3.51, <italic>p</italic> &#x0003C; 0.01). As for the pitch slope in &#x003C3;(2), the surface pitch contour ([33]) of T2-T2 was flatter than that ([31]) of T1-T2, T3-T2, and T4-T2 (<italic>ps</italic> &#x0003C; 0.001; <xref ref-type="supplementary-material" rid="SM1">Supplementary Table 3a</xref>).</p>
<p>For TX-T3 in Chengdu dialect, the results showed that the <italic>tonal context</italic> exerted impacts on both pitch intercept [&#x003C7;<sup>2</sup>(3) = 8.26, <italic>p</italic> &#x0003C; 0.05], and pitch slope [&#x003C7;<sup>2</sup>(3) = 14.72, <italic>p</italic> &#x0003C; 0.01]. <italic>Post-hoc</italic> pairwise comparisons on the pitch intercept were carried out, and results showed that the pitch height ([31]) in &#x003C3;(2) of T2-T3 was lower than that ([53]) of T1-T3 (&#x003B2; = &#x02212;0.60, <italic>SE</italic> = 0.21, <italic>t</italic> = &#x02212;2.91, <italic>p</italic> &#x0003C; 0.05). Besides, the <italic>post-hoc</italic> test on the pitch slope indicated that the pitch contour ([31]) in &#x003C3;(2) of T2-T3 had a more moderate falling trend than that in other tonal contexts (<italic>ps</italic> &#x0003C; 0.05; <xref ref-type="supplementary-material" rid="SM1">Supplementary Table 3b</xref>). Furthermore, for TX-T4 in Chengdu dialect, the model comparisons showed that the <italic>tonal context</italic> only exerted a significant influence on pitch slope in &#x003C3;(2) [&#x003C7;<sup>2</sup>(3) = 8.69, <italic>p</italic> &#x0003C; 0.05]. However, the <italic>post-hoc</italic> pairwise analysis did not show the pitch slope differences in &#x003C3;(2) (<italic>ps</italic> &#x0003E; 0.05; <xref ref-type="supplementary-material" rid="SM1">Supplementary Table 3c</xref>).</p>
<p>In summary, the results of the growth curve analysis (GCA) indicated that the surface forms in &#x003C3;(2) of T1 and T4 were realized as level tones (i.e., [33] and [22], respectively), and the tonal representation in &#x003C3;(2) of T2-T2 was also a level tone [33] in Chengdu dialect. Furthermore, the other underlying tones of &#x003C3;(2) were realized as falling tones with different pitch heights in Chengdu dialect.</p></sec></sec>
<sec>
<title>Right-Dominant Structure: Fuzhou Dialect</title>
<sec>
<title>Duration Realization</title>
<p>The distribution of normalized duration between the two syllables in Fuzhou dialect is shown in <xref ref-type="fig" rid="F2">Figure 2C</xref>. The result of one-way ANOVA showed that the difference in duration between &#x003C3;(1) and &#x003C3;(2) was significant [<italic>F</italic><sub>(1, 1,248)</sub> = 3,990, <italic>p</italic> &#x0003C; 0.001]. In addition, the mean &#x003C3;(1) to &#x003C3;(2) ratio of the absolute duration was around 0.57 (<italic>SD</italic> = 0.11) in Fuzhou dialect, indicating that the duration of &#x003C3;(2) was significantly longer than that of &#x003C3;(1).</p>
<p><xref ref-type="table" rid="T2">Table 2</xref> lists the mean values and standard deviations of the normalized duration in different tonal categories of &#x003C3;(1) in Fuzhou dialect. The T1-TX represents tonal combinations of T1-T1, T1-T2, T1-T3, T1-T4, and T1-T5. Generally, the mean normalized durations of &#x003C3;(1) were negative values, while those of &#x003C3;(2) were positive values.</p>
<p>We then built a linear mixed-effect model to test the difference in normalized duration (logarithmic scale) statistically. The fixed factors were <italic>syllable</italic> [&#x003C3;(1) and &#x003C3;(2)] and <italic>tonal category</italic> (T1-TX, T2-TX, T3-TX, T4-TX, and T5-TX). In addition, <italic>participant</italic> (5 individuals) and <italic>word</italic> (125 words) were included as the random factors. The model comparison only showed a significant main effect of <italic>syllable</italic> [&#x003C7;<sup>2</sup>(1) = 12.69, <italic>p</italic> &#x0003C; 0.001]. Neither the effect of <italic>tonal category</italic> [&#x003C7;<sup>2</sup>(4) = 5.09, <italic>p</italic> = 0.279] nor the interaction effect of <italic>syllable</italic> &#x000D7; <italic>tonal category</italic> [&#x003C7;<sup>2</sup>(4) = 3.01, <italic>p</italic> = 0.556] reached significance. Thus, the duration contrast between the two syllables in Fuzhou dialect was significant across different tonal categories (see <xref ref-type="fig" rid="F3">Figure 3C</xref>).</p></sec>
<sec>
<title>Pitch Realization</title>
<p>The surface tones of all the tonal combinations of disyllabic words in Fuzhou dialect are presented in <xref ref-type="table" rid="T5">Table 5</xref>. There is a noteworthy phenomenon in the pitch realization of Fuzhou dialect, that is, the surface tonal representations of certain tonal combinations are the same. Specifically, apart from checked tones, the pitch values of surface tones in T1-TX, T4-TX, and T5-TX were similar.</p>
<table-wrap position="float" id="T5">
<label>Table 5</label>
<caption><p>The relative pitch values of lexical tones in Fuzhou disyllabic words.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th valign="top" align="left"><bold>&#x003C3;(1)/&#x003C3;(2)</bold></th>
<th valign="top" align="center"><bold>Tone1[44]</bold></th>
<th valign="top" align="center"><bold>Tone2[51]</bold></th>
<th valign="top" align="center"><bold>Tone3[32]</bold></th>
<th valign="top" align="center"><bold>Tone4[21]</bold></th>
<th valign="top" align="center"><bold>Tone5[231]</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Tone1[44]</td>
<td valign="top" align="center">[44.44]</td>
<td valign="top" align="center">[44.51]</td>
<td valign="top" align="center">[<bold>52</bold>.32]</td>
<td valign="top" align="center">[<bold>52</bold>.21]</td>
<td valign="top" align="center">[<bold>52</bold>.231]</td>
</tr>
<tr>
<td valign="top" align="left">Tone2[51]</td>
<td valign="top" align="center">[<bold>44</bold>.44]</td>
<td valign="top" align="center">[<bold>32</bold>.51]</td>
<td valign="top" align="center">[<bold>32</bold>.32]</td>
<td valign="top" align="center">[<bold>21</bold>.21]</td>
<td valign="top" align="center">[<bold>21</bold>.231]</td>
</tr>
<tr>
<td valign="top" align="left">Tone3[32]</td>
<td valign="top" align="center">[32.44]</td>
<td valign="top" align="center">[32.51]</td>
<td valign="top" align="center">[<bold>24</bold>.32]</td>
<td valign="top" align="center">[<bold>44</bold>.21]</td>
<td valign="top" align="center">[<bold>44</bold>.231]</td>
</tr>
<tr>
<td valign="top" align="left">Tone4[21]</td>
<td valign="top" align="center">[<bold>44</bold>.44]</td>
<td valign="top" align="center">[<bold>44</bold>.51]</td>
<td valign="top" align="center">[<bold>52</bold>.32]</td>
<td valign="top" align="center">[<bold>52</bold>.21]</td>
<td valign="top" align="center">[<bold>52</bold>.231]</td>
</tr>
<tr>
<td valign="top" align="left">Tone5[231]</td>
<td valign="top" align="center">[<bold>44</bold>.44]</td>
<td valign="top" align="center">[<bold>44</bold>.51]</td>
<td valign="top" align="center">[<bold>52</bold>.32]</td>
<td valign="top" align="center">[<bold>52</bold>.21]</td>
<td valign="top" align="center">[<bold>52</bold>.231]</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p><italic>The dot &#x0201C;.&#x0201D; stands for syllable break, and numbers in bold indicate sandhi forms</italic>.</p>
</table-wrap-foot>
</table-wrap>
<p>When T1, T4, and T5 were in &#x003C3;(1), their sandhi forms could be described as (2a). The pattern was that the pitch in &#x003C3;(1) was affected by the onset pitch height in &#x003C3;(2). If the onset pitch height in &#x003C3;(2) was low ([32]/[21]/[231]), the sandhi form of &#x003C3;(1) was [52], ending with a low pitch value [2] accordingly. If the onset pitch height of &#x003C3;(2) was high ([44]/[51]), then the sandhi form of &#x003C3;(1) was [44], ending with a high pitch value [4]. In addition, the sandhi forms of T2 in &#x003C3;(1) could be described as (2b). The pitch height of &#x003C3;(1) was determined to a large degree by the onset pitch height of &#x003C3;(2). Overall, the direction of tonal assimilation was leftward in Fuzhou dialect: the pitch of the right syllable &#x003C3;(2) was likely to determine the pitch form of &#x003C3;(1) on the left.</p>
<p><inline-graphic xlink:href="fpsyg-13-945973-i0002.tif"/></p>
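<p>The pattern in (2a) can be sketched as a small lookup. The sketch below is illustrative only: the surface forms are copied from Table 5, while the function names, the dictionary layout, and the cut-off that treats onset levels 4 and 5 as high are assumptions introduced here to mirror the grouping of [44]/[51] vs. [32]/[21]/[231] described above.</p>
<preformat>
```python
# Citation forms of the five non-checked Fuzhou tones (Table 5 headers).
CITATION = {"T1": "44", "T2": "51", "T3": "32", "T4": "21", "T5": "231"}

# Surface form of sigma(1) for each sigma(1)/sigma(2) pair, copied from Table 5.
TABLE5_SIGMA1 = {
    "T1": {"T1": "44", "T2": "44", "T3": "52", "T4": "52", "T5": "52"},
    "T2": {"T1": "44", "T2": "32", "T3": "32", "T4": "21", "T5": "21"},
    "T3": {"T1": "32", "T2": "32", "T3": "24", "T4": "44", "T5": "44"},
    "T4": {"T1": "44", "T2": "44", "T3": "52", "T4": "52", "T5": "52"},
    "T5": {"T1": "44", "T2": "44", "T3": "52", "T4": "52", "T5": "52"},
}


def onset_is_high(tone):
    """Assumed cut-off: a citation form starting at level 4 or 5 has a high onset."""
    return int(CITATION[tone][0]) >= 4


def rule_2a(sigma2):
    """Predicted sigma(1) surface form for underlying T1, T4, or T5."""
    return "44" if onset_is_high(sigma2) else "52"


def check_rule_2a():
    """Return every (sigma1, sigma2) pair where the rule disagrees with Table 5."""
    mismatches = []
    for sigma1 in ("T1", "T4", "T5"):
        for sigma2 in CITATION:
            if TABLE5_SIGMA1[sigma1][sigma2] != rule_2a(sigma2):
                mismatches.append((sigma1, sigma2))
    return mismatches
```
</preformat>
<p>Under these assumptions, <italic>check_rule_2a</italic> returns an empty list, i.e., the onset-height rule reproduces all fifteen T1-TX, T4-TX, and T5-TX cells of Table 5. The T2 and T3 rows are stored only for reference and are not covered by this rule.</p>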
<p><xref ref-type="fig" rid="F6">Figure 6</xref> depicts the pitch contours of tonal combinations of T1-TX, T2-TX, T3-TX, T4-TX, and T5-TX in Fuzhou dialect. Then, five linear mixed-effect models with second-order orthogonal polynomials were constructed to compare the pitch of &#x003C3;(1). The fixed factor was the <italic>tonal context</italic>, and the random factors were <italic>participant</italic> and <italic>word</italic>. It was assumed that for T1, T4, and T5 in &#x003C3;(1), their sandhi forms [52] and [44] might be different in terms of both pitch intercept and slope. Likewise, the three sandhi forms of T2 (i.e., [21]/[32]/[44]) in &#x003C3;(1) were expected to differ in both pitch intercept and slope.</p>
<fig id="F6" position="float">
<label>Figure 6</label>
<caption><p>The pitch realizations of &#x003C3;(1) and &#x003C3;(2) in Fuzhou dialect with tonal combinations of T1-TX <bold>(A)</bold>, T2-TX <bold>(B)</bold>, T3-TX <bold>(C)</bold>, T4-TX <bold>(D)</bold>, and T5-TX <bold>(E)</bold>.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpsyg-13-945973-g0006.tif"/>
</fig>
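<p>To make the terms pitch intercept, slope, and curvature concrete, the following minimal sketch builds second-order orthogonal polynomial time terms and fits them to a single pitch contour by ordinary least squares. It is not the mixed-effect analysis reported here: there are no random effects for participant or word, and the function names and the use of NumPy are assumptions made for illustration.</p>
<preformat>
```python
import numpy as np


def orthogonal_time_terms(n_points):
    """Return unit-length linear and quadratic time terms that are orthogonal
    to each other and to a constant (intercept) column."""
    time = np.arange(n_points, dtype=float)
    raw = np.vander(time, 3, increasing=True)  # columns: 1, t, t^2
    q, r = np.linalg.qr(raw)
    # QR leaves the sign of each column arbitrary; flip the columns so the
    # linear term rises over time and the quadratic term is U-shaped.
    q = q * np.sign(np.diag(r))
    return q[:, 1], q[:, 2]


def fit_contour(pitch):
    """Least-squares intercept, slope, and curvature of one pitch contour."""
    pitch = np.asarray(pitch, dtype=float)
    linear, quadratic = orthogonal_time_terms(len(pitch))
    design = np.column_stack([np.ones(len(pitch)), linear, quadratic])
    coefficients = np.linalg.lstsq(design, pitch, rcond=None)[0]
    intercept, slope, curvature = coefficients
    return intercept, slope, curvature
```
</preformat>
<p>Because the time terms are orthogonal to the constant column, the fitted intercept equals the mean pitch of the contour. A level contour therefore yields a slope and curvature near zero, whereas a steadily falling contour yields a negative slope, which is the kind of contrast tested between the [44] and [52] forms of &#x003C3;(1).</p>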
<p>For T1-TX in Fuzhou dialect, the model comparison showed that pitch contours in &#x003C3;(1) were different in terms of pitch slope [&#x003C7;<sup>2</sup>(4) = 32.51, <italic>p</italic> &#x0003C; 0.001] and pitch curvature [&#x003C7;<sup>2</sup>(4) = 17.43, <italic>p</italic> &#x0003C; 0.01]. <italic>Post-hoc</italic> pairwise analysis on pitch slope showed that the pitch contour ([44]) in &#x003C3;(1) of T1-T1 and T1-T2 had a flatter trend than that ([52]) of T1-T3, T1-T4, and T1-T5 (<italic>ps</italic> &#x0003C; 0.01; <xref ref-type="supplementary-material" rid="SM1">Supplementary Table 4a</xref>).</p>
<p>Similarly, for T4-TX in Fuzhou dialect, the <italic>tonal context</italic> exerted a significant effect on both pitch slope [&#x003C7;<sup>2</sup>(4) = 48.71, <italic>p</italic> &#x0003C; 0.001] and pitch curvature [&#x003C7;<sup>2</sup>(4) = 20.21, <italic>p</italic> &#x0003C; 0.001] in &#x003C3;(1). To be more specific, <italic>post-hoc</italic> pairwise comparisons on pitch slope showed that the pitch contour ([44]) in &#x003C3;(1) of T4-T1 and T4-T2 had a flatter trend than that ([52]) of T4-T3, T4-T4, and T4-T5 (<italic>ps</italic> &#x0003C; 0.05; <xref ref-type="supplementary-material" rid="SM1">Supplementary Table 4b</xref>).</p>
<p>Moreover, for T5-TX in Fuzhou dialect, the main effects of the <italic>tonal context</italic> on pitch slope [&#x003C7;<sup>2</sup>(4) = 20.38, <italic>p</italic> &#x0003C; 0.001] and pitch curvature [&#x003C7;<sup>2</sup>(4) = 26.28, <italic>p</italic> &#x0003C; 0.001] were found again. <italic>Post-hoc</italic> pairwise comparisons on pitch slope were carried out (<xref ref-type="supplementary-material" rid="SM1">Supplementary Table 4c</xref>). Compared with T5-T3, T5-T4, and T5-T5, the pitch contour ([44]) in &#x003C3;(1) of T5-T1 and T5-T2 was significantly flatter (<italic>ps</italic> &#x0003C; 0.05).</p>
<p>In addition, the model comparison showed that T2&#x00027;s sandhi forms ([21]/[32]/[44]) in Fuzhou dialect differed from each other in both pitch intercept [&#x003C7;<sup>2</sup>(4) = 45.28, <italic>p</italic> &#x0003C; 0.001] and pitch slope [&#x003C7;<sup>2</sup>(4) = 16.11, <italic>p</italic> &#x0003C; 0.01]. The <italic>post-hoc</italic> pairwise analysis on pitch intercept in &#x003C3;(1) was carried out (see <xref ref-type="supplementary-material" rid="SM1">Supplementary Table 4d</xref>). Results showed that compared with T2-T3, T2-T4, and T2-T5, the pitch height in &#x003C3;(1) of T2-T1 ([44]) was significantly higher (<italic>ps</italic> &#x0003C; 0.01).</p>
<p>To conclude, in Fuzhou dialect, statistical results suggested that the two sandhi forms ([52]/[44]) of T1, T4, and T5 in &#x003C3;(1) were conditioned by the following tonal contexts mainly in terms of pitch slope, rather than pitch height. Furthermore, the surface tonal representation of T2 ([21]/[32]/[44]) was strictly modulated by the pitch height of the following tonal contexts.</p></sec></sec>
<sec>
<title>Right-Dominant Structure: Xiamen Dialect</title>
<sec>
<title>Duration Realization</title>
<p>The distribution of normalized duration in Xiamen dialect is shown in <xref ref-type="fig" rid="F2">Figure 2D</xref>. The result of one-way ANOVA showed that the difference in duration between the two syllabic positions was significant [<italic>F</italic><sub>(1, 1248)</sub> = 542.60, <italic>p</italic> &#x0003C; 0.001]. Besides, the mean duration ratio in Xiamen dialect was about 0.83 (<italic>SD</italic> = 0.14).</p>
<p>Divided by tone categories in &#x003C3;(1), the mean normalized durations and standard deviations of two syllables are listed in <xref ref-type="table" rid="T2">Table 2</xref>. The mean normalized durations of &#x003C3;(1) were negative values, while those of &#x003C3;(2) were positive values. Furthermore, the standard deviations in &#x003C3;(2) were greater than those in &#x003C3;(1) in Xiamen dialect.</p>
<p>A linear mixed-effect model was constructed to test the duration pattern (logarithmic scale) across the five tonal categories. The <italic>syllable</italic> [&#x003C3;(1) and &#x003C3;(2)] and <italic>tonal category</italic> (T1-TX, T2-TX, T3-TX, T4-TX, and T5-TX) were set as the fixed factors, and the <italic>participant</italic> (5 individuals) and <italic>word</italic> (125 words) were included as the random factors. The result showed a significant effect of <italic>syllable</italic> [&#x003C7;<sup>2</sup>(1) = 31.25, <italic>p</italic> &#x0003C; 0.001]. However, <italic>tonal category</italic> did not show an effect on duration [&#x003C7;<sup>2</sup>(4) = 7.29, <italic>p</italic> = 0.121]. The interaction effect of <italic>syllable</italic> &#x000D7; <italic>tonal category</italic> was not significant [&#x003C7;<sup>2</sup>(4) = 9.34, <italic>p</italic> = 0.053]. Results indicated that the duration difference between &#x003C3;(1) and &#x003C3;(2) was significant across tonal categories in Xiamen dialect (see <xref ref-type="fig" rid="F3">Figure 3D</xref>).</p></sec>
<sec>
<title>Pitch Realization</title>
<p>The pitch realizations of all tonal combinations in Xiamen dialect are listed in <xref ref-type="table" rid="T6">Table 6</xref>. As can be seen, the &#x003C3;(2) maintained its underlying pitch form, yet the underlying tone in &#x003C3;(1) was realized as its sandhi form, similar to that in Fuzhou dialect.</p>
<table-wrap position="float" id="T6">
<label>Table 6</label>
<caption><p>The relative pitch values of lexical tones in Xiamen disyllabic words.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th valign="top" align="left"><bold>&#x003C3;(1)/&#x003C3;(2)</bold></th>
<th valign="top" align="center"><bold>Tone1[44]</bold></th>
<th valign="top" align="center"><bold>Tone2[24]</bold></th>
<th valign="top" align="center"><bold>Tone3[53]</bold></th>
<th valign="top" align="center"><bold>Tone4[21]</bold></th>
<th valign="top" align="center"><bold>Tone5[22]</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Tone1[44]</td>
<td valign="top" align="center">[<bold>33</bold>.44]</td>
<td valign="top" align="center">[<bold>33</bold>.24]</td>
<td valign="top" align="center">[<bold>33</bold>.53]</td>
<td valign="top" align="center">[<bold>33</bold>.21]</td>
<td valign="top" align="center">[<bold>33</bold>.22]</td>
</tr>
<tr>
<td valign="top" align="left">Tone2[24]</td>
<td valign="top" align="center">[<bold>33</bold>.44]</td>
<td valign="top" align="center">[<bold>33</bold>.24]</td>
<td valign="top" align="center">[<bold>33</bold>.53]</td>
<td valign="top" align="center">[<bold>33</bold>.21]</td>
<td valign="top" align="center">[<bold>33</bold>.22]</td>
</tr>
<tr>
<td valign="top" align="left">Tone3[53]</td>
<td valign="top" align="center">[<bold>44</bold>.44]</td>
<td valign="top" align="center">[<bold>44</bold>.24]</td>
<td valign="top" align="center">[<bold>44</bold>.53]</td>
<td valign="top" align="center">[<bold>44</bold>.21]</td>
<td valign="top" align="center">[<bold>44</bold>.22]</td>
</tr>
<tr>
<td valign="top" align="left">Tone4[21]</td>
<td valign="top" align="center">[<bold>53</bold>.44]</td>
<td valign="top" align="center">[<bold>53</bold>.24]</td>
<td valign="top" align="center">[<bold>53</bold>.53]</td>
<td valign="top" align="center">[<bold>53</bold>.21]</td>
<td valign="top" align="center">[<bold>53</bold>.22]</td>
</tr>
<tr>
<td valign="top" align="left">Tone5[22]</td>
<td valign="top" align="center">[<bold>21</bold>.44]</td>
<td valign="top" align="center">[<bold>21</bold>.24]</td>
<td valign="top" align="center">[<bold>21</bold>.53]</td>
<td valign="top" align="center">[<bold>21</bold>.21]</td>
<td valign="top" align="center">[<bold>21</bold>.22]</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p><italic>The dot &#x0201C;.&#x0201D; stands for syllable break, and numbers in bold indicate sandhi forms</italic>.</p>
</table-wrap-foot>
</table-wrap>
<p>The sandhi form ([33]) of T1 in &#x003C3;(1) was lower than its underlying form ([44]). For T2 in Xiamen dialect, its underlying pitch form ([24]) in &#x003C3;(1) lost its rising contour and became a level tone ([33]). Likewise, the underlying falling form ([53]) of T3 was realized as a level tone ([44]). Therefore, T1, T2, and T3 probably underwent tonal neutralization, since their surface tones in &#x003C3;(1) generally approached the mid-level tone [33] (see <xref ref-type="fig" rid="F7">Figure 7</xref>). In addition, the sandhi forms of T4 and T5 in &#x003C3;(1) were falling tones, [53] and [21], respectively. Then, five second-order orthogonal polynomial models were built to compare pitch forms in &#x003C3;(1) across all tonal contexts (T1-TX, T2-TX, T3-TX, T4-TX, and T5-TX). The fixed factor of each model was <italic>tonal context</italic>, and the random factors were <italic>participant</italic> and <italic>word</italic>. It was presumed that <italic>tonal context</italic> exerted no significant effect on the pitch intercept, pitch slope, or pitch curvature of surface tones in &#x003C3;(1).</p>
<fig id="F7" position="float">
<label>Figure 7</label>
<caption><p>The pitch realizations of &#x003C3;(1) and &#x003C3;(2) in Xiamen dialect with tonal combinations of T1-TX <bold>(A)</bold>, T2-TX <bold>(B)</bold>, T3-TX <bold>(C)</bold>, T4-TX <bold>(D)</bold>, and T5-TX <bold>(E)</bold>.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpsyg-13-945973-g0007.tif"/>
</fig>
<p>For T1-TX in Xiamen dialect, the model comparisons showed that <italic>tonal context</italic> did not affect pitch intercept, slope, or curvature (<italic>ps</italic> &#x0003E; 0.05). Besides, the <italic>tonal context</italic> did not affect the pitch intercept, slope, or curvature of &#x003C3;(1) in T2-TX either (<italic>ps</italic> &#x0003E; 0.05). Thus, the sandhi form [33] of underlying T1 ([44]) or T2 ([24]) was stable irrespective of the following tonal contexts in &#x003C3;(2). Likewise, we did not find a main effect of <italic>tonal context</italic> on the pitch intercept, slope, or curvature in the &#x003C3;(1) of T3-TX (<italic>ps</italic> &#x0003E; 0.05), suggesting that the sandhi form ([44]) was not affected by the following lexical tones in &#x003C3;(2).</p>
<p>Moreover, the results of T4-TX and T5-TX also met our expectations: the <italic>tonal context</italic> exerted no impact on the intercept, slope, or curvature of pitch forms in &#x003C3;(1) (<italic>ps</italic> &#x0003E; 0.05). Results indicated that the sandhi form of T4 in &#x003C3;(1) was uniform, free from the influence of following tonal contexts, and so was that of T5.</p>
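<p>The contrast between context-independent and context-dependent sandhi can be illustrated with a short check on the transcribed surface forms. The sketch below is a hypothetical helper, not part of the statistical analysis: the &#x003C3;(1) forms are copied from Table 6 (Xiamen) and, for comparison, from the T2 row of Table 5 (Fuzhou), and the function names are introduced here for illustration.</p>
<preformat>
```python
# sigma(1) surface forms before sigma(2) = T1..T5, copied from Table 6 (Xiamen).
XIAMEN_SIGMA1 = {
    "T1": ("33", "33", "33", "33", "33"),
    "T2": ("33", "33", "33", "33", "33"),
    "T3": ("44", "44", "44", "44", "44"),
    "T4": ("53", "53", "53", "53", "53"),
    "T5": ("21", "21", "21", "21", "21"),
}

# For contrast: sigma(1) surface forms of Fuzhou T2, copied from Table 5.
FUZHOU_T2_SIGMA1 = ("44", "32", "32", "21", "21")


def context_independent(row):
    """True when one surface form is used before every following tone."""
    return len(set(row)) == 1


def shared_surface_forms(table):
    """Map each surface form to the underlying tones that all surface with it,
    keeping only forms shared by more than one context-independent tone."""
    groups = {}
    for tone, row in table.items():
        if context_independent(row):
            groups.setdefault(row[0], []).append(tone)
    return {form: tones for form, tones in groups.items() if len(tones) > 1}
```
</preformat>
<p>On these transcriptions, every Xiamen row is context-independent, whereas the Fuzhou T2 row is not. The only surface form shared by more than one underlying tone in Table 6 is [33], used for both T1 and T2.</p>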
<p>To conclude, results indicated that in Xiamen dialect, the surface tone in &#x003C3;(1) showed a uniform representation for each underlying tone, regardless of the tonal contexts in &#x003C3;(2). In particular, T1, T2, and T3 surfaced as level tones, with lowered pitch heights or without their underlying contours.</p></sec></sec></sec>
<sec sec-type="discussion" id="s4">
<title>Discussion</title>
<sec>
<title>The Limited Role of Duration in the Manifestation of Metrical Structure in Chinese Dialects</title>
<p>Generally speaking, duration can be used as a phonetic cue to syllable weight, since a longer duration might indicate a heavier syllable (Maddieson, <xref ref-type="bibr" rid="B43">1993</xref>; Hubbard, <xref ref-type="bibr" rid="B28">1994</xref>). Especially in weight-sensitive languages, heavy syllables carrying two morae exhibit a longer duration than monomoraic light syllables, and also attract metrical stress more readily (Selkirk, <xref ref-type="bibr" rid="B52">1982</xref>; McCarthy and Prince, <xref ref-type="bibr" rid="B44">1994</xref>). The relation between syllable weight and word stress could be summarized as the &#x0201C;weight to stress principle&#x0201D; (Prince, <xref ref-type="bibr" rid="B48">1990</xref>). The present study revealed the diversity of duration patterns between two types of metrical structure as shown in four Chinese dialects, and the results are discussed as follows.</p>
<p>According to our prediction (H1), the left-dominant structure might exhibit a long-short pattern in duration. Yet our findings do not fully support H1: although the two syllables differed significantly in duration in Changsha dialect, they were of comparable duration in Chengdu dialect. Specifically, in Changsha dialect, the duration in &#x003C3;(2) was significantly shorter than that in &#x003C3;(1) regardless of tonal categories. This duration pattern was supported by the impressionistic description according to Zhong (<xref ref-type="bibr" rid="B72">2003</xref>). The mean duration ratio in Changsha dialect was about 1.53: 1, approaching the ratio (1.7: 1) of the neutral-tone words in Mandarin (Chen and Xu, <xref ref-type="bibr" rid="B12">2006</xref>). It is worth noting that the lexical items in our study were everyday spoken words that did not belong to the category of neutral-tone words, which make up a small proportion of Mandarin words (about 7% according to Li, <xref ref-type="bibr" rid="B39">1981</xref>). However, the disyllabic words in Chengdu dialect exhibited comparable duration between two syllables (with a ratio of 1.03: 1), which pointed to the absence of a phonetic reduction in &#x003C3;(2). Thus, the syllable weight indicated by duration was quite evenly matched in disyllabic words of Chengdu dialect. Given the left-dominant structure in Chengdu dialect according to Qin (<xref ref-type="bibr" rid="B49">2012</xref>), duration may not act as a robust phonetic correlate of metrical prominence. Potentially, the difference in duration pattern between these two dialects could be attributed to dialect differences, since Changsha and Chengdu dialects belong to the Xiang dialect group and the Mandarin supergroup, respectively (see <xref ref-type="fig" rid="F1">Figure 1A</xref>). It was reported that disyllabic words in the Yiyang dialect (Xiang dialect group) also showed a long-short duration pattern (Xia, <xref ref-type="bibr" rid="B61">2018</xref>). More cross-dialectal research is needed to demonstrate the diverse duration patterns of the left-dominant metrical structure.</p>
<p>On the contrary, a short-long duration pattern was consistently found under the right-dominant structure of both Fuzhou and Xiamen dialects. Such a duration pattern might serve as robust phonetic evidence for the right-dominant structure, indicating that Fuzhou and Xiamen dialects were weight-sensitive dialects. This finding is consistent with our prediction (H1) for the right-dominant structure, and is also in line with the &#x0201C;Iambic/Trochaic Law&#x0201D; (Hayes, <xref ref-type="bibr" rid="B25">1995</xref>), suggesting that a short-long duration pattern commonly exists in the iamb. Although Fuzhou and Xiamen dialects shared a similar duration pattern, the duration ratio between the two syllables was not exactly comparable in these two dialects. To be more specific, the duration contrast between &#x003C3;(1) and &#x003C3;(2) in Fuzhou dialect (a ratio of 0.57: 1) was generally greater than that in Xiamen dialect (a ratio of 0.83: 1). When focusing on the duration contrast under different tonal categories, as shown in <xref ref-type="table" rid="T2">Table 2</xref>, we could also observe a greater duration difference between &#x003C3;(1) and &#x003C3;(2) in Fuzhou dialect. A deeper understanding of this phonetic difference in the right-dominant structure requires further investigation.</p>
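<p>As a compact summary of the four reported mean duration ratios, the sketch below labels each &#x003C3;(1):&#x003C3;(2) ratio as long-short, short-long, or even. The ratios are those given in the text. The labelling function and its tolerance band of 0.1 around a ratio of 1 are hypothetical and serve only as an illustration, since the patterns reported here rest on the significance tests, not on a fixed threshold.</p>
<preformat>
```python
# Mean sigma(1):sigma(2) duration ratios as reported in the text.
REPORTED_RATIOS = {
    "Changsha": 1.53,
    "Chengdu": 1.03,
    "Fuzhou": 0.57,
    "Xiamen": 0.83,
}


def duration_pattern(ratio, tolerance=0.1):
    """Label a sigma(1)/sigma(2) duration ratio.

    `tolerance` is a hypothetical band around 1.0 within which the two
    syllables are treated as evenly matched.
    """
    if tolerance >= abs(ratio - 1.0):
        return "even"
    return "long-short" if ratio > 1.0 else "short-long"


def summarize(ratios):
    """Return a dialect-to-pattern mapping for a dictionary of ratios."""
    return {dialect: duration_pattern(value) for dialect, value in ratios.items()}
```
</preformat>
<p>With this illustrative tolerance, Changsha is labelled long-short, Chengdu even, and both Fuzhou and Xiamen short-long, matching the description above.</p>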
<p>In general, duration patterns in three out of four dialects corresponded with the underlying metrical structure. Thus, metrical prominence in disyllabic words of Chinese dialects might not always be manifested as a longer duration. This finding supported our prediction (H1) only for the right-dominant structure. In addition, consistent with the &#x0201C;Iambic/Trochaic Law&#x0201D; (Hayes, <xref ref-type="bibr" rid="B25">1995</xref>), the duration pattern could reliably reflect the &#x0201C;relative relation&#x0201D; (i.e., light or heavy) of syllable weight in the right-dominant structure of Chinese dialects. Future research is needed to investigate whether there is a final-lengthening effect on duration realization by putting the target words at different prosodic positions in the carrier sentence.</p></sec>
<sec>
<title>The Robust Role of Pitch in the Manifestation of Metrical Structure in Chinese Dialects</title>
<p>It has been proposed that pitch realizations in metrically weak and strong positions in Chinese have opposite tendencies. One is that, in the metrically weak unit such as &#x003C3;(w), the lexical tone tends to lose its underlying contour (Chen and Xu, <xref ref-type="bibr" rid="B12">2006</xref>), or further undergoes tonal alternation (Yue-Hashimoto, <xref ref-type="bibr" rid="B67">1987</xref>; Chen, <xref ref-type="bibr" rid="B11">2000</xref>); the other is that the syllable with more prosodic strength [&#x003C3;(s)] would fully exhibit the underlying tonal representation and keep its original tone features (Kochanski et al., <xref ref-type="bibr" rid="B34">2003</xref>; Zeng and Niu, <xref ref-type="bibr" rid="B68">2006</xref>). The current study investigated the pitch realizations of four Chinese dialects by using second-order orthogonal polynomial models to compare the pitch intercept, slope, and curvature.</p>
<p><xref ref-type="table" rid="T7">Table 7</xref> shows tonal alternations in the metrically weak position across the four dialects. Specifically, there were generally three trends of tonal change according to the underlying pitch contour: (1) The underlying rising tone was generally realized as a level tone on the surface (i.e., [44]/[33]/[22]). Moreover, the pitch height of the surface level tone was below the peak pitch of the underlying form, except for Fuzhou dialect (no underlying rising tones in its tone inventory); (2) Underlying level tones were prone to change into falling tones with similar pitch height. Note that there are no underlying level tones in the tone inventories of Chengdu and Changsha dialects; (3) The underlying falling tones tended to be realized as level tones with similar pitch heights (i.e., [33]/[44]), or were sometimes realized with the original falling contour but different pitch heights. In summary, a notably consistent tonal pattern was that underlying contour tones (falling or rising) were generally realized as level tones on the surface. In other words, the level tone was the more common surface representation in the metrically weak position.</p>
<table-wrap position="float" id="T7">
<label>Table 7</label>
<caption><p>The tonal alternations in the metrically weak position across the four dialects.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th valign="top" align="left"><bold>Underlying contour</bold></th>
<th valign="top" align="center"><bold>Changsha</bold></th>
<th valign="top" align="center"><bold>Chengdu</bold></th>
<th valign="top" align="center"><bold>Fuzhou</bold></th>
<th valign="top" align="center"><bold>Xiamen</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Rising</td>
<td valign="top" align="center">[45] &#x02192; [44];</td>
<td/>
<td/>
<td/>
</tr>
<tr>
<td/>
<td valign="top" align="center">[34] &#x02192; [33]</td>
<td valign="top" align="center">[35] &#x02192; [33]</td>
<td/>
<td valign="top" align="center">[24] &#x02192; [33]</td>
</tr>
<tr>
<td/>
<td valign="top" align="center">[14] &#x02192; [44];</td>
<td valign="top" align="center">[23] &#x02192; [22]</td>
<td/>
<td/>
</tr>
<tr>
<td/>
<td valign="top" align="center">[13] &#x02192; [33]</td>
<td/>
<td/>
<td/>
</tr>
<tr>
<td valign="top" align="left">Level</td>
<td/>
<td/>
<td valign="top" align="center">[44] &#x02192; [52]</td>
<td valign="top" align="center">[44] &#x02192; [33]</td>
</tr>
<tr>
<td/>
<td/>
<td/>
<td/>
<td valign="top" align="center">[22] &#x02192; [21]</td>
</tr>
<tr>
<td valign="top" align="left">Falling</td>
<td valign="top" align="center">[42] &#x02192; [44]</td>
<td valign="top" align="center">[31] &#x02192; [33]</td>
<td valign="top" align="center">[51] &#x02192; [21], [32] or [44]</td>
<td valign="top" align="center">[53] &#x02192; [44]</td>
</tr>
<tr>
<td/>
<td/>
<td/>
<td valign="top" align="center">[32] &#x02192; [24] or [44]</td>
<td valign="top" align="center">[21] &#x02192; [53]</td>
</tr>
<tr>
<td/>
<td/>
<td/>
<td valign="top" align="center">[21] &#x02192; [52] or [44]</td>
<td/>
</tr>
</tbody>
</table>
</table-wrap>
<p>Moreover, it should be noted that these surface tones were sometimes context-dependent in the four dialects. In other words, the tone sandhi in a metrically weak position might be conditioned by the preceding/following tonal context (of the metrically strong position). For instance, the pitch realizations of T2-T2 in Chengdu dialect and most tonal alternations in Fuzhou dialect were contextually conditioned. Specifically, in Chengdu dialect, the surface representation of T2 in &#x003C3;(2) was realized as [33] only when preceded by another T2 (see <xref ref-type="fig" rid="F5">Figure 5</xref>). Likewise, the surface pitch form of T2 in &#x003C3;(1) was decided by the following tonal context in Fuzhou dialect (see <xref ref-type="fig" rid="F6">Figure 6</xref>). These tonal changes are overall comparable to the neutral tones in Mandarin, where tone deletion occurs, leaving a vacant tone-bearing unit in the target position which awaits a proper pitch realization from the preceding tonal context (Wang, <xref ref-type="bibr" rid="B58">2008</xref>). Likewise, the metrically weak unit might contain a single mora. Additionally, in Changsha and Xiamen dialects, the surface form in the metrically weak position of each underlying tone was uniform, regardless of tonal contexts (see <xref ref-type="fig" rid="F4">Figures 4</xref>, <xref ref-type="fig" rid="F7">7</xref>). Besides, the underlying T1 and T4 in Chengdu dialect were also free from the contextual effect, realized as [33] and [22], respectively. The contextual dependency in the four dialects could be summarized as (3).</p>
<p><inline-graphic xlink:href="fpsyg-13-945973-i0003.tif"/></p>
<p>On the contrary, the metrically strong units generally maintained their underlying pitch forms. This indicates they might have two morae that could bear contour tones. It should be noted that there were two exceptions, namely T2/T3 in Chengdu dialect and T3 in Changsha dialect. In Chengdu dialect, the tonal alternations in &#x003C3;(1) showed that T2 ([31]) and T3 ([53]) underwent [31] &#x02192; [33] and [53] &#x02192; [45], respectively. Many other factors might influence the surface form of lexical tones. For example, it has been generally deemed that tonal merger (or simplification) is the mainstream of tonal development in contemporary Chinese dialects (Pan, <xref ref-type="bibr" rid="B46">1982</xref>). One possibility is that the exceptions in Chengdu dialect were triggered by tonal mergers, and that all surface tones of T3-TX approached T1([35])-TX. This kind of tonal merger could be supported by the case of Fuzhou dialect. In Fuzhou dialect, the surface tones of the two syllables among T1-TX, T4-TX, and T5-TX were identical (see <xref ref-type="table" rid="T5">Table 5</xref>), indicating a completed tonal merger. However, the reasons why the underlying tones changed in &#x003C3;(1) for T2 in Chengdu dialect and T3 in Changsha dialect are still unknown and require further study.</p>
<p>In a nutshell, despite a few exceptions, the four Chinese dialects we investigated consistently exhibited a level pitch contour in the metrically weak position and the underlying pitch form in the strong position. This finding corroborated many case studies across Chinese dialects (Yue-Hashimoto, <xref ref-type="bibr" rid="B67">1987</xref>; Chen, <xref ref-type="bibr" rid="B11">2000</xref>), and the above tonal representations were generally confirmed by the fine-grained analysis (GCA), as we predicted (H3). Furthermore, the surface tones of the metrically weak position under the same metrical structure might be classified into context-independent and context-dependent types. These different types also supported our prediction (H2) that the specific pitch realization might not be identical within a given metrical pattern.</p></sec>
<sec>
<title>The &#x0201C;Metrical Tone Sandhi&#x0201D;: The Interaction Between Duration and Pitch Realizations</title>
<p>To date, although the tone sandhi in the metrically weak unit has been termed the &#x0201C;metrical tone sandhi&#x0201D; (Zeng and Niu, <xref ref-type="bibr" rid="B68">2006</xref>), many puzzles remain. For instance, it is still unclear whether metrically conditioned tonal alternation is related to the duration pattern. We also wonder whether an underlying tone would be realized as different surface forms according to different prosodic units. In the present study, we examined four Chinese dialects by drawing on the phonetic parameters of both duration and pitch, which offered a valuable chance to analyze their interaction. The &#x0201C;metrical tone sandhi&#x0201D; among the four Chinese dialects is discussed below.</p>
<p>First, in Changsha dialect, the metrical structure belongs to the strong-weak pattern, exemplified by the shorter duration and surface level tones in &#x003C3;(2). In terms of duration, the duration ratio in Changsha disyllabic words (1.53: 1) is close to that in Mandarin neutral-toned words (1.70: 1; Chen and Xu, <xref ref-type="bibr" rid="B12">2006</xref>). We could assume that the &#x003C3;(2) in Changsha dialect is monomoraic like the neutral-toned syllable in Mandarin (Duanmu, <xref ref-type="bibr" rid="B18">1993</xref>), while the &#x003C3;(1) is the heavy syllable containing two morae. Based on this pattern [&#x003C3;(&#x003BC;&#x003BC;) &#x003C3;(&#x003BC;)], the surface pitch realization of underlying T1, T2, T4, and T6 could be realized as (4) with T1 as the preceding context. Thus, the bimoraic rime of &#x003C3;(1) in Changsha dialect could bear the underlying contour, while the surface tone of &#x003C3;(2) could be realized as the level tone probably due to the limited capacity of tone bearing.</p>
<p><inline-graphic xlink:href="fpsyg-13-945973-i0004.tif"/></p>
<p>As for the disyllabic words in Chengdu dialect, the duration between two syllables is evenly arranged. The metrical pattern might be a syllabic trochee [&#x003C3;(s) &#x003C3;(w)]. Like Changsha dialect, the pitch contour of &#x003C3;(2) in Chengdu dialect also undergoes a similar phonetic reduction, resulting in a level tone as the surface representation. For instance, the surface tone of T1 and T4 in &#x003C3;(2) could be described as (5) when preceded by T1. Although the duration is not a phonetic correlate of the underlying metrical structure in Chengdu dialect, the strong-weak metrical pattern could be manifested by this tonal alternation in &#x003C3;(2) (Qin, <xref ref-type="bibr" rid="B49">2012</xref>).</p>
<p><inline-graphic xlink:href="fpsyg-13-945973-i0005.tif"/></p>
<p>Then, for Fuzhou dialect, the duration in &#x003C3;(2) is significantly longer than that in &#x003C3;(1), indicating that &#x003C3;(1) and &#x003C3;(2) belong to the light and heavy syllables, respectively [&#x003C3;(&#x003BC;) &#x003C3;(&#x003BC;&#x003BC;)]. Under this circumstance, the underlying T1, T2, T4, and T5 in &#x003C3;(1) lose their original tone features and are assimilated by the following tonal contexts. These tonal alternations could be illustrated as (6) with T1 as the following tonal context. The tonal processes of tone deletion in &#x003C3;(1) and leftward tone spreading from &#x003C3;(2) could be seen in Fuzhou dialect.</p>
<p><inline-graphic xlink:href="fpsyg-13-945973-i0006.tif"/></p>
<p>Similarly, the duration in &#x003C3;(2) is also longer than the &#x003C3;(1) counterpart in Xiamen dialect. Thus, the syllable weight between two syllables in Xiamen dialect is different. Given this pattern of syllable weight [&#x003C3;(&#x003BC;) &#x003C3;(&#x003BC;&#x003BC;)], the tonal neutralization occurs in the &#x003C3;(1) in terms of underlying T1, T2, and T3. When these lexical tones are followed by T1, the tone sandhi could be seen as (7). The tonal neutralization at &#x003C3;(1) could be attributed to the limited tone-bearing capacity in the monomoraic syllable. Based on the duration and pitch realizations, we might confirm the metrically weak-strong structure in Xiamen dialect.</p>
<p><inline-graphic xlink:href="fpsyg-13-945973-i0007.tif"/></p>
<p>From the cross-dialectal perspective, the tonal process in the left-dominant structure includes contour reduction in addition to rightward tone spreading (Zhang, <xref ref-type="bibr" rid="B70">2007</xref>). In the right-dominant structure, both leftward tone spreading and tonal neutralization could occur in the surface tonal representation. Although the above tonal processes are manifested in different manners, the core driving force is the underlying metrical structure in these four dialects. That is, the metrically weak syllable undergoes tone sandhi, while the strong syllable, where the metrical stress lies, is fully realized as the underlying pitch form (Duanmu, <xref ref-type="bibr" rid="B18">1993</xref>; Chen, <xref ref-type="bibr" rid="B11">2000</xref>; Zeng and Niu, <xref ref-type="bibr" rid="B68">2006</xref>). In general, &#x0201C;metrical tone sandhi&#x0201D; prompts us to rethink the interaction between duration and pitch realizations in Chinese dialects. More cross-dialectal research, however, is needed to explore the fundamental effect of metrical structure on the tone sandhi more clearly.</p>
<p>Meanwhile, it is noteworthy that not all the tone sandhi in the four Chinese dialects could be interpreted as &#x0201C;metrical tone sandhi.&#x0201D; For instance, the pitch change of &#x003C3;(1) in Chengdu dialect and that of T4 ([21] &#x02192; [53]) in Xiamen dialect fall outside the scope of &#x0201C;metrical tone sandhi.&#x0201D; To some extent, it might be more appropriate to regard them as dialect-specific tonal alternations.</p></sec></sec>
<sec sec-type="conclusions" id="s5">
<title>Conclusion</title>
<p>The current study presented the diverse phonetic realizations under two metrical structures across four Chinese dialects. Specifically, we examined the duration and pitch realizations of disyllabic prosodic words in Changsha and Chengdu dialects under the left-dominant structure, and in Fuzhou and Xiamen dialects under the right-dominant structure.</p>
<p>The results of cross-dialectal comparisons indicated that the duration patterns in the four Chinese dialects were not always sensitive to different metrical structures, given that the duration contrast in Chengdu dialect was not significant. Therefore, the phonetic correlate of duration alone did not play a universal role in manifesting metrical prominence. Moreover, GCA was adopted to examine the pitch realization in the metrically weak position. The general tendency was that the main surface form in the prosodically weak element was a level tone (sometimes a falling tone). Compared with duration realization, pitch realization might be a more robust indicator of the binary metrical contrast in Chinese dialects. Furthermore, there might be interactions between duration and pitch realizations in Chinese dialects, so the nature of &#x0201C;metrical tone sandhi&#x0201D; could unfold more clearly when pitch realization is combined with the duration pattern.</p></sec>
<sec sec-type="data-availability" id="s6">
<title>Data Availability Statement</title>
<p>The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.</p></sec>
<sec id="s7">
<title>Ethics Statement</title>
<p>The studies involving human participants were reviewed and approved by Hunan University, School of Foreign Languages. The patients/participants provided their written informed consent to participate in this study.</p></sec>
<sec id="s8">
<title>Author Contributions</title>
<p>CG and FC conceived and designed the study, participated in the statistical analysis, interpreted the data, and wrote the first draft of the manuscript. CG collected the data. Both authors contributed to and have approved the final manuscript.</p></sec>
<sec sec-type="funding-information" id="s9">
<title>Funding</title>
<p>This study was supported by the MOE (Ministry of Education of China) Youth Fund Project of Humanities and Social Sciences Research (Grant No. 21YJC740015).</p></sec>
<sec sec-type="COI-statement" id="conf1">
<title>Conflict of Interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p></sec>
<sec sec-type="disclaimer" id="s10">
<title>Publisher&#x00027;s Note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p></sec> </body>
<back><sec sec-type="supplementary-material" id="s11">
<title>Supplementary Material</title>
<p>The Supplementary Material for this article can be found online at: <ext-link ext-link-type="uri" xlink:href="https://www.frontiersin.org/articles/10.3389/fpsyg.2022.945973/full#supplementary-material">https://www.frontiersin.org/articles/10.3389/fpsyg.2022.945973/full#supplementary-material</ext-link></p>
<supplementary-material xlink:href="Data_Sheet_1.docx" id="SM1" mimetype="application/vnd.openxmlformats-officedocument.wordprocessingml.document" xmlns:xlink="http://www.w3.org/1999/xlink"/></sec>
<ref-list>
<title>References</title>
<ref id="B1">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Archibald</surname> <given-names>J.</given-names></name></person-group> (<year>2009</year>). <article-title>Phonological feature re-assembly and the importance of phonetic cues</article-title>. <source>Second Lang. Res.</source> <volume>25</volume>, <fpage>231</fpage>&#x02013;<lpage>233</lpage>. <pub-id pub-id-type="doi">10.1177/0267658308100284</pub-id></citation>
</ref>
<ref id="B2">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Barr</surname> <given-names>D. J.</given-names></name> <name><surname>Levy</surname> <given-names>R.</given-names></name> <name><surname>Scheepers</surname> <given-names>C.</given-names></name> <name><surname>Tily</surname> <given-names>H. J.</given-names></name></person-group> (<year>2013</year>). <article-title>Random effects structure for confirmatory hypothesis testing: keep it maximal</article-title>. <source>J. Mem. Lang.</source> <volume>68</volume>, <fpage>255</fpage>&#x02013;<lpage>278</lpage>. <pub-id pub-id-type="doi">10.1016/j.jml.2012.11.001</pub-id><pub-id pub-id-type="pmid">24403724</pub-id></citation></ref>
<ref id="B3">
<citation citation-type="web"><person-group person-group-type="author"><name><surname>Barto&#x00144;</surname> <given-names>K.</given-names></name></person-group> (<year>2022</year>). <source>MuMIn: Multi-Model Inference (1.46.0). R Package</source>. Available online at: <ext-link ext-link-type="uri" xlink:href="https://CRAN.R-project.org/package=MuMIn">https://CRAN.R-project.org/package=MuMIn</ext-link></citation>
</ref>
<ref id="B4">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bates</surname> <given-names>D.</given-names></name> <name><surname>M&#x000E4;chler</surname> <given-names>M.</given-names></name> <name><surname>Bolker</surname> <given-names>B.</given-names></name> <name><surname>Walker</surname> <given-names>S.</given-names></name></person-group> (<year>2015</year>). <article-title>Fitting linear mixed-effects models using lme4</article-title>. <source>J. Stat. Softw.</source> <volume>67</volume>, <fpage>1</fpage>&#x02013;<lpage>48</lpage>. <pub-id pub-id-type="doi">10.18637/jss.v067.i01</pub-id></citation>
</ref>
<ref id="B5">
<citation citation-type="web"><person-group person-group-type="author"><name><surname>Boersma</surname> <given-names>P.</given-names></name> <name><surname>Weenink</surname> <given-names>D.</given-names></name></person-group> (<year>2021</year>). <source>Praat: Doing Phonetics by Computer [Computer program] (Version 6.1.56)</source>. Available online at: <ext-link ext-link-type="uri" xlink:href="https://www.praat.org">https://www.praat.org</ext-link>.</citation>
</ref>
<ref id="B6">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Chan</surname> <given-names>M. K. M.</given-names></name></person-group> (<year>1985</year>). <source>Fuzhou Phonology: A Non-Linear Analysis of Tone and Stress.</source> <publisher-loc>Washington, DC</publisher-loc>: <publisher-name>University of Washington</publisher-name>.</citation>
</ref>
<ref id="B7">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chao</surname> <given-names>Y. R.</given-names></name></person-group> (<year>1930</year>). <article-title>A system of tone-letters</article-title>. <source>Le Ma&#x000EE;tre Phon&#x000E9;tique</source> <volume>8</volume>, <fpage>24</fpage>&#x02013;<lpage>27</lpage>.</citation>
</ref>
<ref id="B8">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Chao</surname> <given-names>Y. R.</given-names></name></person-group> (<year>1968</year>). <source>A Grammar of Spoken Chinese</source>. <publisher-loc>Berkeley, CA</publisher-loc>: <publisher-name>University of California Press</publisher-name>.</citation>
</ref>
<ref id="B9">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Chen</surname> <given-names>L.</given-names></name> <name><surname>Norman</surname> <given-names>J.</given-names></name></person-group> (<year>1965</year>). <source>An Introduction to the Foochow Dialect</source>. San Francisco State College.</citation>
</ref>
<ref id="B10">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chen</surname> <given-names>M. Y.</given-names></name></person-group> (<year>1987</year>). <article-title>The syntax of Xiamen tone sandhi</article-title>. <source>Phonology Yearbook</source> <volume>4</volume>, <fpage>109</fpage>&#x02013;<lpage>149</lpage>. <pub-id pub-id-type="doi">10.1017/S0952675700000798</pub-id></citation>
</ref>
<ref id="B11">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Chen</surname> <given-names>M. Y.</given-names></name></person-group> (<year>2000</year>). <source>Tone Sandhi: Patterns Across Chinese Dialects</source>. <publisher-loc>Cambridge</publisher-loc>: <publisher-name>Cambridge University Press</publisher-name>.</citation>
</ref>
<ref id="B12">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chen</surname> <given-names>Y.</given-names></name> <name><surname>Xu</surname> <given-names>Y.</given-names></name></person-group> (<year>2006</year>). <article-title>Production of weak elements in speech &#x02013; evidence from F0 patterns of neutral tone in standard Chinese</article-title>. <source>Phonetica</source> <volume>63</volume>, <fpage>47</fpage>&#x02013;<lpage>75</lpage>. <pub-id pub-id-type="doi">10.1159/000091406</pub-id><pub-id pub-id-type="pmid">16514275</pub-id></citation></ref>
<ref id="B13">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Cheng</surname> <given-names>C.-C.</given-names></name></person-group> (<year>1973</year>). <source>A Synchronic Phonology of Mandarin Chinese</source>. <publisher-loc>The Hague</publisher-loc>: <publisher-name>De Gruyter Mouton</publisher-name>.</citation>
</ref>
<ref id="B14">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>de Lacy</surname> <given-names>P.</given-names></name></person-group> (<year>2002</year>). <article-title>The interaction of tone and stress in optimality theory</article-title>. <source>Phonology</source> <volume>19</volume>, <fpage>1</fpage>&#x02013;<lpage>32</lpage>. <pub-id pub-id-type="doi">10.1017/S0952675702004220</pub-id></citation>
</ref>
<ref id="B15">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Donohue</surname> <given-names>C.</given-names></name></person-group> (<year>2013</year>). <source>Fuzhou Tonal Acoustics and Tonology</source>. <publisher-loc>M&#x000FC;nchen</publisher-loc>: <publisher-name>Lincom Europa</publisher-name>.</citation>
</ref>
<ref id="B16">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Downing</surname> <given-names>L. J.</given-names></name></person-group> (<year>1990</year>). <article-title>Local and metrical tone shift in Nguni</article-title>. <source>Stud. Afr. Linguist.</source> <volume>21</volume>, <fpage>261</fpage>&#x02013;<lpage>318</lpage>. <pub-id pub-id-type="doi">10.32473/sal.v21i3.107431</pub-id></citation>
</ref>
<ref id="B17">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Duanmu</surname> <given-names>S.</given-names></name></person-group> (<year>1990</year>). <source>A formal study of syllable, tone, stress and domain in Chinese languages</source> (<publisher-loc>Doctoral dissertation</publisher-loc>). Massachusetts Institute of Technology.</citation>
</ref>
<ref id="B18">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Duanmu</surname> <given-names>S.</given-names></name></person-group> (<year>1993</year>). <article-title>Rime length, stress, and association domains</article-title>. <source>J. East Asian Ling.</source> <volume>2</volume>, <fpage>1</fpage>&#x02013;<lpage>44</lpage>. <pub-id pub-id-type="doi">10.1007/BF01440582</pub-id></citation>
</ref>
<ref id="B19">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Duanmu</surname> <given-names>S.</given-names></name></person-group> (<year>1995</year>). <article-title>Metrical and tonal phonology of compounds in two Chinese dialects</article-title>. <source>Language</source> <volume>71</volume>, <fpage>225</fpage>. <pub-id pub-id-type="doi">10.2307/416163</pub-id></citation>
</ref>
<ref id="B20">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Duanmu</surname> <given-names>S.</given-names></name></person-group> (<year>2007</year>). <source>The Phonology of Standard Chinese, 2nd Edn</source>. <publisher-loc>Oxford</publisher-loc>: <publisher-name>Oxford University Press</publisher-name>.</citation>
</ref>
<ref id="B21">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Feng</surname> <given-names>S.</given-names></name></person-group> (<year>2017</year>). <source>Prosodic Morphology in Mandarin Chinese</source>. <publisher-loc>New York, NY</publisher-loc>: <publisher-name>Routledge</publisher-name>.</citation>
</ref>
<ref id="B22">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Fox</surname> <given-names>J.</given-names></name> <name><surname>Weisberg</surname> <given-names>S.</given-names></name></person-group> (<year>2019</year>). <source>An R Companion to Applied Regression, 3rd Edn</source>. <publisher-loc>Thousand Oaks, CA</publisher-loc>: <publisher-name>Sage Publications</publisher-name>.</citation>
</ref>
<ref id="B23">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gordon</surname> <given-names>M. K.</given-names></name> <name><surname>Applebaum</surname> <given-names>A.</given-names></name> <name><surname>Chacon</surname> <given-names>T.</given-names></name> <name><surname>Martin</surname> <given-names>J.</given-names></name> <name><surname>Rose</surname> <given-names>F.</given-names></name></person-group> (<year>2018</year>). <article-title>A cross-linguistic study of phonetic correlates of metrical structure in under-documented languages</article-title>. <source>J. Acoust. Soc. Am.</source> <volume>144</volume>, <fpage>1937</fpage>&#x02013;<lpage>1937</lpage>. <pub-id pub-id-type="doi">10.1121/1.5068474</pub-id></citation>
</ref>
<ref id="B24">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Guo</surname> <given-names>C.</given-names></name></person-group> (<year>2020</year>). <source>Types of Prosodic Structure and Tone Sandhi of Disyllabic Words in Chinese Dialects</source>. <publisher-loc>Shanghai</publisher-loc>: <publisher-name>Shanghai Normal University</publisher-name>.</citation>
</ref>
<ref id="B25">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Hayes</surname> <given-names>B.</given-names></name></person-group> (<year>1995</year>). <source>Metrical Stress Theory: Principles and Case Studies</source>. <publisher-loc>Chicago, IL</publisher-loc>: <publisher-name>The University of Chicago Press</publisher-name>.</citation>
</ref>
<ref id="B26">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Hoa</surname> <given-names>M.</given-names></name></person-group> (<year>1983</year>). <source>L&#x00027;accentuation en P&#x000E9;kinois</source>. Paris: Editions Langages Crois&#x000E9;s.</citation>
</ref>
<ref id="B27">
<citation citation-type="web"><person-group person-group-type="author"><name><surname>Hsieh</surname> <given-names>F. F.</given-names></name></person-group> (<year>2005</year>). <article-title>&#x0201C;Tonal chain-shifts as anti-neutralization-induced tone sandhi,&#x0201D;</article-title> in <source>University of Pennsylvania Working Papers in Linguistics, Vol. 11</source>, <fpage>99</fpage>&#x02013;<lpage>112</lpage>. Available online at: <ext-link ext-link-type="uri" xlink:href="https://repository.upenn.edu/pwpl/vol11/iss1/9">https://repository.upenn.edu/pwpl/vol11/iss1/9</ext-link></citation>
</ref>
<ref id="B28">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Hubbard</surname> <given-names>K.</given-names></name></person-group> (<year>1994</year>). <source>Duration in Moraic Theory</source>. <publisher-loc>Berkeley, CA</publisher-loc>: <publisher-name>University of California</publisher-name>.</citation>
</ref>
<ref id="B29">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hyman</surname> <given-names>L. M.</given-names></name></person-group> (<year>2006</year>). <article-title>Word-prosodic typology</article-title>. <source>Phonology</source> <volume>23</volume>, <fpage>225</fpage>&#x02013;<lpage>257</lpage>. <pub-id pub-id-type="doi">10.1017/S0952675706000893</pub-id></citation>
</ref>
<ref id="B30">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hyman</surname> <given-names>L. M.</given-names></name></person-group> (<year>2009</year>). <article-title>How (not) to do phonological typology: the case of pitch-accent</article-title>. <source>Lang. Sci.</source> <volume>31</volume>, <fpage>213</fpage>&#x02013;<lpage>238</lpage>. <pub-id pub-id-type="doi">10.1016/j.langsci.2008.12.007</pub-id></citation>
</ref>
<ref id="B31">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Inkelas</surname> <given-names>S.</given-names></name> <name><surname>Zec</surname> <given-names>D.</given-names></name></person-group> (<year>1988</year>). <article-title>Serbo-Croatian pitch accent: the interaction of tone, stress, and intonation</article-title>. <source>Language</source> <volume>64</volume>, <fpage>227</fpage>. <pub-id pub-id-type="doi">10.2307/415433</pub-id></citation>
</ref>
<ref id="B32">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Jenks</surname> <given-names>P.</given-names></name> <name><surname>Rose</surname> <given-names>S.</given-names></name></person-group> (<year>2011</year>). <article-title>High tone in Moro: effects of prosodic categories and morphological domains</article-title>. <source>Natural Lang. Linguist. Theory</source> <volume>29</volume>, <fpage>211</fpage>&#x02013;<lpage>250</lpage>. <pub-id pub-id-type="doi">10.1007/s11049-011-9120-x</pub-id></citation>
</ref>
<ref id="B33">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kennedy</surname> <given-names>G. A.</given-names></name></person-group> (<year>1953</year>). <article-title>Two tone patterns in Tangsic</article-title>. <source>Language</source> <volume>29</volume>, <fpage>367</fpage>&#x02013;<lpage>373</lpage>. <pub-id pub-id-type="doi">10.2307/410033</pub-id></citation>
</ref>
<ref id="B34">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kochanski</surname> <given-names>G.</given-names></name> <name><surname>Shih</surname> <given-names>C.</given-names></name> <name><surname>Jing</surname> <given-names>H.</given-names></name></person-group> (<year>2003</year>). <article-title>Hierarchical structure and word strength prediction of Mandarin prosody</article-title>. <source>Int. J. Speech Technol.</source> <volume>6</volume>, <fpage>33</fpage>&#x02013;<lpage>43</lpage>. <pub-id pub-id-type="doi">10.1023/A:1021095805490</pub-id></citation>
</ref>
<ref id="B35">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Kurpaska</surname> <given-names>M.</given-names></name></person-group> (<year>2010</year>). <source>Chinese Language(s)</source>. <publisher-loc>Berlin</publisher-loc>: <publisher-name>De Gruyter Mouton</publisher-name>.</citation>
</ref>
<ref id="B36">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Lee</surname> <given-names>K. A.</given-names></name></person-group> (<year>1997</year>). <source>Chinese Tone Sandhi and Prosody</source>. <publisher-loc>Urbana</publisher-loc>: <publisher-name>University of Illinois Urbana-Champaign</publisher-name>.</citation>
</ref>
<ref id="B37">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lenth</surname> <given-names>R. V.</given-names></name></person-group> (<year>2016</year>). <article-title>Least-squares means: the R package lsmeans</article-title>. <source>J. Stat. Softw.</source> <volume>69</volume>, <fpage>1</fpage>&#x02013;<lpage>33</lpage>. <pub-id pub-id-type="doi">10.18637/jss.v069.i01</pub-id></citation>
</ref>
<ref id="B38">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Li</surname> <given-names>R.</given-names></name> <name><surname>Xiong</surname> <given-names>Z.</given-names></name> <name><surname>Zhang</surname> <given-names>Z.</given-names></name></person-group> (<year>1987</year>). <source>Language Atlas of China</source>. <publisher-loc>Hong Kong</publisher-loc>: <publisher-name>Longman Group (Far East)</publisher-name>.</citation>
</ref>
<ref id="B39">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Li</surname> <given-names>W. M.</given-names></name></person-group> (<year>1981</year>). <article-title>Study on the neutral tone and stress</article-title>. <source>Zhongguo Yuwen</source> <volume>1</volume>, <fpage>35</fpage>&#x02013;<lpage>40</lpage>.</citation>
</ref>
<ref id="B40">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Liberman</surname> <given-names>M.</given-names></name> <name><surname>Prince</surname> <given-names>A.</given-names></name></person-group> (<year>1977</year>). <article-title>On stress and linguistic rhythm</article-title>. <source>Linguist. Inquiry</source> <volume>8</volume>, <fpage>249</fpage>&#x02013;<lpage>336</lpage>.</citation>
</ref>
<ref id="B41">
<citation citation-type="web"><person-group person-group-type="author"><name><surname>Lin</surname> <given-names>H.-S.</given-names></name></person-group> (<year>2006</year>). <article-title>Directionality in Chengdu tone sandhi</article-title>. <source>Concentr. Stud. Linguist.</source> <volume>32</volume>, <fpage>31</fpage>&#x02013;<lpage>67</lpage>. Available online at: <ext-link ext-link-type="uri" xlink:href="http://www.concentric-linguistics.url.tw/upload/articlesfs161402111113162203.pdf">http://www.concentric-linguistics.url.tw/upload/articlesfs161402111113162203.pdf</ext-link></citation>
</ref>
<ref id="B42">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lin</surname> <given-names>H.-S.</given-names></name></person-group> (<year>2011</year>). <article-title>Changsha tone sandhi</article-title>. <source>J. Chin. Lang. Teach.</source> <volume>8</volume>, <fpage>27</fpage>&#x02013;<lpage>64</lpage>. <pub-id pub-id-type="doi">10.6393/JCLT.201108.0027</pub-id></citation>
</ref>
<ref id="B43">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Maddieson</surname> <given-names>I.</given-names></name></person-group> (<year>1993</year>). <article-title>Splitting the mora</article-title>. <source>UCLA Working Pap. Phonet.</source> <volume>83</volume>, <fpage>9</fpage>&#x02013;<lpage>18</lpage>.</citation>
</ref>
<ref id="B44">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>McCarthy</surname> <given-names>J. J.</given-names></name> <name><surname>Prince</surname> <given-names>A.</given-names></name></person-group> (<year>1994</year>). <article-title>&#x0201C;Prosodic morphology,&#x0201D;</article-title> in <source>A Handbook of Phonological Theory, Vol</source>. <italic>15</italic>. (Amherst).</citation>
</ref>
<ref id="B45">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Mirman</surname> <given-names>D.</given-names></name></person-group> (<year>2014</year>). <source>Growth Curve Analysis and Visualization Using R</source>. <publisher-loc>Boca Raton, FL</publisher-loc>: <publisher-name>CRC Press</publisher-name>.</citation>
</ref>
<ref id="B46">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Pan</surname> <given-names>W.</given-names></name></person-group> (<year>1982</year>). <article-title>A note on tone development: several problems in the development of Chinese tones</article-title>. <source>J. Chin. Linguist.</source> <volume>10</volume>, <fpage>359</fpage>&#x02013;<lpage>385</lpage>.</citation>
</ref>
<ref id="B47">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Pearce</surname> <given-names>M.</given-names></name></person-group> (<year>2006</year>). <article-title>The interaction between metrical structure and tone in Kera</article-title>. <source>Phonology</source> <volume>23</volume>, <fpage>259</fpage>&#x02013;<lpage>286</lpage>. <pub-id pub-id-type="doi">10.1017/S095267570600090X</pub-id></citation>
</ref>
<ref id="B48">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Prince</surname> <given-names>A.</given-names></name></person-group> (<year>1990</year>). <article-title>&#x0201C;Quantitative consequences of rhythmic organization,&#x0201D;</article-title> in <source>Papers from the 26th Annual Regional Meeting of the Chicago Linguistic Society : The General Session (CLS 26), Vol. 2</source>, eds M. Ziolkowski, M. Noske, and K. Deaton (<publisher-loc>Chicago, IL</publisher-loc>: <publisher-name>Chicago Linguistic Society</publisher-name>), <fpage>355</fpage>&#x02013;<lpage>398</lpage>.</citation>
</ref>
<ref id="B49">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Qin</surname> <given-names>Z.</given-names></name></person-group> (<year>2012</year>). <source>Prosodic Constituents in the Prosodic Structure of Chengdu Dialect</source>. <publisher-loc>Shanghai</publisher-loc>: <publisher-name>Tongji University</publisher-name>.</citation>
</ref>
<ref id="B50">
<citation citation-type="web"><person-group person-group-type="author"><collab>R Core Team</collab></person-group> (<year>2020</year>). <source>R: A Language and Environment for Statistical Computing (Version 3.6.3)</source>. <publisher-loc>Vienna</publisher-loc>: <publisher-name>R Foundation for Statistical Computing</publisher-name>. Available online at: <ext-link ext-link-type="uri" xlink:href="https://www.R-project.org/">https://www.R-project.org/</ext-link></citation>
</ref>
<ref id="B51">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rose</surname> <given-names>P.</given-names></name></person-group> (<year>1987</year>). <article-title>Considerations in the normalisation of the fundamental frequency of linguistic tone</article-title>. <source>Speech Commun.</source> <volume>6</volume>, <fpage>343</fpage>&#x02013;<lpage>352</lpage>. <pub-id pub-id-type="doi">10.1016/0167-6393(87)90009-4</pub-id></citation>
</ref>
<ref id="B52">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Selkirk</surname> <given-names>E. O.</given-names></name></person-group> (<year>1982</year>). <source>The Syntax of Words (Linguistic Inquiry Monographs), Vol. 7</source>. The MIT Press.</citation>
</ref>
<ref id="B53">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Shi</surname> <given-names>M.</given-names></name> <name><surname>Chen</surname> <given-names>Y.</given-names></name> <name><surname>Mous</surname> <given-names>M.</given-names></name></person-group> (<year>2020</year>). <article-title>Tonal split and laryngeal contrast of onset consonant in Lili Wu Chinese</article-title>. <source>J. Acoust. Soc. Am.</source> <volume>147</volume>, <fpage>2901</fpage>&#x02013;<lpage>2916</lpage>. <pub-id pub-id-type="doi">10.1121/10.0001000</pub-id><pub-id pub-id-type="pmid">32359279</pub-id></citation></ref>
<ref id="B54">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Styler</surname> <given-names>W.</given-names></name></person-group> (<year>2011</year>). <source>Using Praat for Linguistic Research</source>. <publisher-loc>Boulder, CO</publisher-loc>: <publisher-name>LSA Linguistic Institute&#x00027;s Praat Workshop</publisher-name>.</citation>
</ref>
<ref id="B55">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Sui</surname> <given-names>Y.</given-names></name></person-group> (<year>2016</year>). <article-title>&#x0201C;The interaction of metrical structure and tone in standard Chinese,&#x0201D;</article-title> in <source>Dimensions of Phonological Stress</source>, eds J. Heinz, R. Goedemans, and H. van der Hulst (<publisher-loc>Cambridge</publisher-loc>: <publisher-name>Cambridge University Press</publisher-name>), <fpage>101</fpage>&#x02013;<lpage>122</lpage>.</citation>
</ref>
<ref id="B56">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Sun</surname> <given-names>X.</given-names></name></person-group> (<year>2002</year>). <article-title>&#x0201C;Pitch determination and voice quality analysis using Subharmonic-to-Harmonic Ratio,&#x0201D;</article-title> in <source>IEEE International Conference on Acoustics Speech and Signal Processing</source> (<publisher-loc>Orlando, FL</publisher-loc>: <publisher-name>IEEE</publisher-name>), I-333&#x02013;I-336.</citation>
</ref>
<ref id="B57">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Turk</surname> <given-names>A.</given-names></name> <name><surname>Nakai</surname> <given-names>S.</given-names></name> <name><surname>Sugahara</surname> <given-names>M.</given-names></name></person-group> (<year>2006</year>). <article-title>&#x0201C;Acoustic segment durations in prosodic research: a practical guide,&#x0201D;</article-title> in <source>Methods in Empirical Prosody Research</source>, eds S. Sudhoff, D. Lenertova, R. Meyer, S. Pappert, P. Augurzky, I. Mleinek, N. Richter, and J. Schlie&#x000DF;er (Berlin: De Gruyter), <fpage>1</fpage>&#x02013;<lpage>28</lpage>.</citation>
</ref>
<ref id="B58">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Wang</surname> <given-names>H.</given-names></name></person-group> (<year>2008</year>). <source>Nonlinear Phonology of Chinese: The Phonological Structure and Monosyllabic Sounds of Chinese.</source> <publisher-loc>Beijing</publisher-loc>: <publisher-name>Peking University Press</publisher-name>.</citation>
</ref>
<ref id="B59">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Wright</surname> <given-names>M. S.</given-names></name></person-group> (<year>1983</year>). <source>A Metrical Approach to Tone Sandhi in Chinese Dialects</source>. <publisher-loc>Amherst, MA</publisher-loc>: <publisher-name>University of Massachusetts Amherst</publisher-name>.</citation>
</ref>
<ref id="B60">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wu</surname> <given-names>F.</given-names></name> <name><surname>Kenstowicz</surname> <given-names>M.</given-names></name></person-group> (<year>2015</year>). <article-title>Duration reflexes of syllable structure in Mandarin</article-title>. <source>Lingua</source> <volume>164</volume>, <fpage>87</fpage>&#x02013;<lpage>99</lpage>. <pub-id pub-id-type="doi">10.1016/j.lingua.2015.06.010</pub-id></citation>
</ref>
<ref id="B61">
<citation citation-type="web"><person-group person-group-type="author"><name><surname>Xia</surname> <given-names>L.</given-names></name></person-group> (<year>2018</year>). <article-title>The weak and strong structure and tone sandhi of Yiyang dialect in Xiang dialect</article-title>. <source>Fangyan</source> <volume>40</volume>, <fpage>48</fpage>&#x02013;<lpage>57</lpage>. Available online at: <ext-link ext-link-type="uri" xlink:href="http://www.cqvip.com/qk/81953x/201801/674672474.html">http://www.cqvip.com/qk/81953x/201801/674672474.html</ext-link></citation>
</ref>
<ref id="B62">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Xu</surname> <given-names>Y.</given-names></name></person-group> (<year>2009</year>). <article-title>Timing and coordination in tone and intonation&#x02014;An articulatory-functional perspective</article-title>. <source>Lingua</source> <volume>119</volume>, <fpage>906</fpage>&#x02013;<lpage>927</lpage>. <pub-id pub-id-type="doi">10.1016/j.lingua.2007.09.015</pub-id></citation>
</ref>
<ref id="B63">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Xu</surname> <given-names>Y.</given-names></name> <name><surname>Wang</surname> <given-names>M.</given-names></name></person-group> (<year>2009</year>). <article-title>Organizing syllables into groups-Evidence from F0 and duration patterns in Mandarin</article-title>. <source>J. Phon.</source> <volume>37</volume>, <fpage>502</fpage>&#x02013;<lpage>520</lpage>. <pub-id pub-id-type="doi">10.1016/j.wocn.2009.08.003</pub-id><pub-id pub-id-type="pmid">23482405</pub-id></citation></ref>
<ref id="B64">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yao</surname> <given-names>Y.</given-names></name> <name><surname>Chang</surname> <given-names>C. B.</given-names></name></person-group> (<year>2016</year>). <article-title>On the cognitive basis of contact-induced sound change: vowel merger reversal in Shanghainese</article-title>. <source>Language</source> <volume>92</volume>, <fpage>433</fpage>&#x02013;<lpage>467</lpage>. <pub-id pub-id-type="doi">10.1353/lan.2016.0031</pub-id></citation>
</ref>
<ref id="B65">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Yip</surname> <given-names>M.</given-names></name></person-group> (<year>2002</year>). <source>Tone</source>. <publisher-loc>Cambridge</publisher-loc>: <publisher-name>Cambridge University Press</publisher-name>.</citation>
</ref>
<ref id="B66">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yue-Hashimoto</surname> <given-names>A. O.</given-names></name></person-group> (<year>1986</year>). <article-title>Tonal flip-flop in Chinese dialects</article-title>. <source>J. Chin. Linguist.</source> <volume>14</volume>, <fpage>161</fpage>&#x02013;<lpage>183</lpage>.</citation>
</ref>
<ref id="B67">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Yue-Hashimoto</surname> <given-names>A. O.</given-names></name></person-group> (<year>1987</year>). <article-title>&#x0201C;Tone sandhi across Chinese dialects,&#x0201D;</article-title> in <source>Wang Li Memorial Volumes (English Volume)</source>, ed Chinese Language Society of Hong Kong (<publisher-loc>Hong Kong</publisher-loc>: <publisher-name>Joint Publishing Company</publisher-name>), <fpage>445</fpage>&#x02013;<lpage>474</lpage>.</citation>
</ref>
<ref id="B68">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zeng</surname> <given-names>X.</given-names></name> <name><surname>Niu</surname> <given-names>S.</given-names></name></person-group> (<year>2006</year>). <article-title>The disyllabic tone sandhi and its typological explanation of Liujia dialect in Sanjiang county in Guangxi</article-title>. <source>Fangyan</source> <volume>4</volume>, <fpage>290</fpage>&#x02013;<lpage>308</lpage>.</citation>
</ref>
<ref id="B69">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Zhang</surname> <given-names>H.</given-names></name></person-group> (<year>2016</year>). <source>Syntax-Phonology Interface</source>. <publisher-loc>New York, NY</publisher-loc>: <publisher-name>Routledge</publisher-name>.</citation>
</ref>
<ref id="B70">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhang</surname> <given-names>J.</given-names></name></person-group> (<year>2007</year>). <article-title>A directional asymmetry in Chinese tone sandhi systems</article-title>. <source>J. East Asian Ling.</source> <volume>16</volume>, <fpage>259</fpage>&#x02013;<lpage>302</lpage>. <pub-id pub-id-type="doi">10.1007/s10831-007-9016-2</pub-id></citation>
</ref>
<ref id="B71">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhang</surname> <given-names>Z.</given-names></name></person-group> (<year>2021</year>). <article-title>Contribution of laryngeal size to differences between male and female voice production</article-title>. <source>J. Acoust. Soc. Am.</source> <volume>150</volume>, <fpage>4511</fpage>&#x02013;<lpage>4521</lpage>. <pub-id pub-id-type="doi">10.1121/10.0009033</pub-id><pub-id pub-id-type="pmid">34972311</pub-id></citation></ref>
<ref id="B72">
<citation citation-type="web"><person-group person-group-type="author"><name><surname>Zhong</surname> <given-names>Q.</given-names></name></person-group> (<year>2003</year>). <article-title>Light tone in Changsha dialect</article-title>. <source>Fangyan</source> <volume>3</volume>, <fpage>255</fpage>&#x02013;<lpage>264</lpage>. Available online at: <ext-link ext-link-type="uri" xlink:href="http://www.cqvip.com/qk/81953x/200303/9351513.html">http://www.cqvip.com/qk/81953x/200303/9351513.html</ext-link></citation>
</ref>
<ref id="B73">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Zhu</surname> <given-names>X.</given-names></name></person-group> (<year>2010</year>). <source>Phonetics</source>. <publisher-loc>Beijing</publisher-loc>: <publisher-name>Commercial Press</publisher-name>.</citation>
</ref>
<ref id="B74">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Zhu</surname> <given-names>Y.</given-names></name></person-group> (<year>2021</year>). <article-title>&#x0201C;Moraic footing in Suzhou Chinese: evidence from toneless moras,&#x0201D;</article-title> in <source>Proceedings of the Annual Meetings on Phonology</source> (<publisher-loc>Toronto, ON</publisher-loc>).</citation>
</ref>
</ref-list> 
</back>
</article>