<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xml:lang="EN" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Commun.</journal-id>
<journal-title>Frontiers in Communication</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Commun.</abbrev-journal-title>
<issn pub-type="epub">2297-900X</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fcomm.2024.1338844</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Communication</subject>
<subj-group>
<subject>Hypothesis and Theory</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>What makes a multimodal construction? Evidence for a prosodic mode in spoken English</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<name><surname>Lehmann</surname> <given-names>Claudia</given-names></name>
<xref ref-type="corresp" rid="c001"><sup>&#x0002A;</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/1261944/overview"/>
<role content-type="https://credit.niso.org/contributor-roles/writing-original-draft/"/>
<role content-type="https://credit.niso.org/contributor-roles/writing-review-editing/"/>
</contrib>
</contrib-group>
<aff><institution>Chair of Present-Day English, Institute of English and American Studies, University of Potsdam</institution>, <addr-line>Potsdam</addr-line>, <country>Germany</country></aff>
<author-notes>
<fn fn-type="edited-by"><p>Edited by: Maria Grazia Sindoni, University of Messina, Italy</p></fn>
<fn fn-type="edited-by"><p>Reviewed by: Kyle Jasmin, University of London, United Kingdom</p>
<p>Valentin Werner, University of Bamberg, Germany</p></fn>
<corresp id="c001">&#x0002A;Correspondence: Claudia Lehmann <email>claudia.lehmann&#x00040;uni-potsdam.de</email></corresp>
</author-notes>
<pub-date pub-type="epub">
<day>19</day>
<month>02</month>
<year>2024</year>
</pub-date>
<pub-date pub-type="collection">
<year>2024</year>
</pub-date>
<volume>9</volume>
<elocation-id>1338844</elocation-id>
<history>
<date date-type="received">
<day>15</day>
<month>11</month>
<year>2023</year>
</date>
<date date-type="accepted">
<day>01</day>
<month>02</month>
<year>2024</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x000A9; 2024 Lehmann.</copyright-statement>
<copyright-year>2024</copyright-year>
<copyright-holder>Lehmann</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/"><p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p></license>
</permissions>
<abstract>
<p>Traditionally, grammar deals with morphosyntax, and so does Construction Grammar. Prosody, in contrast, is deemed <italic>paralinguistic</italic>. Testifying to the &#x0201C;multimodal turn,&#x0201D; the past decade has witnessed a rise in interest in multimodal Construction Grammar, i.e., an interest in grammatical constructions other than exclusively morphosyntactic ones. Part of the debate in this recent area of interest is the question of what defines a multimodal construction and, more specifically, which role prosody plays. This paper will show that morphosyntax and prosody are two different semiotic modes and, therefore, can combine to form a multimodal construction. To this end, studies showing the independence of prosody for meaning-making will be reviewed and a small-scale experimental study on the ambiguous utterance <italic>Tell me about it</italic> will be reported on.</p></abstract>
<kwd-group>
<kwd>Construction Grammar</kwd>
<kwd>usage-based</kwd>
<kwd>prosody</kwd>
<kwd>semiotic mode</kwd>
<kwd>forced-choice experiment</kwd>
</kwd-group>
<contract-sponsor id="cn001">Deutsche Forschungsgemeinschaft<named-content content-type="fundref-id">10.13039/501100001659</named-content></contract-sponsor>
<counts>
<fig-count count="8"/>
<table-count count="3"/>
<equation-count count="0"/>
<ref-count count="80"/>
<page-count count="15"/>
<word-count count="10010"/>
</counts>
<custom-meta-wrap>
<custom-meta>
<meta-name>section-at-acceptance</meta-name>
<meta-value>Multimodality of Communication</meta-value>
</custom-meta>
</custom-meta-wrap>
</article-meta>
</front>
<body>
<sec id="s1">
<title>1 Introduction</title>
<p>Grammar deals with morphosyntactic patterns. True to this claim, the introductory sentence to the <italic>Oxford Handbook of English Grammar</italic> states that &#x0201C;&#x02018;grammar&#x00027; is used in the sense which encompasses morphology (the principles of word formation) and syntax (the system for combining words into phrases, clauses, and sentences)&#x0201D; (Aarts et al., <xref ref-type="bibr" rid="B1">2019</xref>). Construction Grammar is no exception to this rule: Goldberg defines a grammatical construction as a &#x0201C;learned pairing of form with semantic or discourse function, including morphemes or words, idioms, partially lexically filled and fully general phrasal patterns&#x0201D; (Goldberg, <xref ref-type="bibr" rid="B24">2006</xref>, p. 5). While Construction Grammar foregrounds the role meaning plays in forming grammatical structures, neither intonation nor prosody is explicitly mentioned. This is surprising to the extent that research at the prosody-meaning interface has a long tradition and intonation is acknowledged to fulfill grammatical functions (see e.g., Tench, <xref ref-type="bibr" rid="B69">1996</xref>; Wells, <xref ref-type="bibr" rid="B75">2006</xref>; Levis and Wichmann, <xref ref-type="bibr" rid="B46">2015</xref>; Nolan, <xref ref-type="bibr" rid="B56">2021</xref>). One of the reasons for separating prosody from grammar may have to do with the fact that even within prosody research, its grammatical function used to be downplayed, maintaining that &#x0201C;in practice it is usually context that disambiguates and the role of intonation is minimal&#x0201D; (Levis and Wichmann, <xref ref-type="bibr" rid="B46">2015</xref>, p. 151), even though Wichmann and Blakemore (<xref ref-type="bibr" rid="B76">2006</xref>, p. 
1,537) argued earlier that &#x0201C;[t]he choice of a rise or fall, or the placement of a pitch accent, may be as important a cue to speaker meaning as its phonetic realization.&#x0201D; Rather, the so-called paralinguistic functions of prosody were foregrounded, i.e., its role in indicating emotions and attitudes (F&#x000E9;ry, <xref ref-type="bibr" rid="B18">2017</xref>, p. 7) and, indeed, the grammatical and the attitudinal functions of prosody are often interrelated (Gussenhoven, <xref ref-type="bibr" rid="B26">2004</xref>).</p>
<p>Testifying to the &#x0201C;multimodal turn,&#x0201D; the past decade has witnessed a rise in interest in multimodal Construction Grammar (see Section 2.2 below), i.e., an interest in constructions other than exclusively morphosyntactic ones. Part of the debate in this recent area of interest is the question of what defines a multimodal construction and, more specifically, which role prosody plays. While it seems uncontested that the combination of a morphosyntactic and a kinesic form might form a multimodal construction (see e.g., Ningelgen and Auer, <xref ref-type="bibr" rid="B55">2017</xref>; Ziem, <xref ref-type="bibr" rid="B78">2017</xref>; and other papers in Zima and Bergs, <xref ref-type="bibr" rid="B80">2017</xref>; or in Uhrig, <xref ref-type="bibr" rid="B70">2020</xref>), prosodic peculiarities of constructions are seldom addressed (notable exceptions include Lelandais and Ferr&#x000E9;, <xref ref-type="bibr" rid="B44">2019</xref>; P&#x000F5;ldvere and Paradis, <xref ref-type="bibr" rid="B59">2020</xref>). There is no <italic>a priori</italic> reason to exclude prosody from a constructional analysis, though; the only reason to do so seems to be the traditional misconception that prosody lies outside the scope of grammar and is, therefore, not worth any further consideration.</p>
<p>The aim of the present paper is twofold. First, it will show that prosody and morphosyntax can (and should) be considered independent semiotic modes (in the sense of Bateman et al., <xref ref-type="bibr" rid="B5">2017</xref>), which can independently fulfill grammatical functions. Second, the paper will also show that the two semiotic modes can combine to form a multimodal construction (in the sense of Construction Grammar). The paper will proceed as follows: The main tenets of usage-based Construction Grammar and the notion of multimodal constructions will be introduced. Based on previous research, the paper will then argue that prosody and morphosyntax are independent semiotic modes by showing that they draw on different materialities and forms and that they independently contribute to the discourse semantics. It will then report on evidence that the two different modes may combine to form a multimodal construction using the results of a forced-choice experiment.</p></sec>
<sec id="s2">
<title>2 (Usage-based) Construction Grammar and multimodality</title>
<p>In this section, the core assumptions of (usage-based) Construction Grammar and its relation to multimodality will be introduced. More specifically, the debate surrounding the notion of multimodal construction will be reviewed.</p>
<sec>
<title>2.1 Constructions in Construction Grammar</title>
<p>Construction Grammar is not a unified theory. For an overview of the different strands of Construction Grammar, Hoffmann and Trousdale (<xref ref-type="bibr" rid="B32">2013</xref>) is a useful resource. One of the few things all Construction Grammars have in common is that they consider the construction to be the core unit of language-related knowledge. A unit is considered a construction (C) &#x0201C;iff<sub>def</sub> C is a form-meaning pair &#x0003C;F<sub>i</sub>, S<sub>i</sub>&#x0003E; such that some aspects of F<sub>i</sub> or some aspect of S<sub>i</sub> is not strictly predictable from C&#x00027;s component parts or from other previously established constructions&#x0201D; (Goldberg, <xref ref-type="bibr" rid="B23">1995</xref>, p. 4). <xref ref-type="fig" rid="F1">Figure 1</xref> provides a schematic representation of a construction (taken from Croft and Cruse, <xref ref-type="bibr" rid="B12">2004</xref>, p. 258). An example is the English idiom <italic>Tell me about it</italic>. Its component parts suggest (predict) that information is requested, but experienced language users know that it can also mean &#x0201C;&#x02018;I&#x00027;m well aware of that,&#x00027; &#x02018;I agree;&#x00027; &#x02018;you don&#x00027;t have to tell me&#x00027;&#x0201D; (Tell, <xref ref-type="bibr" rid="B68">2023</xref>). Since its meaning cannot be predicted from its component parts, it is a separate construction and must be learned. From such a perspective, idioms enjoy the same ontological status as words and more schematic constructions.</p>
<fig id="F1" position="float">
<label>Figure 1</label>
<caption><p>Schematic representation of a construction.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fcomm-09-1338844-g0001.tif"/>
</fig>
<p>Usage-based approaches to Construction Grammar also consider predictable units to be constructions as long as they occur frequently enough so that they become entrenched in the language users&#x00027; constructicon, i.e., the mental repository of constructions (e.g., Bybee, <xref ref-type="bibr" rid="B8">2006</xref>, <xref ref-type="bibr" rid="B9">2013</xref>; Goldberg, <xref ref-type="bibr" rid="B24">2006</xref>; Divjak, <xref ref-type="bibr" rid="B16">2019</xref>). One example of this is the word <italic>singer</italic>. Even though its meaning &#x0201C;someone who sings&#x0201D; is perfectly predictable from its component parts, the verb <italic>sing</italic> and the derivational morpheme -<italic>er</italic>, the derivate <italic>singer</italic> is likely stored as a separate construction, because it is one of the 5,000 most frequent words in (written) English (Singer, <xref ref-type="bibr" rid="B65">2023</xref>). Usage-based approaches to Construction Grammar further assume that the cognitive processes involved in language production and comprehension are domain-general and not specific to language. One of these domain-general cognitive processes is cross-modal association, which &#x0201C;allows humans to match up the phonetic (or manual) form experienced with properties of the context and meaning&#x0201D; (Bybee, <xref ref-type="bibr" rid="B9">2013</xref>, p. 50), and which seems to be key in language learning (Imai and Kita, <xref ref-type="bibr" rid="B34">2014</xref>; Dingemanse et al., <xref ref-type="bibr" rid="B15">2015</xref>). An example of cross-modal association is sound symbolism, which is more pervasive in English than traditionally assumed. Sidhu et al. 
(<xref ref-type="bibr" rid="B64">2021</xref>) showed that sounds associated with roundedness (like /m/) more often than not denote round objects in English, while sounds associated with spikiness (like /k/) often denote spiky objects in English, an effect also known as the maluma/takete effect (K&#x000F6;hler, <xref ref-type="bibr" rid="B36">1929</xref>).</p>
</sec>
<sec>
<title>2.2 Multimodal constructions</title>
<p>Constructions can be of any size, &#x0201C;including morphemes or words, idioms, partially lexically filled and fully general phrasal patterns&#x0201D; (Goldberg, <xref ref-type="bibr" rid="B24">2006</xref>, p. 5) as well as argument and information structure constructions (see e.g., relevant chapters in Hoffmann and Trousdale, <xref ref-type="bibr" rid="B32">2013</xref>; Hilpert, <xref ref-type="bibr" rid="B29">2019</xref>; Hoffmann, <xref ref-type="bibr" rid="B31">2022</xref>), but, evidently, the vast majority of constructions considered is of a morphosyntactic nature. This is surprising to the extent that usage-based Construction Grammar emphasizes that language knowledge emerges from the input language users get&#x02014;and arguably this input commonly is multimodal. For instance, spoken language, i.e., the language infants are exposed to first, is inherently multimodal (Vigliocco et al., <xref ref-type="bibr" rid="B73">2014</xref>; Feyaerts et al., <xref ref-type="bibr" rid="B19">2017</xref>; Perniss, <xref ref-type="bibr" rid="B58">2018</xref>), since speakers use gaze, gestures, facial expressions and other resources to convey meaning (see also Section 4.1 on the multimodality of <italic>Tell me about it</italic>). But written language, too, is often produced in multimodal situations (see e.g., Kress, <xref ref-type="bibr" rid="B37">2000</xref>; van Leeuwen, <xref ref-type="bibr" rid="B72">2014</xref>; Hiippala, <xref ref-type="bibr" rid="B28">2017</xref>). Internet memes, for example, use written language and an image to convey their (conventionalized) meaning (Dancygier and Vandelanotte, <xref ref-type="bibr" rid="B13">2017</xref>; B&#x000FC;low et al., <xref ref-type="bibr" rid="B7">2018</xref>). Despite these facts, multimodal constructional analyses are often noticeably absent from research in (usage-based) Construction Grammar.</p>
<p>In parallel to the multimodal turn in linguistics in general (see St&#x000F6;ckl, <xref ref-type="bibr" rid="B67">2020</xref>), the past decade has also witnessed a growing interest in multimodal issues in Construction Grammar. One strand of research concerns itself with speech-embedded non-verbal depictions, i.e., gestures that may fill specific slots of constructions, such as Verb or Noun Phrase (see e.g., Clark, <xref ref-type="bibr" rid="B10">2016</xref>; Ladewig, <xref ref-type="bibr" rid="B39">2020</xref>; Hsu et al., <xref ref-type="bibr" rid="B33">2021</xref>). Although not all of these studies position themselves in a Construction Grammar framework, their examples can be reanalyzed, like in Example (1):</p>
<list list-type="simple">
<list-item><p>(1) [MB was discussing a measure in a Mozart sonata] But then he writes &#x0201C;(<italic>gazing at audience and singing</italic>) <underline>dee-duh dum</underline>.&#x0201D; That is very expressive.</p></list-item>
<list-item><p>(Clark, <xref ref-type="bibr" rid="B10">2016</xref>, p. 325)</p></list-item>
</list>
<p>From a Construction Grammar perspective, the nonverbal depiction (i.e., <italic>dee-duh dum</italic>) fulfills the function of the object noun phrase in the transitive construction. Examples like these thus show that constructional slots need not be filled by morphosyntactic elements but can also be realized by other means.</p>
<p>Another strand of research discusses the possible existence of multimodal constructions. Ziem (<xref ref-type="bibr" rid="B78">2017</xref>) names four conditions under which a construction can be seen as multimodal, of which only the first two will be reviewed here, because they are central to the argumentation put forward in this paper.<xref ref-type="fn" rid="fn0001"><sup>1</sup></xref> The first condition states that</p>
<list list-type="simple">
<list-item><p>(a) A multimodal construction is a conventionalized pairing of a complex form that consists, at least, of a verbal element combined with a kinetic element (Ziem, <xref ref-type="bibr" rid="B78">2017</xref>, p. 5).</p></list-item>
</list>
<p>In other words, a multimodal construction needs some kind of verbal form (with syntactic, morphological and/or phonological properties) and, necessarily, a kinetic element (like a manual gesture, a facial expression, or a particular gaze behavior) to be called such. Based on the representation of a construction (provided in Croft and Cruse, <xref ref-type="bibr" rid="B12">2004</xref>, p. 258), <xref ref-type="fig" rid="F2">Figure 2</xref> depicts the representation of a multimodal construction.</p>
<fig id="F2" position="float">
<label>Figure 2</label>
<caption><p>Schematic representation of a multimodal construction.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fcomm-09-1338844-g0002.tif"/>
</fig>
<p>A prime example of such a multimodal construction is the complex form of a deictic expression like <italic>there</italic> and a deictic gesture (like pointing, a head nod or directed gaze; Levinson, <xref ref-type="bibr" rid="B45">2006</xref>), which, together, serve to identify a location in a given situation. This condition, however, may be, and, as will be argued in this paper, in fact is, incomplete. While a complex form might be a verbal plus a kinetic element, it might also be a verbal element plus a prosodic pattern. To show that the second combination is also a possible manifestation of a multimodal construction, it needs to be shown that morphosyntax and prosody are two different modes, each contributing independently to the meaning of the construction. Alternatively, it might be assumed that prosody is yet another aspect of unimodal constructions, on a par with their phonological properties. The review provided in Section 3 will rule out this alternative viewpoint.</p>
<p>The second condition Ziem (<xref ref-type="bibr" rid="B78">2017</xref>) puts forward runs as follows:</p>
<list list-type="simple">
<list-item><p>(b) Multimodal constructions manifest themselves either as inherently multimodal units or as entrenched cooccurrences of a verbal and a kinetic element (as opposed to constructions solely realized in a multimodal way).</p></list-item>
</list>
<p>This condition indicates that there are two kinds of multimodal constructions, which need to be kept distinct from incidental cooccurrences of e.g., a construction and a gesture (see also Hoffmann, <xref ref-type="bibr" rid="B30">2017</xref>). The first kind of multimodal construction is inherently multimodal, i.e., it is non-predictable in some way. This holds for the combination of a deictic expression and a deictic gesture: The deictic expression remains incomplete in meaning (at least in some of the cases) unless it is used with a deictic gesture. The second kind of multimodal construction follows from the usage-based premise that an expression can be fully predictable and still be a construction when it occurs with sufficient frequency. Schoonjans (<xref ref-type="bibr" rid="B63">2018</xref>), for example, showed that the German particle <italic>einfach</italic> cooccurs with a head shake in 24% of cases in his corpus. Zima (<xref ref-type="bibr" rid="B79">2017</xref>) showed that [all the way from X PREP Y] is produced with a gesture in 80% of cases. And Uhrig (<xref ref-type="bibr" rid="B71">2022</xref>) showed that verbs of throwing are, on average, accompanied by a gesture in 54% of cases (with 66% for <italic>fling</italic> but only 42% for <italic>lob</italic>). Even though these corpus studies attest statistically significant cooccurrences of morphosyntactic and kinetic elements, they could only provide indirect evidence that this statistical significance can be equated with practical significance, i.e., show that these multimodal realizations constitute cognitive units. Therefore, in Section 4, the present paper will provide some evidence that language users actively make use of the prosodic mode to disambiguate (multimodal) constructions by reporting on a forced-choice experiment using the construction <italic>Tell me about it</italic>.</p>
<p>The present paper is not the first to bring together Construction Grammar and prosody. The past decade has also seen a rise in studies researching the prosody-syntax interface from a Construction Grammar perspective, albeit independently, i.e., without referring to multimodal constructions. In the Introduction to their edited volume on Prosody and Construction Grammar, Imo and Lanwer (<xref ref-type="bibr" rid="B35">2020</xref>) summarize possible synergies. One possibility is the existence of prosodic constructions, i.e., assemblies of prosodic features that convey a particular meaning (relatively) independent of the words that are used with it. These prosodic constructions combine with morphosyntactic constructions in an <italic>ad hoc</italic> manner if their functions are compatible. Prosodic constructions have been proposed for French (Marandin, <xref ref-type="bibr" rid="B51">2006</xref>), Persian (Sadat-Tehrani, <xref ref-type="bibr" rid="B62">2010</xref>), Spanish (Gras and Elvira-Garc&#x000ED;a, <xref ref-type="bibr" rid="B25">2021</xref>), and English (Ward, <xref ref-type="bibr" rid="B74">2019</xref>). Another possibility is that prosodic properties, if recurring, can be part of the formal side of the (unimodal) construction. This was proposed for the reactive <italic>what-x</italic> construction (<italic>What mince pies?</italic>), which reacts to something in the preceding turn by another speaker and needs to be prosodically integrated (P&#x000F5;ldvere and Paradis, <xref ref-type="bibr" rid="B59">2020</xref>). And, finally, a third possibility is that prosody and morphosyntax interact in a meaningful way such that a construction would be incomplete without considering both components and neither of the two components constitutes an independent construction. 
This seems to be the case for German appositive structures (e.g., <italic>der Spitzenkoch Tim M&#x000E4;lzer</italic>, English <italic>the top chef Tim M&#x000E4;lzer</italic>), as evidenced in Lanwer (<xref ref-type="bibr" rid="B40">2020</xref>). Even though this is not made explicit, this possible relation between prosody and Construction Grammar fits the definition of a multimodal construction with the only exception that &#x0201C;kinetic&#x0201D; form needs to be replaced by &#x0201C;prosodic&#x0201D; form. <xref ref-type="fig" rid="F3">Figures 3</xref>&#x02013;<xref ref-type="fig" rid="F5">5</xref> summarize all possible configurations.</p>
<fig id="F3" position="float">
<label>Figure 3</label>
<caption><p>Schematic representation of an <italic>ad hoc</italic> relationship between prosodic and (morpho) syntactic constructions.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fcomm-09-1338844-g0003.tif"/>
</fig>
<fig id="F4" position="float">
<label>Figure 4</label>
<caption><p>Schematic representation of a unimodal construction with prosodic properties.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fcomm-09-1338844-g0004.tif"/>
</fig>
<fig id="F5" position="float">
<label>Figure 5</label>
<caption><p>Schematic representation of a multimodal construction made up of a morphosyntactic and a prosodic form.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fcomm-09-1338844-g0005.tif"/>
</fig>
<p>In a nutshell, the present paper aims to show that there are multimodal constructions that consist of a syntactic and a prosodic form, which combine to convey one meaning. To do so, evidence for a prosodic mode (in English) will be reviewed to show that, in principle, prosody and morphosyntax (or rather the phonological properties of morphosyntactic elements) are two different modes. Moreover, a forced-choice experiment will be reported on, which shows that certain prosodic forms are not just used incidentally, but that they are part of language users&#x00027; knowledge.</p></sec>
</sec>
<sec id="s3">
<title>3 Evidence for a prosodic mode in English</title>
<p>There are many definitions of the term mode and some of them equate mode with sensory channel. Such an often pre-theoretical notion of mode might be one of the reasons why prosody has been largely neglected in usage-based, multimodal approaches to Construction Grammar. From such a view, prosody and spoken language belong to the same mode and, thus, need not be part of multimodal analyses. The present paper, however, will use the notion of semiotic mode, which is prevalent in multimodality research. More specifically, the paper will make use of the definition of semiotic mode as proposed by Bateman and colleagues (Bateman, <xref ref-type="bibr" rid="B2">2011</xref>, <xref ref-type="bibr" rid="B3">2022</xref>; Bateman and Wildfeuer, <xref ref-type="bibr" rid="B4">2014</xref>; Bateman et al., <xref ref-type="bibr" rid="B5">2017</xref>).</p>
<p>Bateman defines a semiotic mode as &#x0201C;a three-way layered configuration of semiotic distinctions developed by a community of users in order to achieve some range of communicative or expressive tasks&#x0201D; (Bateman, <xref ref-type="bibr" rid="B3">2022</xref>, p. 68). The first layer of the semiotic mode is the material substrate, i.e., &#x0201C;the &#x02018;stuff&#x00027; which is used when making meaning&#x0201D; (Bateman and Wildfeuer, <xref ref-type="bibr" rid="B4">2014</xref>, p. 181). In other words, semiotic agents manipulate the material to communicate. The second layer is the form side of the mode. The form consists of categories derived from the (noisy) material that are, by convention, used to distinguish meanings. These forms can be simple or complex. And, finally, the third layer of the semiotic mode is that of discourse semantics, i.e., the meaning contribution of the mode in relation to its surroundings. The following subsections will show whether and to what extent (spoken) morphosyntax and prosody differ along these lines.</p>
<sec>
<title>3.1 The material substrate</title>
<p>From an articulatory perspective, the material substrate of spoken English morphosyntax is part of introductory knowledge in linguistics. Speakers use the air stream coming from the lungs and manipulate this air stream with the help of various active and passive articulators to create sounds. One main active articulator is the vocal folds, which produce voiced sounds when vibrating and voiceless sounds when not. The other articulators of English sounds are mainly found in the oral cavity: the lips, the teeth, the tongue, the alveolar ridge, the hard and the soft palate (also called velum) as well as the uvula (depending on the variety of English spoken). Acoustically, this manipulation of the airstream results in different shapes of the sound waves produced. For example, plosive sounds are characterized by a silent period and a sudden release burst, fricatives by a strong turbulence noise and vowels by energy peaks at certain frequencies (also known as first and second formants), to name but a few.</p>
<p>The articulatory mechanisms behind prosodic features in English (to be discussed below) partially overlap with those of the sounds of English. The most central prosodic features&#x02014;pitch, loudness, and duration&#x02014;are manipulated largely with the help of the diaphragm and the vocal folds. The diaphragm is a large muscle below the lungs that controls breathing and thus, the airstream. The greater the airflow, the louder the speech tends to get. The diaphragm is also involved in producing (English) speech sounds, because, when there is no airflow, no sounds can be produced.<xref ref-type="fn" rid="fn0002"><sup>2</sup></xref> Technically, speakers may also &#x0201C;speak from their throats,&#x0201D; i.e., without support from the diaphragm, but even that has respiratory constraints. Still, even though the diaphragm is involved in the production of speech sounds, it does not have an influence on the perception of these sounds as phonemes. An /l/ is an /l/, no matter whether it is loud or quiet. Acoustically, with greater airflow, the pressure the sound signal exerts on the surrounding particles is higher. The other main articulator in prosody is the vocal folds, which are responsible for pitch production. The speed with which they vibrate correlates with the fundamental frequency (f<sub>0</sub>) of the sound produced. The faster they vibrate, the higher the sound is perceived. As outlined above, the vocal folds are also involved in sound production. However, even though the articulator is the same, it does two different things here. For sound production, what matters is whether the vocal folds vibrate or not. For pitch, what matters is the speed with which they vibrate. From an acoustic perspective, a higher frequency of vibration causes the sound waves to oscillate faster, too.</p>
<p>All in all, what can be seen from this necessarily brief overview is that sounds (as the building blocks of spoken morphosyntax) and prosodic features are produced by different parts of the articulatory system. This means that they can be (and are) manipulated independently in the meaning-making process and, thus, also can take on different forms.</p>
</sec>
<sec>
<title>3.2 Form</title>
<p>Regarding the form of spoken English morphosyntax, the paper will only consider phonological categories, since these are most central for the present argument. The phonological features that serve meaning-distinguishing purposes in English are the state of the glottis, the manner of articulation, and the place of articulation for consonants, and the positioning of the tongue and duration for vowels. For English vowels, further meaning-distinguishing features have been proposed, either in addition to duration or substituting for it, namely muscular tension and the position of the lips. In any case, features like these enable language users to distinguish categories such as /b/ and /p/ (state of the glottis), /b/ and /m/ (manner of articulation), /b/ and /d/ (place of articulation), /i:/ and /u:/ (position of the tongue) as well as /i:/ and /&#x0026A;/ (duration, but also position of the tongue).</p>
<p>For prosody, features that serve meaning-distinguishing purposes include, at least, the &#x0201C;big three&#x0201D;: pitch (the perceptual correlate of fundamental frequency), loudness (the perceptual correlate of the pressure of the sound signal), and aspects of timing (such as speaking rate, articulation rate, or pauses). These features enable the language user to perceive categories such as rising and falling intonation (pitch), loud and quiet speech (loudness), as well as fast and slow speaking tempo (timing). These three often work together to form prosodic constructions, i.e., configurations of prosodic forms that convey a particular meaning independent of the words used (see Section 3.3 below for examples). There are further prosodic features, such as voice quality (nasality, creakiness) and articulatory precision, but these seldom serve meaning-distinguishing functions on their own. In sum, there is some overlap regarding the meaning-distinguishing features of spoken morphosyntax and prosody, since (vowel) duration and timing are both time-related features, but other than that, the features can clearly be distinguished from one another. What is more, even though vowel duration and timing seem to correlate, language users are able to distinguish the two nonetheless. Just consider a word like <italic>bit</italic>. Its vowel, /&#x0026A;/, is short in duration, but the meaning of the word does not change if it is pronounced in a slow manner (which is the case, of course, because no two words in English are ever distinguished by vowel duration alone) as long as the contrast with other vowels of a similar quality is maintained.</p>
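<p>Because both pitch and loudness are perceived roughly logarithmically, phoneticians usually report them in semitones and decibels. The following Python sketch illustrates these standard conversions and a toy classification of contour direction; the half-semitone threshold used to call a contour &#x0201C;level&#x0201D; is an arbitrary illustrative assumption, not an empirical value:</p>

```python
import math

def semitones(f, ref):
    """Pitch interval in semitones relative to a reference frequency (Hz)."""
    return 12 * math.log2(f / ref)

def spl_db(pressure, ref=20e-6):
    """Sound pressure level in dB re 20 micropascals."""
    return 20 * math.log10(pressure / ref)

def contour_direction(f0_start, f0_end):
    """Classify an intonation contour as rising, falling, or level."""
    delta = semitones(f0_end, f0_start)
    if delta > 0.5:
        return "rising"
    if delta + 0.5 > 0:  # i.e., delta lies within half a semitone of zero
        return "level"
    return "falling"
```

<p>For example, a doubling of f<sub>0</sub> (say, 220 Hz to 440 Hz) is an interval of exactly 12 semitones, i.e., an octave, regardless of the absolute frequencies involved.</p>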
<p>An interesting exception might be stress placement. There are words in English that differ only in word stress, e.g., <italic>differ</italic> /&#x00027;d&#x0026A;f&#x00259;/ or /&#x00027;d&#x0026A;f&#x00259;r/ and <italic>defer</italic> /d&#x0026A;&#x00027;f&#x00259;:/ or /d&#x0026A;&#x00027;f&#x00259;r/. The acoustic correlates of stress in English include, among others, pitch, loudness, and timing (see e.g., Fry, <xref ref-type="bibr" rid="B21">1955</xref>, <xref ref-type="bibr" rid="B22">1958</xref>; Lieberman, <xref ref-type="bibr" rid="B48">1960</xref>), i.e., the &#x0201C;big three&#x0201D; mentioned above. Examples like <italic>differ</italic> and <italic>defer</italic> blur the lines between meaning-distinguishing features that are relevant for morphosyntax and those that are relevant for prosody. Therefore, one could treat them as counterevidence to the claim that prosody is an independent mode, because a prosodic configuration that language users perceive as word stress serves morphosyntactically relevant functions. Likewise, it could be argued that words like <italic>differ</italic> and <italic>defer</italic> are, in fact, multimodal constructions combining a phonological (e.g., /d&#x0026A;f&#x00259;/) and a prosodic form (e.g., /&#x00027;&#x003C3;&#x003C3;/) for <italic>differ</italic>. It is outside the scope of the present paper to provide evidence for one or the other claim. Still, the argument put forward in the following clearly favors the second option.</p>
</sec>
<sec>
<title>3.3 Discourse meaning</title>
<p>From a Construction Grammar perspective, all morphosyntactic units of interest, i.e., constructions, carry meaning by definition (although this is not uncontroversial; see e.g., Fillmore et al., <xref ref-type="bibr" rid="B20">2012</xref> on constructions without meaning). Therefore, there is no need to discuss their meaning here.</p>
<p>The more interesting question is rather whether prosodic forms, independent of the words that are used with them, carry meaning. There is, in fact, considerable evidence for the existence of prosodic constructions. Prosodic constructions have been identified for Spanish (Elvira-Garc&#x000ED;a, <xref ref-type="bibr" rid="B17">2019</xref>; Gras and Elvira-Garc&#x000ED;a, <xref ref-type="bibr" rid="B25">2021</xref>), German (Neitsch and Niebuhr, <xref ref-type="bibr" rid="B52">2019</xref>; Niebuhr, <xref ref-type="bibr" rid="B54">2019</xref>), French (Marandin, <xref ref-type="bibr" rid="B51">2006</xref>), Persian (Sadat-Tehrani, <xref ref-type="bibr" rid="B62">2010</xref>), and, most notably for the present purposes, English (Ward, <xref ref-type="bibr" rid="B74">2019</xref>). One of the prosodic constructions attested for English, the <italic>consider this</italic> construction, will be reviewed in more detail because it is among the best understood of these constructions. This prosodic construction was first described in Liberman and Sag (<xref ref-type="bibr" rid="B47">1974</xref>) and is attested both experimentally (Kurumada et al., <xref ref-type="bibr" rid="B38">2012</xref>) and with the help of corpora (Hedberg et al., <xref ref-type="bibr" rid="B27">2003</xref>; Ward, <xref ref-type="bibr" rid="B74">2019</xref>). Its formal features are illustrated in <xref ref-type="fig" rid="F6">Figure 6</xref>. While most formal descriptions have focused on the pitch movements only, recent advances show that it formally consists of three parts: the first is a region that is high-pitched, loud, and slow, to be seen on the word <italic>LOOKS</italic> in <xref ref-type="fig" rid="F6">Figure 6</xref>. The second is a region of level pitch, which can be seen on <italic>like a ze-</italic> in <xref ref-type="fig" rid="F6">Figure 6</xref>. 
And, third, another high-pitched region, visible on the last syllable <italic>-bra</italic> in <xref ref-type="fig" rid="F6">Figure 6</xref> (Ward, <xref ref-type="bibr" rid="B74">2019</xref>, p. 5&#x02013;24). Functionally, it marks some kind of contradiction or contrast: a piece of information that is offered to the hearer for further consideration. Thus, the syntactic string <italic>It looks like a zebra</italic> uttered with the prosodic pattern described above implies that, even though the animal in question might resemble a zebra, it is actually some other animal (Kurumada et al., <xref ref-type="bibr" rid="B38">2012</xref>). There is compelling evidence that this form-function pairing is indeed conventionalized in American English: corpus studies suggest that this prosodic form is more often than not used with contradictions (Hedberg et al., <xref ref-type="bibr" rid="B27">2003</xref>; Ward, <xref ref-type="bibr" rid="B74">2019</xref>), and experimental evidence suggests that language users favor a &#x0201C;no zebra&#x0201D; interpretation when presented with an utterance like the one depicted in <xref ref-type="fig" rid="F6">Figure 6</xref> (Kurumada et al., <xref ref-type="bibr" rid="B38">2012</xref>). What is more, Liberman and Sag (<xref ref-type="bibr" rid="B47">1974</xref>) even argue that &#x0201C;without having any idea of the content of his utterance, we know from the melody performed &#x02026; that [the speaker] objects in some way&#x0201D; (422), i.e., that the prosodic form has an independent meaning. This independent contribution of prosody to discourse semantics is probably the most convincing piece of evidence that prosody is an independent semiotic mode.</p>
<fig id="F6" position="float">
<label>Figure 6</label>
<caption><p>Waveform and pitch contour of the &#x0201C;consider this&#x0201D; construction (taken from Kurumada et al., <xref ref-type="bibr" rid="B38">2012</xref>).</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fcomm-09-1338844-g0006.tif"/>
</fig>
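<p>For illustration, the three-region shape just described can be represented as a simple data structure and matched against a toy f<sub>0</sub> track. The region labels, the Hz thresholds, and the matching rule below are schematic assumptions made for exposition, not Ward&#x00027;s formalization:</p>

```python
# Schematic "consider this" contour: high - level - high,
# aligned with "LOOKS / like a ze- / -bra" in the example.
CONSIDER_THIS = ("high", "level", "high")

def classify_region(mean_f0, lo=180.0, hi=240.0):
    """Map a region's mean f0 (Hz) to a coarse pitch level.
    The 180/240 Hz thresholds are invented for this illustration."""
    if mean_f0 > hi:
        return "high"
    if mean_f0 > lo:
        return "level"
    return "low"

def matches_consider_this(region_f0_means):
    """Check a three-region f0 track against the schematic contour."""
    levels = tuple(classify_region(f) for f in region_f0_means)
    return levels == CONSIDER_THIS

# Toy f0 means for "It LOOKS / like a ze- / -bra" (hypothetical values)
zebra_track = [265.0, 210.0, 270.0]
```

<p>A flat track, by contrast, would not match the schematic pattern, just as a flat rendition of <italic>It looks like a zebra</italic> does not convey the &#x0201C;consider this&#x0201D; meaning.</p>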
</sec>
</sec>
<sec id="s4">
<title>4 Entrenching prosodic information: <italic>Tell me about it</italic></title>
<p>Section 3 argued that prosody is best seen as an independent semiotic mode. For the discussion of the relation between prosody and morphosyntactic constructions, this means that prosodic properties cannot be analyzed on a par with other properties of morphosyntactic constructions but need independent consideration. Section 3.3, in particular, has shown that there are prosodic constructions, like the <italic>consider this</italic> construction, that may combine with morphosyntactic constructions in an <italic>ad hoc</italic> manner to form a multimodal construct. In what follows, the paper will present some evidence for a genuinely multimodal construction, i.e., a construction with both entrenched prosodic and morphosyntactic properties. The construction under consideration is called stance-related <italic>Tell me about it</italic> and will be contrasted with another, formally similar construction, i.e., requesting <italic>Tell me about it</italic>.</p>
<sec>
<title>4.1 Requesting and stance-related <italic>Tell me about it</italic></title>
<p>Formally, requesting and stance-related <italic>Tell me about it</italic> (henceforth TMAI) are morphosyntactically similar. While formal variations for the stance-related construction can be found (e.g., <italic>Tell me more</italic> or <italic>Tell me more about it</italic>), these are rare and <italic>Tell me about it</italic> seems to be the preferred variant as this is the only form that is listed in dictionaries (e.g., in the Oxford English Dictionary Online, Tell, <xref ref-type="bibr" rid="B68">2023</xref>). Functionally, the two TMAI constructions fulfill different, non-overlapping functions. Requesting TMAI is used to request information as is illustrated in Example (2).<xref ref-type="fn" rid="fn0003"><sup>3</sup></xref></p>
<list list-type="simple">
<list-item><p>(2) &#x0201C;sci-fi thriller&#x0201D; (simplified)</p></list-item>
<list-item><p>&#x000A0;&#x000A0;A: I know she also has a sci-fi thriller. Arrival.</p></list-item>
<list-item><p>&#x000A0;&#x000A0;B: Uh-huh.</p></list-item>
<list-item><p>&#x000A0;&#x000A0;A: Tell me about it. Is it worth seeing?</p></list-item>
<list-item><p>&#x000A0;&#x000A0;B: Absolutely.</p></list-item>
<list-item><p>&#x000A0;&#x000A0;(2016-09-25_0832_US_KNBC_Access_Hollywood, 29:41-29:48).</p></list-item>
</list>
<p>In Example (2), speaker A introduces a referent, i.e., a science fiction thriller called <italic>Arrival</italic>. After speaker B&#x00027;s brief backchannel, speaker A encourages speaker B to provide more information on this film using TMAI and specifies the preferred continuation to be an evaluation (i.e., <italic>Is it worth seeing?</italic>). Speaker B then provides the requested information. As can be seen in this example, requesting TMAI usually initiates speaker transition. This transition need not occur directly after issuing TMAI, but constitutes what Sacks et al. (<xref ref-type="bibr" rid="B61">1974</xref>) call a transition-relevance place. Moreover, the next turn is expected to be an informing sequence, providing some more information on the referent that was introduced shortly before.</p>
<p>Stance-related TMAI fulfills completely different functions as is illustrated in Example (3).</p>
<list list-type="simple">
<list-item><p>(3) &#x0201C;we&#x00027;re all getting older&#x0201D; (simplified)</p></list-item>
<list-item><p>&#x000A0;&#x000A0;A: We&#x00027;re getting older. We&#x00027;re all getting older. So&#x02026;</p></list-item>
<list-item><p>&#x000A0;&#x000A0;B: ((laughs)) T- Tell me about it.</p></list-item>
<list-item><p>&#x000A0;&#x000A0;A: ((laughs))</p></list-item>
<list-item><p>&#x000A0;&#x000A0;(2021-11-26_0600_US_KNBC_Dateline_NBC, 03:39-03:44).</p></list-item>
</list>
<p>In Example (3), speaker A makes an observation (<italic>we&#x00027;re getting older</italic>), which many people find saddening. This seems to hold for speaker A as well, since he repeats this utterance in slightly modified form (<italic>we&#x00027;re all getting older</italic>). Speaker B reacts to this observation, at first, with laughter and then with stance-related TMAI. This construction expresses an affective stance, i.e., a saddening view on aging. Likewise, it expresses epistemic authority. Speaker B is, apparently, older than speaker A and thus claims to be the more knowledgeable person on this matter. Crucially, stance-related TMAI necessitates neither speaker transition nor an informing sequence. Speaker A reacts with laughter to speaker B uttering TMAI, and the conversation is cut at this point.</p>
<p>It could be argued that both TMAI constructions are ambiguous and are only disambiguated by their context. However, in a corpus study using the multimodal <italic>NewsScape Library of International Television News</italic> (Steen and Turner, <xref ref-type="bibr" rid="B66">2013</xref>), Lehmann (<xref ref-type="bibr" rid="B41">2023</xref>) showed that stance-related TMAI, when compared to requesting TMAI, is produced, more often than not, with raised eyebrows, averted gaze, smiling, some kind of head movement (often nods, shakes, or tilts), and a slower speaking rate. This is illustrated with frame grabs of Example (3), which are provided in <xref ref-type="table" rid="T1">Table 1</xref>.</p>
<table-wrap position="float" id="T1">
<label>Table 1</label>
<caption><p>Frame grabs of an extract of example (3).</p></caption>
<table frame="box" rules="all">
<thead>
<tr style="background-color:#919498;color:#ffffff">
<th valign="top" align="left"><bold>Line</bold></th>
<th valign="top" align="left"><bold>Speaker</bold></th>
<th valign="top" align="left"><bold>Utterance</bold></th>
<th valign="top" align="left"><bold>Frame grab</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">1</td>
<td valign="top" align="left">A</td>
<td valign="top" align="left">So&#x02026;</td>
<td valign="top" align="left" rowspan="1"><inline-graphic xlink:href="fcomm-09-1338844-i0001.tif"/></td>
</tr> <tr>
<td valign="top" align="left">2</td>
<td valign="top" align="left">B</td>
<td valign="top" align="left">t-</td>
<td valign="top" align="left" rowspan="1"><inline-graphic xlink:href="fcomm-09-1338844-i0002.tif"/></td>
</tr> <tr>
<td valign="top" align="left">3</td>
<td/>
<td valign="top" align="left">Tell me about</td>
<td valign="top" align="left" rowspan="1"><inline-graphic xlink:href="fcomm-09-1338844-i0003.tif"/></td>
</tr> <tr>
<td valign="top" align="left">4</td>
<td/>
<td valign="top" align="left">It</td>
<td valign="top" align="left" rowspan="1"><inline-graphic xlink:href="fcomm-09-1338844-i0004.tif"/></td>
</tr></tbody>
</table>
</table-wrap>
<p>As can be seen in <xref ref-type="table" rid="T1">Table 1</xref>, before uttering stance-related TMAI, speaker B looks at his interlocutor, already smiling. At the onset of TMAI, he turns his head to the left (line 2) and avoids eye contact with the recipient. In addition, he raises his eyebrows and continues smiling (see also line 3). Only after finishing TMAI, on the last syllable, does he turn his head and gaze back toward his interview partner. The duration of TMAI in Example (3) is 667 ms, which corresponds to a speaking rate of 7.4 syllables per second. This is very close to the mean speaking rate of stance-related TMAI in face-to-face interactions, which is 7.48 syllables per second, whereas requesting TMAI is faster in these contexts, with a speaking rate of 8.44 syllables per second (see Lehmann, <xref ref-type="bibr" rid="B42">in press</xref>).</p>
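<p>The speaking-rate figures above follow from a simple ratio of syllable count to utterance duration. The sketch below shows this computation together with a naive decision rule based on the two reported corpus means; the midpoint threshold is an illustrative assumption, not part of the cited analyses:</p>

```python
def speaking_rate(n_syllables, duration_s):
    """Speaking rate in syllables per second."""
    return n_syllables / duration_s

# Corpus means reported for face-to-face interaction (Lehmann, in press)
STANCE_MEAN = 7.48   # stance-related TMAI, syllables/second
REQUEST_MEAN = 8.44  # requesting TMAI, syllables/second

def nearer_to_stance(rate):
    """Naive classifier: does a rate fall on the stance-related side of
    the midpoint between the two means? (Illustrative rule only.)"""
    midpoint = (STANCE_MEAN + REQUEST_MEAN) / 2  # 7.96 syllables/second
    return midpoint >= rate
```

<p>On this rule, a five-syllable <italic>Tell me about it</italic> lasting around two-thirds of a second falls on the stance-related side of the midpoint.</p>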
<p>All of these visual as well as prosodic properties of stance-related TMAI were shown to be statistically significant (Lehmann, <xref ref-type="bibr" rid="B41">2023</xref>), but as was argued above, some Construction Grammarians claim that statistical significance need not be equated with practical significance. Therefore, both visual and prosodic properties of TMAI were put to the test in a forced choice experiment to provide evidence that language users indeed draw on these properties when interpreting an instance of TMAI.</p>
</sec>
<sec>
<title>4.2 Putting the multimodal properties of <italic>Tell me about it</italic> to the test</title>
<sec>
<title>4.2.1 Method</title>
<sec>
<title>4.2.1.1 Participants</title>
<p>The participants in this experiment were 25 adult native speakers of American English, who were recruited via Prolific Academic (Palan and Schitter, <xref ref-type="bibr" rid="B57">2018</xref>). They were paid &#x000A3;4.50 for their participation. In addition, 18 adult advanced learners of English participated. These were students of the study program <italic>English-speaking Cultures</italic> at the University of Bremen, Germany. To be admitted to this study program, students need to have a command of English at level B2 (&#x0201C;independent user&#x0201D;) of the Common European Framework of Reference for Languages (Council for Cultural Co-operation, Education Committee, and Modern Languages Division, <xref ref-type="bibr" rid="B11">2001</xref>), but many of them self-reported knowing English at C1 level (&#x0201C;proficient user&#x0201D;). They participated for course credit.</p></sec>
<sec>
<title>4.2.1.2 Procedure</title>
<p>The participants were requested to complete an online forced choice experiment, which had been designed with SoSci Survey (Leiner, <xref ref-type="bibr" rid="B43">2021</xref>). In the instructions to this experiment, the participants were introduced to the two uses of TMAI, named <italic>requesting information</italic> and <italic>ironic rejoinder</italic>. This was done to make sure that the non-native speakers understood the task (in case they did not know TMAI could also be used in a stance-related way) and to introduce the two response options in the experiment. The label <italic>ironic rejoinder</italic> was preferred over the label <italic>stance-related</italic> in the experiment because the <italic>Oxford English Dictionary</italic> defines stance-related TMAI this way (Tell, <xref ref-type="bibr" rid="B68">2023</xref>). The participants were told that they would see and/or hear a speaker uttering TMAI and that their task was to guess whether this utterance was requesting information or an ironic rejoinder.</p></sec>
<sec>
<title>4.2.1.3 Stimuli</title>
<p>The experiment consisted of 69 stimuli in total. All of these were selected observations from the corpus study in Lehmann (<xref ref-type="bibr" rid="B41">2023</xref>). These observations were presented in four different conditions. In the first condition, called &#x0201C;context condition,&#x0201D; the participants were presented with TMAI together with what was considered sufficient sequential context to disambiguate it. This served as the reference condition. In the second condition, called &#x0201C;multimodal condition,&#x0201D; the participants could both hear and see a speaker uttering TMAI, but without further sequential context. In the third condition, called &#x0201C;visual condition,&#x0201D; the participants saw a speaker uttering TMAI but could not hear the speaker. Since these video snippets were extremely short (less than a second) and some online video players have a time lag, the videos were played in slow motion. The participants were informed about this. Furthermore, to facilitate speaker identification in case more than one speaker was visible, the videos were edited so that only the speaker of TMAI was visible. Finally, in the fourth condition, called &#x0201C;acoustic condition,&#x0201D; the participants were provided with only an audio recording of a speaker uttering TMAI. Stimuli rotated within, but not between, these conditions.</p>
<p>The stimuli were further selected regarding their anticipated interpretation. The statistical model that was fitted for the corpus data in Lehmann (<xref ref-type="bibr" rid="B41">2023</xref>) makes clear predictions about how participants should interpret these stimuli if the results are of practical significance. Thus, stimuli were selected according to the visual and/or prosodic features that the speakers used during the utterance. That is, some stimuli were selected as prototypically requesting or stance-related uses of TMAI when they displayed the properties that the statistical model predicted. Conversely, some stimuli were selected as ambiguous when they displayed conflicting properties, e.g., when the speaker raised their eyebrows (a property of stance-related TMAI) but continued looking at the recipient (a property of requesting TMAI).</p>
<p><xref ref-type="table" rid="T2">Table 2</xref> gives an overview on the stimuli used in the experiment.</p>
<table-wrap position="float" id="T2">
<label>Table 2</label>
<caption><p>Overview on the stimuli used in the experiment.</p></caption>
<table frame="box" rules="all">
<thead>
<tr style="background-color:#919498;color:#ffffff">
<th valign="top" align="left"><bold>Condition</bold></th>
<th valign="top" align="left"><bold>Description</bold></th>
<th valign="top" align="left"><bold>Anticipated interpretation</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left" rowspan="2">Context</td>
<td valign="top" align="left" rowspan="2">TMAI embedded in sequential context</td>
<td valign="top" align="left">Requesting (<italic>N</italic> = 5)</td>
</tr>
 <tr>
<td valign="top" align="left">Stance-expressing (<italic>N</italic> = 5)</td>
</tr> <tr>
<td valign="top" align="left" rowspan="3">Multimodal</td>
<td valign="top" align="left" rowspan="3">Stand-alone TMAI Visual and acoustic information provided</td>
<td valign="top" align="left">Requesting (<italic>N</italic> = 5)</td>
</tr>
 <tr>
<td valign="top" align="left">Stance-expressing (<italic>N</italic> = 4)</td>
</tr>
 <tr>
<td valign="top" align="left">Ambiguous (<italic>N</italic> = 9)</td>
</tr> <tr>
<td valign="top" align="left" rowspan="3">Visual</td>
<td valign="top" align="left" rowspan="3">Stand-alone TMAI No acoustic information Pace slowed down</td>
<td valign="top" align="left">Requesting (<italic>N</italic> = 5)</td>
</tr>
 <tr>
<td valign="top" align="left">Stance-expressing (<italic>N</italic> = 5)</td>
</tr>
 <tr>
<td valign="top" align="left">Ambiguous (<italic>N</italic> = 11)</td>
</tr> <tr>
<td valign="top" align="left" rowspan="3">Acoustic</td>
<td valign="top" align="left" rowspan="3">Stand-alone TMAI No visual information</td>
<td valign="top" align="left">Requesting (<italic>N</italic> = 5)</td>
</tr>
 <tr>
<td valign="top" align="left">Stance-expressing (<italic>N</italic> = 4)</td>
</tr>
 <tr>
<td valign="top" align="left">Ambiguous (<italic>N</italic> = 11)</td>
</tr></tbody>
</table>
</table-wrap>
</sec>
</sec>
<sec>
<title>4.2.2 Statistical analysis</title>
<p>The results of the forced choice experiment were analyzed with R (R Core Team, <xref ref-type="bibr" rid="B60">2022</xref>). With the help of the <italic>glmer</italic> function of the lme4 package (Bates et al., <xref ref-type="bibr" rid="B6">2015</xref>), a generalized linear mixed-effects model was fitted. The correctness of the response (i.e., whether the response was in line with the actual construction) was treated as the dependent variable. Initially, participant, language proficiency, stimulus, and construction were entered as random intercepts, while condition and anticipated interpretation were entered as fixed effects. This model failed to converge due to its complexity. An inspection of the initial model with the <italic>summ</italic> function of the jtools package (Long, <xref ref-type="bibr" rid="B49">2022</xref>) showed that language proficiency and participant had negligible effects, and they were thus removed from the model. No convergence problems occurred thereafter. The <italic>summ</italic> function was used to summarize the fitted model, including the computation of confidence intervals, and the ggplot2 package (Wickham et al., <xref ref-type="bibr" rid="B77">2023</xref>) as well as the sjPlot package (L&#x000FC;decke, <xref ref-type="bibr" rid="B50">2023</xref>) were used to visualize the fitted model.</p></sec>
<sec>
<title>4.2.3 Results</title>
<p><xref ref-type="fig" rid="F7">Figure 7</xref> shows the overall distribution and central tendencies of correct responses for the different stimuli across conditions.</p>
<fig id="F7" position="float">
<label>Figure 7</label>
<caption><p>Grouped boxplots with jitter of correct responses regarding the anticipated interpretation across conditions. The asterisk indicates outliers.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fcomm-09-1338844-g0007.tif"/>
</fig>
<p><xref ref-type="fig" rid="F7">Figure 7</xref> suggests that, overall, the participants were successful at guessing the meaning of TMAI based on visual and/or acoustic cues alone, given that the median ratio of correct guesses for the unambiguous stimuli is higher than 0.75. <xref ref-type="fig" rid="F7">Figure 7</xref> also suggests that, when compared to the context condition, participants seemed to have difficulties with the ambiguous stimuli, but not with the requesting or stance-related ones, except for five stimuli that score lower than 0.75, three of them in the visual condition and two in the acoustic condition.<xref ref-type="fn" rid="fn0004"><sup>4</sup></xref> In general, participants perform worse in the visual and the acoustic condition than in the multimodal condition. In these two conditions, the ambiguous stimuli seem to pose the greatest difficulties to the participants, as expected.</p>
<p><xref ref-type="table" rid="T3">Table 3</xref> provides a summary of the fitted model and <xref ref-type="fig" rid="F8">Figure 8</xref> shows the odds ratios of the model terms (condition and anticipated interpretation).</p>
<table-wrap position="float" id="T3">
<label>Table 3</label>
<caption><p>Summary of the fitted model for correct responses.</p></caption>
<table frame="box" rules="all">
<thead>
<tr style="background-color:#919498;color:#ffffff">
<th valign="top" align="left" colspan="6"><bold>Model info:</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left" colspan="6">Observations: 3010</td>
</tr> <tr>
<td valign="top" align="left" colspan="6">Dependent Variable: correctness</td>
</tr> <tr>
<td valign="top" align="left" colspan="6">Type: Mixed effects generalized linear regression</td>
</tr> <tr>
<td valign="top" align="left" colspan="6">Error Distribution: binomial</td>
</tr> <tr>
<td valign="top" align="left" colspan="6">Link function: logit</td>
</tr> <tr style="background-color:#dee1e1">
<td valign="top" align="left" colspan="6"><bold>Model fit:</bold></td>
</tr> <tr>
<td valign="top" align="left" colspan="6">AIC = 1,997.97, BIC = 2,046.05</td>
</tr> <tr>
<td valign="top" align="left" colspan="6">Pseudo-R<sup>2</sup> (fixed effects) = 0.36</td>
</tr> <tr>
<td valign="top" align="left" colspan="6">Pseudo-R<sup>2</sup> (total) = 0.64</td>
</tr>
<tr style="background-color:#919498;color:#ffffff">
<td valign="top" align="left" colspan="6"><bold>Fixed effects:</bold></td>
</tr>
<tr style="background-color:#919498;color:#ffffff">
<td/>
<td valign="top" align="left"><bold>Est</bold>.</td>
<td valign="top" align="left"><bold>2.5%</bold></td>
<td valign="top" align="left"><bold>97.5%</bold></td>
<td valign="top" align="left"><italic><bold>z</bold></italic> <bold>val</bold>.</td>
<td valign="top" align="left"><italic><bold>P</bold></italic></td>
</tr> <tr>
<td valign="top" align="left">(Intercept)</td>
<td valign="top" align="left">5.87</td>
<td valign="top" align="left">3.46</td>
<td valign="top" align="left">8.27</td>
<td valign="top" align="left">4.79</td>
<td valign="top" align="left">&#x0003C; 0.001</td>
</tr> <tr>
<td valign="top" align="left">Multimodal</td>
<td valign="top" align="left">&#x02212;2.03</td>
<td valign="top" align="left">&#x02212;3.97</td>
<td valign="top" align="left">&#x02212;0.10</td>
<td valign="top" align="left">&#x02212;2.06</td>
<td valign="top" align="left">0.04</td>
</tr> <tr>
<td valign="top" align="left">Visual</td>
<td valign="top" align="left">&#x02212;4.09</td>
<td valign="top" align="left">&#x02212;5.97</td>
<td valign="top" align="left">&#x02212;2.20</td>
<td valign="top" align="left">&#x02212;4.25</td>
<td valign="top" align="left">&#x0003C; 0.001</td>
</tr> <tr>
<td valign="top" align="left">Acoustic</td>
<td valign="top" align="left">&#x02212;3.79</td>
<td valign="top" align="left">&#x02212;5.69</td>
<td valign="top" align="left">&#x02212;1.89</td>
<td valign="top" align="left">&#x02212;3.91</td>
<td valign="top" align="left">&#x0003C; 0.001</td>
</tr> <tr>
<td valign="top" align="left">Stance-related</td>
<td valign="top" align="left">0.74</td>
<td valign="top" align="left">&#x02212;0.74</td>
<td valign="top" align="left">2.22</td>
<td valign="top" align="left">0.98</td>
<td valign="top" align="left">0.33</td>
</tr> <tr>
<td valign="top" align="left">Ambiguous</td>
<td valign="top" align="left">&#x02212;1.06</td>
<td valign="top" align="left">&#x02212;2.16</td>
<td valign="top" align="left">0.04</td>
<td valign="top" align="left">&#x02212;1.89</td>
<td valign="top" align="left">0.06</td>
</tr>
<tr style="background-color:#919498;color:#ffffff">
<td valign="top" align="left" colspan="6"><bold>Random effects:</bold></td>
</tr>
<tr style="background-color:#919498;color:#ffffff">
<td valign="top" align="left"><bold>Group</bold></td>
<td valign="top" align="left"><bold>Parameter</bold></td>
<td valign="top" align="left"><bold>Std. dev</bold>.</td>
<td/>
<td/>
<td/>
</tr> <tr>
<td valign="top" align="left">Stimulus</td>
<td valign="top" align="left">(Intercept)</td>
<td valign="top" align="left">1.25</td>
<td/>
<td/>
<td/>
</tr> <tr>
<td valign="top" align="left">Construction</td>
<td valign="top" align="left">(Intercept)</td>
<td valign="top" align="left">0.97</td>
<td/>
<td/>
<td/>
</tr>
<tr style="background-color:#919498;color:#ffffff">
<td valign="top" align="left" colspan="6"><bold>Grouping variables:</bold></td>
</tr>
<tr style="background-color:#919498;color:#ffffff">
<td valign="top" align="left"><bold>Group</bold></td>
<td valign="top" align="left"><bold>&#x00023; groups</bold></td>
<td valign="top" align="left"><bold>ICC</bold></td>
<td/>
<td/>
<td/>
</tr> <tr>
<td valign="top" align="left">Stimulus</td>
<td valign="top" align="left">70</td>
<td valign="top" align="left">0.27</td>
<td/>
<td/>
<td/>
</tr> <tr>
<td valign="top" align="left">Construction</td>
<td valign="top" align="left">2</td>
<td valign="top" align="left">0.16</td>
<td/>
<td/>
<td/>
</tr></tbody>
</table>
</table-wrap>
<fig id="F8" position="float">
<label>Figure 8</label>
<caption><p>Odds ratios of correctness of response [<italic>p</italic> &#x0003C; 0.05 (&#x0002A;), <italic>p</italic> &#x0003C; 0.01 (&#x0002A;&#x0002A;), <italic>p</italic> &#x0003C; 0.001 (&#x0002A;&#x0002A;&#x0002A;)].</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fcomm-09-1338844-g0008.tif"/>
</fig>
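<p>The odds ratios reported for the fixed effects are simply the exponentiated log-odds estimates from <xref ref-type="table" rid="T3">Table 3</xref>, i.e., OR = e<sup>Est.</sup>. This conversion can be checked in a few lines of Python:</p>

```python
import math

# Fixed-effect estimates (log-odds) from Table 3
estimates = {
    "multimodal": -2.03,
    "visual": -4.09,
    "acoustic": -3.79,
    "stance_related": 0.74,
    "ambiguous": -1.06,
}

# Odds ratios: exponentiate each estimate (an estimate of 0 would give
# an OR of 1, i.e., no change in the odds of a correct response)
odds_ratios = {term: round(math.exp(est), 2) for term, est in estimates.items()}
```

<p>An odds ratio below 1, as obtained for all three presentation conditions, indicates lower odds of a correct response relative to the reference (context) condition.</p>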
<p>With a pseudo-R<sup>2</sup> of 0.64 for the total effects and a pseudo-R<sup>2</sup> of 0.36 for the fixed effects, the model summarized in <xref ref-type="table" rid="T3">Table 3</xref> explains a good amount of the variance in the responses obtained. It shows that the participants were significantly worse at guessing the meaning of TMAI in the multimodal (with <italic>p</italic> = 0.04, OR = 0.13), visual (with <italic>p</italic> &#x0003C; 0.001, OR = 0.02) and acoustic condition (with <italic>p</italic> &#x0003C; 0.001, OR = 0.02) when compared to the context condition. It further shows that there is no significant difference between guessing requesting and stance-related TMAI correctly (with <italic>p</italic> = 0.33, OR = 2.09), but the ambiguous stimuli contribute to the model with borderline significance (with <italic>p</italic> = 0.06, OR = 0.35), suggesting that most, though not all, incorrect guesses were due to the ambiguous stimuli.</p></sec>
<sec>
<title>4.2.4 Discussion of the forced-choice experiment</title>
<p>The experiment reported above shows that prosody alone can disambiguate TMAI if the prosodic features that are associated with the construction are displayed, i.e., in this case, the speaking rate. If TMAI is ambiguous regarding its speaking rate and hearers lack other pieces of information, they seem to have difficulties guessing its meaning. Conversely, if the speaker produces TMAI with a slower speaking rate, hearers are more likely to understand it as stance-related TMAI, even if no further features are available. Interestingly, the results also suggest that hearers use prosodic information alone to disambiguate TMAI about as accurately as they use visual information alone. This observation might suggest that the strength of association between prosodic properties and the construction is comparable to that between visual properties and the construction.</p>
<p>Technically, these observations can be explained in two ways. One explanation is that slow speaking rate is an independent prosodic construction. Niebuhr (<xref ref-type="bibr" rid="B53">2010</xref>), for example, has shown that lengthened consonants correlate with negative sentiment in German. The same could be true for English stance-related TMAI. Informal observations of TMAI, however, suggest that it is not the lengthening of the consonants alone that results in a slower speaking rate, but also the lengthening of the vowels. At the same time, speaking rate alone does not explain all the findings of the experiment. Quite a few stimuli that were neither slow nor fast (i.e., ambiguous) posed no difficulties to the participants. This suggests that there might be further, as yet undetected, prosodic features associated with TMAI. Given that, it is possible that there is a (complex) prosodic construction that is often used with stance-related TMAI, but, at the moment, the evidence for it is scarce. The other way to explain the findings of the experiment is to assume that the slow speaking rate is part of the stance-related construction, forming a multimodal construction. If, indeed, no prosodic construction can be identified, and given that prosody is a mode, then stance-related TMAI must be considered a multimodal construction with morphosyntactic and prosodic (and, possibly, visual) features. Even if future studies show that there is a prosodic construction such as &#x0201C;slow speaking rate,&#x0201D; both the frequency with which it is used with stance-related TMAI and its apparent use to disambiguate TMAI would speak in favor of treating TMAI as a multimodal construction from a usage-based perspective.</p></sec>
</sec>
</sec>
<sec id="s5">
<title>5 Conclusions</title>
<p>The present paper had two objectives. The first objective was to show that prosody and morphosyntax are two independent semiotic modes with distinguishable differences in material and form as well as independent contributions to discourse semantics. It was shown that the aspects of the sound stream that are relevant for spoken morphosyntax are not the same as those that are relevant for prosody. Using these different aspects, hearers transform the input from the sound stream to arrive either at categories like /p/, /m/ or /e/ (spoken language) or at high pitch, loud speech, and/or fast speech (prosody). These categories are then combined to form meaningful structures like <italic>It looks like a zebra</italic> (spoken language) or (contextually meaningful) assemblies conveying &#x0201C;consider this&#x0201D; (prosody), and they do so largely independently of one another. Since spoken language and prosody differ in all three layers of the semiotic mode, they must be considered independent. For constructional analyses, this means that prosody cannot be represented on a par with other, i.e., morphosyntactic and phonological, properties. Rather, it needs a place of its own. This place could take the form of a prosodic construction (in case the prosodic configuration has an independent meaning) or of being part of a multimodal construction (in case it does not). Such a view of prosody strengthens the multidimensional network approach to language-related knowledge, which assumes that constructions are interrelated by various kinds of associations (Diessel, <xref ref-type="bibr" rid="B14">2023</xref>). Prosodic constructions as well as multimodal constructions are prime examples of such a network of (cross-modal and multimodal) associations.</p>
<p>The second objective of the present paper was to provide evidence for a multimodal construction consisting of, at least, a morphosyntactic and a prosodic form. Both corpus and experimental evidence suggest that the stance-related use of <italic>Tell me about it</italic> is a likely candidate for such a multimodal construction. Regarding its prosodic form, stance-related <italic>Tell me about it</italic> is slower in tempo than its requesting counterpart. When language users are provided with nothing but this difference in tempo (i.e., when they lack other clues like sequential context or visuals), they use this prosodic feature to disambiguate <italic>Tell me about it</italic>. In other words, this knowledge of the two uses of <italic>Tell me about it</italic> must be stored in language users&#x00027; minds in some way. Stance-related <italic>Tell me about it</italic> thus fulfills Ziem&#x00027;s second condition of multimodal constructions: it is not merely a construction that is &#x0201C;solely realized in a multimodal way,&#x0201D; for the paper has shown that it is an entrenched cooccurrence of a verbal and a prosodic form. In conclusion, the evidence on <italic>Tell me about it</italic> presented in this study is strongly suggestive of the existence of multimodal constructions. As a consequence, the role prosody plays in forming them needs more systematic attention in constructional analyses.</p>
<p>From a methodological perspective, the present paper has shown that triangulating corpus and experimental evidence is valuable because it sheds light on both the production and the comprehension side of language and, in doing so, draws a complementary picture of prosody and multimodal constructions. However, the present study suffers from obvious limitations that require systematic attention in future studies. One limitation is the low number of participants in the forced-choice experiment and the missing demographic information. From a usage-based perspective, the constructional network (including multimodal and prosodic constructions) is dynamic and can therefore vary across demographic groups. This aspect is not reflected in the present study and needs to be addressed in the future. In addition, future research also needs to address the role prosody plays in the constructional network in more detail. Studies that explore prosodic and multimodal constructions could identify the exact (inter)relations and associations between different types of constructions and, thereby, answer the question of whether multimodality is a central or a peripheral aspect of grammar.</p></sec>
<sec sec-type="data-availability" id="s6">
<title>Data availability statement</title>
<p>The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found at: <ext-link ext-link-type="uri" xlink:href="https://osf.io/2sq7h/?view_only=746f3703bbde4236b832b34234d51beb">https://osf.io/2sq7h/?view_only=746f3703bbde4236b832b34234d51beb</ext-link>.</p></sec>
<sec sec-type="ethics-statement" id="s7">
<title>Ethics statement</title>
<p>Ethical approval was not required for the study involving human participants in accordance with the local legislation and institutional requirements. Written informed consent to participate in this study was not required from the participants in accordance with the national legislation and the institutional requirements.</p></sec>
<sec sec-type="author-contributions" id="s8">
<title>Author contributions</title>
<p>CL: Writing &#x02013; original draft, Writing &#x02013; review &#x00026; editing.</p></sec>
</body>
<back>
<sec sec-type="funding-information" id="s9">
<title>Funding</title>
<p>The author(s) declare financial support was received for the research, authorship, and/or publication of this article. This study was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation)&#x02014;Projektnummer 491466077.</p>
</sec>
<ack><p>My thanks go to John Bateman. John was the best mentor you can imagine during my time at the University of Bremen. Without him, this paper would be less rigorous and less advanced in almost every respect.</p>
</ack>
<sec sec-type="COI-statement" id="conf1">
<title>Conflict of interest</title>
<p>The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
<p>The author(s) declared that they were an editorial board member of Frontiers, at the time of submission. This had no impact on the peer review process and the final decision.</p>
</sec>
<sec sec-type="disclaimer" id="s10">
<title>Publisher&#x00027;s note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
<fn-group>
<fn id="fn0001"><p><sup>1</sup>The other two conditions follow from the first two and therefore do not need explicit attention. The third condition specifies what should not be considered a multimodal construction (e.g., a construction only realized multimodally) and the fourth condition states that multimodal constructions need to be part of the constructional network of a language, i.e., a network that covers the relevant knowledge a speaker of that language needs for understanding.</p></fn>
<fn id="fn0002"><p><sup>2</sup>Languages other than English have non-pulmonic sounds, i.e., sounds where the airflow does not come from the lungs, but these will not be considered here.</p></fn>
<fn id="fn0003"><p><sup>3</sup>All examples of TMAI come from the <italic>NewsScape Library of International Television News</italic>, an archive of televised discourse (Steen and Turner, <xref ref-type="bibr" rid="B66">2013</xref>). At the end of each example, the name of the source file and the relevant times are provided. Video snippets of the examples are provided on OSF: <ext-link ext-link-type="uri" xlink:href="https://osf.io/2sq7h/?view_only=746f3703bbde4236b832b34234d51beb">https://osf.io/2sq7h/?view_only=746f3703bbde4236b832b34234d51beb</ext-link>.</p></fn>
<fn id="fn0004"><p><sup>4</sup>There seem to be at least two reasons why the participants scored low in correctness for these prototypical stimuli. One reason might be the timing of TMAI and the visuals. That is, for some visual stimuli, some important visual displays (gaze aversion, raised eyebrows, and smiling) occurred right before, but not while, the speaker uttered TMAI. This non-synchrony might have affected the participants&#x00027; choices. Another reason might be that the model reported in Lehmann (<xref ref-type="bibr" rid="B41">2023</xref>) is incomplete. It seems that, while the duration of TMAI is a good predictor, it is not the only one.</p></fn>
</fn-group>
<ref-list>
<title>References</title>
<ref id="B1">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Aarts</surname> <given-names>B.</given-names></name> <name><surname>Bowie</surname> <given-names>J.</given-names></name> <name><surname>Popova</surname> <given-names>G.</given-names></name></person-group> (<year>2019</year>). <article-title>&#x0201C;Introduction,&#x0201D;</article-title> in <source>The Oxford Handbook of English Grammar</source>, eds. B. Aarts, J. Bowie, and G. Popova (<publisher-loc>Oxford</publisher-loc>: <publisher-name>Oxford University Press</publisher-name>), 1.</citation>
</ref>
<ref id="B2">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Bateman</surname> <given-names>J.</given-names></name></person-group> (<year>2011</year>). <article-title>&#x0201C;The decomposability of semiotic modes,&#x0201D;</article-title> in <source>Multimodal Studies: Exploring Issues and Domains</source>, eds. K. O&#x00027;Halloran and B. Smith (<publisher-loc>New York, NY</publisher-loc>: <publisher-name>Routledge</publisher-name>), <fpage>17</fpage>&#x02013;<lpage>38</lpage>.</citation>
</ref>
<ref id="B3">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bateman</surname> <given-names>J.</given-names></name></person-group> (<year>2022</year>). <article-title>Growing theory for practice: empirical multimodality beyond the case study</article-title>. <source>Multimodal Commun.</source> <volume>11</volume>, <fpage>63</fpage>&#x02013;<lpage>74</lpage>. <pub-id pub-id-type="doi">10.1515/mc-2021-0006</pub-id></citation>
</ref>
<ref id="B4">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bateman</surname> <given-names>J.</given-names></name> <name><surname>Wildfeuer</surname> <given-names>J.</given-names></name></person-group> (<year>2014</year>). <article-title>A multimodal discourse theory of visual narrative</article-title>. <source>J. Pragmat.</source> <volume>74</volume>, <fpage>180</fpage>&#x02013;<lpage>208</lpage>. <pub-id pub-id-type="doi">10.1016/j.pragma.2014.10.001</pub-id></citation>
</ref>
<ref id="B5">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Bateman</surname> <given-names>J.</given-names></name> <name><surname>Wildfeuer</surname> <given-names>J.</given-names></name> <name><surname>Hiippala</surname> <given-names>T.</given-names></name></person-group> (<year>2017</year>). <source>Multimodality: Foundations, Research and Analysis &#x02013; A Problem-Oriented Introduction</source>. <publisher-loc>Berlin</publisher-loc>: <publisher-name>De Gruyter Mouton</publisher-name>.</citation>
</ref>
<ref id="B6">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bates</surname> <given-names>D.</given-names></name> <name><surname>M&#x000E4;chler</surname> <given-names>M.</given-names></name> <name><surname>Bolker</surname> <given-names>B.</given-names></name> <name><surname>Walker</surname> <given-names>S.</given-names></name></person-group> (<year>2015</year>). <article-title>Fitting linear mixed-effects models using Lme4</article-title>. <source>J. Stat. Softw.</source> <volume>67</volume>, <fpage>1</fpage>&#x02013;<lpage>48</lpage>. <pub-id pub-id-type="doi">10.18637/jss.v067.i01</pub-id></citation>
</ref>
<ref id="B7">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>B&#x000FC;low</surname> <given-names>L.</given-names></name> <name><surname>Merten</surname> <given-names>M. L.</given-names></name> <name><surname>Johann</surname> <given-names>M.</given-names></name></person-group> (<year>2018</year>). <article-title>Internet-Memes Als Zugang Zu Multimodalen Konstruktionen</article-title>. <source>Zeitschrift f&#x000FC;r Angewandte Linguistik</source> <volume>69</volume>, <fpage>1</fpage>&#x02013;<lpage>32</lpage>. <pub-id pub-id-type="doi">10.1515/zfal-2018-0015</pub-id></citation>
</ref>
<ref id="B8">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bybee</surname> <given-names>J.</given-names></name></person-group> (<year>2006</year>). <article-title>From usage to grammar: the mind&#x00027;s response to repetition</article-title>. <source>Language</source> <volume>82</volume>, <fpage>711</fpage>&#x02013;<lpage>733</lpage>. <pub-id pub-id-type="doi">10.1353/lan.2006.0186</pub-id></citation>
</ref>
<ref id="B9">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Bybee</surname> <given-names>J.</given-names></name></person-group> (<year>2013</year>). <article-title>&#x0201C;Usage-based theory and exemplar representations of constructions,&#x0201D;</article-title> in <source>The Oxford Handbook of Construction Grammar</source>, eds. T. Hoffmann and G. Trousdale (<publisher-loc>Oxford</publisher-loc>: <publisher-name>Oxford University Press</publisher-name>), <fpage>49</fpage>&#x02013;<lpage>69</lpage>.</citation>
</ref>
<ref id="B10">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Clark</surname> <given-names>H. H.</given-names></name></person-group> (<year>2016</year>). <article-title>Depicting as a method of communication</article-title>. <source>Psychol. Rev.</source> <volume>123</volume>, <fpage>324</fpage>&#x02013;<lpage>347</lpage>. <pub-id pub-id-type="doi">10.1037/rev0000026</pub-id><pub-id pub-id-type="pmid">26855255</pub-id></citation></ref>
<ref id="B11">
<citation citation-type="book"><person-group person-group-type="author"><collab>Council for Cultural Co-operation Education Committee, and Modern Languages Division.</collab></person-group> (<year>2001</year>). <source>Common European Framework of Reference for Languages: Learning, Teaching, Assessment</source>. <publisher-loc>Cambridge</publisher-loc>: <publisher-name>Cambridge University Press</publisher-name>.</citation>
</ref>
<ref id="B12">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Croft</surname> <given-names>W.</given-names></name> <name><surname>Cruse</surname> <given-names>D. A.</given-names></name></person-group> (<year>2004</year>). <source>Cognitive Linguistics.</source> <publisher-loc>Cambridge</publisher-loc>: <publisher-name>Cambridge University Press</publisher-name>.</citation>
</ref>
<ref id="B13">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Dancygier</surname> <given-names>B.</given-names></name> <name><surname>Vandelanotte</surname> <given-names>L.</given-names></name></person-group> (<year>2017</year>). <article-title>Internet memes as multimodal constructions</article-title>. <source>Cogn. Linguist.</source> <volume>28</volume>, <fpage>565</fpage>&#x02013;<lpage>598</lpage>. <pub-id pub-id-type="doi">10.1515/cog-2017-0074</pub-id></citation>
</ref>
<ref id="B14">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Diessel</surname> <given-names>H.</given-names></name></person-group> (<year>2023</year>). <source>The Constructicon: Taxonomies and Networks</source>. <publisher-loc>Cambridge</publisher-loc>: <publisher-name>Cambridge University Press</publisher-name>.</citation>
</ref>
<ref id="B15">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Dingemanse</surname> <given-names>M.</given-names></name> <name><surname>Blasi</surname> <given-names>D. E.</given-names></name> <name><surname>Gary</surname> <given-names>L.</given-names></name> <name><surname>Christiansen</surname> <given-names>M. H.</given-names></name> <name><surname>Monaghan</surname> <given-names>P.</given-names></name></person-group> (<year>2015</year>). <article-title>Arbitrariness, iconicity, and systematicity in language</article-title>. <source>Trends Cogn. Sci.</source> <volume>19</volume>, <fpage>603</fpage>&#x02013;<lpage>615</lpage>. <pub-id pub-id-type="doi">10.1016/j.tics.2015.07.013</pub-id><pub-id pub-id-type="pmid">26412098</pub-id></citation></ref>
<ref id="B16">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Divjak</surname> <given-names>D.</given-names></name></person-group> (<year>2019</year>). <source>Frequency in Language: Memory, Attention and Learning</source>. <publisher-loc>Cambridge</publisher-loc>: <publisher-name>Cambridge University Press</publisher-name>.</citation>
</ref>
<ref id="B17">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Elvira-Garc&#x000ED;a</surname> <given-names>W.</given-names></name></person-group> (<year>2019</year>). <article-title>&#x0201C;Two constructions, one syntactic form: perceptual prosodic differences between elliptical and independent clauses in Spanish,&#x0201D;</article-title> in <source>Insubordination. Theoretical and Empirical Issues</source>, eds. K. Beijering, G. Kaltenb&#x000F6;ck, and M. S. Sansi&#x000F1;ena (<publisher-loc>Berlin/Boston, MA</publisher-loc>: <publisher-name>De Gruyter Mouton</publisher-name>), <fpage>240</fpage>&#x02013;<lpage>264</lpage>.</citation>
</ref>
<ref id="B18">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>F&#x000E9;ry</surname> <given-names>C.</given-names></name></person-group> (<year>2017</year>). <source>Intonation and Prosodic Structure</source>. <publisher-loc>Cambridge</publisher-loc>: <publisher-name>Cambridge University Press</publisher-name>.</citation>
</ref>
<ref id="B19">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Feyaerts</surname> <given-names>K.</given-names></name> <name><surname>Br&#x000F4;ne</surname> <given-names>G.</given-names></name> <name><surname>Oben</surname> <given-names>B.</given-names></name></person-group> (<year>2017</year>). <article-title>&#x0201C;Multimodality in interaction,&#x0201D;</article-title> in <source>The Cambridge Handbook of Cognitive Linguistics</source>, ed. B. Dancygier (<publisher-loc>Cambridge</publisher-loc>: <publisher-name>Cambridge University Press</publisher-name>), <fpage>135</fpage>&#x02013;<lpage>156</lpage>.</citation>
</ref>
<ref id="B20">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Fillmore</surname> <given-names>C. J.</given-names></name> <name><surname>Lee-Goldman</surname> <given-names>R. R.</given-names></name> <name><surname>Rhodes</surname> <given-names>R.</given-names></name></person-group> (<year>2012</year>). <article-title>&#x0201C;The FrameNet constructicon,&#x0201D;</article-title> in <source>Sign-Based Construction Grammar</source>, eds. H. C. Boas and I. A. Sag (<publisher-loc>Stanford</publisher-loc>: <publisher-name>CSLI</publisher-name>), <fpage>309</fpage>&#x02013;<lpage>379</lpage>.</citation>
</ref>
<ref id="B21">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fry</surname> <given-names>D. B.</given-names></name></person-group> (<year>1955</year>). <article-title>Duration and intensity as physical correlates of linguistic stress</article-title>. <source>J. Acoust. Soc. Am.</source> <volume>27</volume>, <fpage>765</fpage>&#x02013;<lpage>768</lpage>. <pub-id pub-id-type="doi">10.1121/1.1908022</pub-id></citation>
</ref>
<ref id="B22">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fry</surname> <given-names>D. B.</given-names></name></person-group> (<year>1958</year>). <article-title>Experiments in the perception of stress</article-title>. <source>Lang. Speech</source> <volume>1</volume>, <fpage>126</fpage>&#x02013;<lpage>152</lpage>. <pub-id pub-id-type="doi">10.1177/002383095800100207</pub-id></citation>
</ref>
<ref id="B23">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Goldberg</surname> <given-names>A. E.</given-names></name></person-group> (<year>1995</year>). <source>Constructions: A Construction Grammar Approach to Argument Structure</source>. <publisher-loc>Chicago, IL</publisher-loc>: <publisher-name>University of Chicago Press</publisher-name>.</citation>
</ref>
<ref id="B24">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Goldberg</surname> <given-names>A. E.</given-names></name></person-group> (<year>2006</year>). <source>Constructions at Work: The Nature of Generalizations in Language</source>. <publisher-loc>Oxford</publisher-loc>: <publisher-name>Oxford University Press</publisher-name>.</citation>
</ref>
<ref id="B25">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gras</surname> <given-names>P.</given-names></name> <name><surname>Elvira-Garc&#x000ED;a</surname> <given-names>W.</given-names></name></person-group> (<year>2021</year>). <article-title>The role of intonation in construction grammar: on prosodic constructions</article-title>. <source>J. Pragmat.</source> <volume>180</volume>, <fpage>232</fpage>&#x02013;<lpage>247</lpage>. <pub-id pub-id-type="doi">10.1016/j.pragma.2021.05.010</pub-id></citation>
</ref>
<ref id="B26">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Gussenhoven</surname> <given-names>C.</given-names></name></person-group> (<year>2004</year>). <source>The Phonology of Tone and Intonation</source>. <publisher-loc>Cambridge</publisher-loc>: <publisher-name>Cambridge University Press</publisher-name>.</citation>
</ref>
<ref id="B27">
<citation citation-type="web"><person-group person-group-type="author"><name><surname>Hedberg</surname> <given-names>N.</given-names></name> <name><surname>Sosa</surname> <given-names>J. M.</given-names></name> <name><surname>Fadden</surname> <given-names>L.</given-names></name></person-group> (<year>2003</year>). <article-title>&#x0201C;The intonation of contradictions in American English,&#x0201D;</article-title> in <source>Prosody and Pragmatics Conference</source>, 1&#x02013;12. Available online at: <ext-link ext-link-type="uri" xlink:href="https://www.sfu.ca/&#x0007E;hedberg/Preston_paper_text4.pdf">https://www.sfu.ca/&#x0007E;hedberg/Preston_paper_text4.pdf</ext-link> (accessed October 13, 2023).</citation>
</ref>
<ref id="B28">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hiippala</surname> <given-names>T.</given-names></name></person-group> (<year>2017</year>). <article-title>The multimodality of digital longform journalism</article-title>. <source>Digit. Journal.</source> <volume>5</volume>, <fpage>420</fpage>&#x02013;<lpage>442</lpage>. <pub-id pub-id-type="doi">10.1080/21670811.2016.1169197</pub-id></citation>
</ref>
<ref id="B29">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Hilpert</surname> <given-names>M.</given-names></name></person-group> (<year>2019</year>). <source>Construction Grammar and Its Application to English, 2nd Edn</source>. <publisher-loc>Edinburgh</publisher-loc>: <publisher-name>Edinburgh University Press</publisher-name>.</citation>
</ref>
<ref id="B30">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hoffmann</surname> <given-names>T.</given-names></name></person-group> (<year>2017</year>). <article-title>Multimodal constructs &#x02013; multimodal constructions? The role of constructions in the working memory</article-title>. <source>Linguist. Vanguard</source> <volume>3</volume>:<fpage>20160042</fpage>. <pub-id pub-id-type="doi">10.1515/lingvan-2016-0042</pub-id></citation>
</ref>
<ref id="B31">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Hoffmann</surname> <given-names>T.</given-names></name></person-group> (<year>2022</year>). <source>Construction Grammar: The Structure of English.</source> <publisher-loc>Cambridge</publisher-loc>: <publisher-name>Cambridge University Press</publisher-name>.</citation>
</ref>
<ref id="B32">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Hoffmann</surname> <given-names>T.</given-names></name> <name><surname>Trousdale</surname> <given-names>G.</given-names></name></person-group> (<year>2013</year>). <source>The Oxford Handbook of Construction Grammar.</source> <publisher-loc>Oxford</publisher-loc>: <publisher-name>Oxford University Press</publisher-name>.</citation>
</ref>
<ref id="B33">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hsu</surname> <given-names>H. C.</given-names></name> <name><surname>Br&#x000F4;ne</surname> <given-names>G.</given-names></name> <name><surname>Feyaerts</surname> <given-names>K.</given-names></name></person-group> (<year>2021</year>). <article-title>When gesture &#x0201C;takes over&#x0201D;: speech-embedded nonverbal depictions in multimodal interaction</article-title>. <source>Front. Psychol.</source> <volume>11</volume>:<fpage>552533</fpage>. <pub-id pub-id-type="doi">10.3389/fpsyg.2020.552533</pub-id><pub-id pub-id-type="pmid">33643106</pub-id></citation></ref>
<ref id="B34">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Imai</surname> <given-names>M.</given-names></name> <name><surname>Kita</surname> <given-names>S.</given-names></name></person-group> (<year>2014</year>). <article-title>The sound symbolism bootstrapping hypothesis for language acquisition and language evolution</article-title>. <source>Philos. Trans. Royal Soc. B</source> <volume>369</volume>:<fpage>20130298</fpage>. <pub-id pub-id-type="doi">10.1098/rstb.2013.0298</pub-id><pub-id pub-id-type="pmid">25092666</pub-id></citation></ref>
<ref id="B35">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Imo</surname> <given-names>W.</given-names></name> <name><surname>Lanwer</surname> <given-names>J. P.</given-names></name></person-group> (<year>2020</year>). <source>Prosodie und Konstruktionsgrammatik.</source> <publisher-loc>Berlin</publisher-loc>: <publisher-name>De Gruyter</publisher-name>.</citation>
</ref>
<ref id="B36">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>K&#x000F6;hler</surname> <given-names>W.</given-names></name></person-group> (<year>1929</year>). <source>Gestalt Psychology.</source> <publisher-loc>New York, NY</publisher-loc>: <publisher-name>Liveright</publisher-name>.</citation>
</ref>
<ref id="B37">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kress</surname> <given-names>G.</given-names></name></person-group> (<year>2000</year>). <article-title>Multimodality: challenges to thinking about language</article-title>. <source>TESOL Quarterly</source> <volume>34</volume>, <fpage>337</fpage>&#x02013;<lpage>340</lpage>. <pub-id pub-id-type="doi">10.2307/3587959</pub-id></citation>
</ref>
<ref id="B38">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kurumada</surname> <given-names>C.</given-names></name> <name><surname>Brown</surname> <given-names>M.</given-names></name> <name><surname>Tanenhaus</surname> <given-names>M.</given-names></name></person-group> (<year>2012</year>). <article-title>Pragmatic interpretation of contrastive prosody: it looks like speech adaptation</article-title>. <source>Proc. Ann. Meet. Cogn. Sci. Soc.</source> <volume>34</volume>, <fpage>647</fpage>&#x02013;<lpage>652</lpage>.</citation>
</ref>
<ref id="B39">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Ladewig</surname> <given-names>S.</given-names></name></person-group> (<year>2020</year>). <article-title>Integrating Gestures: The Dimension of Multimodality in Cognitive Grammar</article-title>. <publisher-loc>Berlin</publisher-loc>: <publisher-name>De Gruyter</publisher-name>.</citation>
</ref>
<ref id="B40">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Lanwer</surname> <given-names>J. P.</given-names></name></person-group> (<year>2020</year>). <article-title>&#x0201C;Appositive syntax oder appositive prosodie?&#x0201D;</article-title> in <source>Prosodie und Konstruktionsgrammatik</source>, eds. W. Imo and J. P. Lanwer (<publisher-loc>Berlin</publisher-loc>: <publisher-name>De Gruyter</publisher-name>), <fpage>233</fpage>&#x02013;<lpage>281</lpage>.</citation>
</ref>
<ref id="B41">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Lehmann</surname> <given-names>C.</given-names></name></person-group> (<year>2023</year>). <article-title>&#x0201C;Multimodal markers of irony in televised discourse: a corpus-based approach,&#x0201D;</article-title> in <source>Multimodal Im/politeness: Signed, Spoken, Written</source>, eds. L. Brown, I. H&#x000FC;bscher, and A. H. Jucker (<publisher-loc>Amsterdam</publisher-loc>: <publisher-name>Benjamins</publisher-name>), <fpage>251</fpage>&#x02013;<lpage>272</lpage>.</citation>
</ref>
<ref id="B42">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Lehmann</surname> <given-names>C.</given-names></name></person-group> (in press). <article-title>&#x0201C;The prosody of irony is diverse and sometimes construction-specific,&#x0201D;</article-title> in <source>Interfaces of Phonetics</source>, ed. M. Schlechtweg (<publisher-loc>Berlin</publisher-loc>: <publisher-name>De Gruyter</publisher-name>).</citation>
</ref>
<ref id="B43">
<citation citation-type="web"><person-group person-group-type="author"><name><surname>Leiner</surname> <given-names>D. J.</given-names></name></person-group> (<year>2021</year>). <source>SoSci Survey</source>. Available online at: <ext-link ext-link-type="uri" xlink:href="https://www.soscisurvey.de">https://www.soscisurvey.de</ext-link> (accessed June 18, 2023).</citation>
</ref>
<ref id="B44">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lelandais</surname> <given-names>M.</given-names></name> <name><surname>Ferr&#x000E9;</surname> <given-names>G.</given-names></name></person-group> (<year>2019</year>). <article-title>The verbal, vocal, and gestural expression of (in)dependency in two types of subordinate constructions</article-title>. <source>J. Corpora Discour. Stud.</source> <volume>2</volume>, <fpage>117</fpage>&#x02013;<lpage>143</lpage>. <pub-id pub-id-type="doi">10.18573/jcads.4</pub-id></citation>
</ref>
<ref id="B45">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Levinson</surname> <given-names>S. C.</given-names></name></person-group> (<year>2006</year>). <article-title>&#x0201C;Deixis,&#x0201D;</article-title> in <source>The Handbook of Pragmatics</source>, eds. L. R. Horn and G. Ward (<publisher-loc>Hoboken</publisher-loc>: <publisher-name>Wiley</publisher-name>), <fpage>97</fpage>&#x02013;<lpage>121</lpage>.</citation>
</ref>
<ref id="B46">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Levis</surname> <given-names>J. M.</given-names></name> <name><surname>Wichmann</surname> <given-names>A.</given-names></name></person-group> (<year>2015</year>). <article-title>&#x0201C;English intonation &#x02013; form and meaning,&#x0201D;</article-title> in <source>The Handbook of English Pronunciation</source>, eds. M. Reed and J. M. Levis (<publisher-loc>Chichester</publisher-loc>: <publisher-name>Wiley-Blackwell</publisher-name>), <fpage>139</fpage>&#x02013;<lpage>155</lpage>.</citation>
</ref>
<ref id="B47">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Liberman</surname> <given-names>M.</given-names></name> <name><surname>Sag</surname> <given-names>I. A.</given-names></name></person-group> (<year>1974</year>). <article-title>&#x0201C;Prosodic form and discourse function,&#x0201D;</article-title> in <source>Papers from the Tenth Regional Meeting Chicago Linguistic Society</source>, eds. M. W. La Galy, R. A. Fox, and A. Bruck (<publisher-loc>Chicago, IL</publisher-loc>: <publisher-name>Chicago Linguistic Society</publisher-name>), <fpage>416</fpage>&#x02013;<lpage>427</lpage>.</citation>
</ref>
<ref id="B48">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lieberman</surname> <given-names>P.</given-names></name></person-group> (<year>1960</year>). <article-title>Some acoustic correlates of word stress in American English</article-title>. <source>J. Acoust. Soc. Am.</source> <volume>32</volume>, <fpage>451</fpage>&#x02013;<lpage>454</lpage>. <pub-id pub-id-type="doi">10.1121/1.1908095</pub-id></citation>
</ref>
<ref id="B49">
<citation citation-type="web"><person-group person-group-type="author"><name><surname>Long</surname> <given-names>J. A.</given-names></name></person-group> (<year>2022</year>). <source>Jtools: Analysis and Presentation of Social Scientific Data</source>. Available online at: <ext-link ext-link-type="uri" xlink:href="https://CRAN.R-project.org/package=jtools">https://CRAN.R-project.org/package=jtools</ext-link> (accessed October 13, 2023).</citation>
</ref>
<ref id="B50">
<citation citation-type="web"><person-group person-group-type="author"><name><surname>L&#x000FC;decke</surname> <given-names>D.</given-names></name></person-group> (<year>2023</year>). <source>sjPlot: Data Visualization for Statistics in Social Science.</source> R package version 2.8.15. Available online at: <ext-link ext-link-type="uri" xlink:href="https://CRAN.R-project.org/package=sjPlot">https://CRAN.R-project.org/package=sjPlot</ext-link></citation>
</ref>
<ref id="B51">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Marandin</surname> <given-names>J.-M.</given-names></name></person-group> (<year>2006</year>). <article-title>Contours as constructions</article-title>. <source>Constructions</source> Special Volume 1. <pub-id pub-id-type="doi">10.24338/cons-448</pub-id></citation>
</ref>
<ref id="B52">
<citation citation-type="web"><person-group person-group-type="author"><name><surname>Neitsch</surname> <given-names>J.</given-names></name> <name><surname>Niebuhr</surname> <given-names>O.</given-names></name></person-group> (<year>2019</year>). <article-title>&#x0201C;Questions as prosodic configurations: how prosody and context shape the multiparametric acoustic nature of rhetorical questions in German,&#x0201D;</article-title> in <source>Proceedings of the 19th International Congress of Phonetic Sciences</source>, eds. S. Calhoun, P. Escudero, M. Tabain and P. Warren (<publisher-loc>Canberra, ACT</publisher-loc>: <publisher-name>Australasian Speech Science and Technology Association</publisher-name>), <fpage>2425</fpage>&#x02013;<lpage>2429</lpage>. Available online at: <ext-link ext-link-type="uri" xlink:href="https://www.internationalphoneticassociation.org/icphs-proceedings/ICPhS2019/papers/ICPhS2019_Proceedings.pdf">https://www.internationalphoneticassociation.org/icphs-proceedings/ICPhS2019/papers/ICPhS2019_Proceedings.pdf</ext-link> (accessed October 13, 2023).</citation>
</ref>
<ref id="B53">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Niebuhr</surname> <given-names>O.</given-names></name></person-group> (<year>2010</year>). <article-title>On the phonetics of intensifying emphasis in German</article-title>. <source>Phonetica</source> <volume>67</volume>, <fpage>170</fpage>&#x02013;<lpage>198</lpage>. <pub-id pub-id-type="doi">10.1159/000321054</pub-id><pub-id pub-id-type="pmid">20926915</pub-id></citation></ref>
<ref id="B54">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Niebuhr</surname> <given-names>O.</given-names></name></person-group> (<year>2019</year>). <article-title>&#x0201C;Pitch accents as multiparametric configurations of prosodic features &#x02013; evidence from pitch-accent specific micro-rhythms in German,&#x0201D;</article-title> in <source>A Sound Approach to Language Matters - in Honor of Ocke-Schwen Bohn</source>, eds. A. M. Nyvad, M. Hejn&#x000E1;, A. H&#x000F8;jen, A. B. Jespersen, and M. H. S&#x000F8;rensen (<publisher-loc>Aarhus</publisher-loc>: <publisher-name>Aarhus University Press</publisher-name>), <fpage>321</fpage>&#x02013;<lpage>351</lpage>.</citation>
</ref>
<ref id="B55">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ningelgen</surname> <given-names>J.</given-names></name> <name><surname>Auer</surname> <given-names>P.</given-names></name></person-group> (<year>2017</year>). <article-title>Is there a multimodal construction based on non-deictic so in German?</article-title> <source>Linguist. Vanguard</source> <volume>3</volume>:<fpage>20160051</fpage>. <pub-id pub-id-type="doi">10.1515/lingvan-2016-0051</pub-id></citation>
</ref>
<ref id="B56">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Nolan</surname> <given-names>F.</given-names></name></person-group> (<year>2021</year>). <article-title>&#x0201C;Intonation,&#x0201D;</article-title> in <source>The Handbook of English Linguistics, 2nd Edn</source>, eds. B. Aarts, A. McMahon, and L. Hinrichs (<publisher-loc>Chichester</publisher-loc>: <publisher-name>Wiley</publisher-name>), <fpage>385</fpage>&#x02013;<lpage>405</lpage>.</citation>
</ref>
<ref id="B57">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Palan</surname> <given-names>S.</given-names></name> <name><surname>Schitter</surname> <given-names>C.</given-names></name></person-group> (<year>2018</year>). <article-title>Prolific.ac&#x02014;a subject pool for online experiments</article-title>. <source>J. Behav. Exp. Fin.</source> <volume>17</volume>, <fpage>22</fpage>&#x02013;<lpage>27</lpage>. <pub-id pub-id-type="doi">10.1016/j.jbef.2017.12.004</pub-id></citation>
</ref>
<ref id="B58">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Perniss</surname> <given-names>P.</given-names></name></person-group> (<year>2018</year>). <article-title>Why we should study multimodal language</article-title>. <source>Front. Psychol.</source> <volume>9</volume>:<fpage>1109</fpage>. <pub-id pub-id-type="doi">10.3389/fpsyg.2018.01109</pub-id></citation>
</ref>
<ref id="B59">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>P&#x000F5;ldvere</surname> <given-names>N.</given-names></name> <name><surname>Paradis</surname> <given-names>C.</given-names></name></person-group> (<year>2020</year>). <article-title>&#x02018;What and Then a little robot brings it to you?&#x00027; The reactive <italic>What-X</italic> construction in spoken dialogue</article-title>. <source>Engl. Lang. Linguist.</source> <volume>24</volume>, <fpage>307</fpage>&#x02013;<lpage>332</lpage>. <pub-id pub-id-type="doi">10.1017/S.1360674319000091</pub-id></citation>
</ref>
<ref id="B60">
<citation citation-type="web"><person-group person-group-type="author"><collab>R Core Team</collab></person-group> (<year>2022</year>). <source>R: A Language and Environment for Statistical Computing</source>. Vienna: R Foundation for Statistical Computing. Available online at: <ext-link ext-link-type="uri" xlink:href="https://www.r-project.org/">https://www.r-project.org/</ext-link> (accessed June 23, 2022).</citation>
</ref>
<ref id="B61">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sacks</surname> <given-names>H.</given-names></name> <name><surname>Schegloff</surname> <given-names>E. A.</given-names></name> <name><surname>Jefferson</surname> <given-names>G. D.</given-names></name></person-group> (<year>1974</year>). <article-title>A simplest systematics for the organization of turn-taking for conversation</article-title>. <source>Language</source> <volume>50</volume>, <fpage>696</fpage>&#x02013;<lpage>735</lpage>. <pub-id pub-id-type="doi">10.1353/lan.1974.0010</pub-id></citation>
</ref>
<ref id="B62">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sadat-Tehrani</surname> <given-names>N.</given-names></name></person-group> (<year>2010</year>). <article-title>An intonational construction</article-title>. <source>Constructions</source> <volume>3</volume>, <fpage>1</fpage>&#x02013;<lpage>13</lpage>. <pub-id pub-id-type="doi">10.24338/cons-451</pub-id></citation>
</ref>
<ref id="B63">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Schoonjans</surname> <given-names>S.</given-names></name></person-group> (<year>2018</year>). <source>Modalpartikeln als Multimodale Konstruktionen: Eine Korpusbasierte Kookkurrenzanalyse von Modalpartikeln und Gestik Im Deutschen</source>. <publisher-loc>Berlin</publisher-loc>: <publisher-name>De Gruyter</publisher-name>.</citation>
</ref>
<ref id="B64">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sidhu</surname> <given-names>D. M.</given-names></name> <name><surname>Westbury</surname> <given-names>C.</given-names></name> <name><surname>Hollis</surname> <given-names>G.</given-names></name> <name><surname>Pexman</surname> <given-names>P. M.</given-names></name></person-group> (<year>2021</year>). <article-title>Sound symbolism shapes the English language: the Maluma/takete effect in English nouns</article-title>. <source>Psychon. Bull. Rev.</source> <volume>28</volume>, <fpage>1390</fpage>&#x02013;<lpage>1398</lpage>. <pub-id pub-id-type="doi">10.3758/s13423-021-01883-3</pub-id><pub-id pub-id-type="pmid">33821463</pub-id></citation></ref>
<ref id="B65">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Singer</surname> <given-names>N. 1.</given-names></name></person-group> (<year>2023</year>). <source>Oxford English Dictionary</source>. <publisher-loc>Oxford</publisher-loc>: <publisher-name>Oxford University Press</publisher-name>.</citation>
</ref>
<ref id="B66">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Steen</surname> <given-names>F.</given-names></name> <name><surname>Turner</surname> <given-names>M. B.</given-names></name></person-group> (<year>2013</year>). <article-title>&#x0201C;Multimodal construction grammar,&#x0201D;</article-title> in <source>Language and the Creative Mind</source>, eds. M. Borkent, B. Dancygier, and J. Hinnell (<publisher-loc>Stanford</publisher-loc>: <publisher-name>CSLI</publisher-name>), <fpage>255</fpage>&#x02013;<lpage>274</lpage>.</citation>
</ref>
<ref id="B67">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>St&#x000F6;ckl</surname> <given-names>H.</given-names></name></person-group> (<year>2020</year>). <article-title>&#x0201C;Linguistic multimodality &#x02013; multimodal linguistics: a state-of-the-art sketch,&#x0201D;</article-title> in <source>Multimodality: Disciplinary Thoughts and the Challenge of Diversity</source>, eds. J. Wildfeuer, J. Pflaeging, J. Bateman, O. Seizov, and C. I. Tseng (<publisher-loc>Berlin</publisher-loc>: <publisher-name>DeGruyter</publisher-name>), <fpage>41</fpage>&#x02013;<lpage>68</lpage>.</citation>
</ref>
<ref id="B68">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Tell</surname> <given-names>V.</given-names></name></person-group> (<year>2023</year>). <source>Oxford English Dictionary</source>. <publisher-loc>Oxford</publisher-loc>: <publisher-name>Oxford University Press</publisher-name>.</citation>
</ref>
<ref id="B69">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Tench</surname> <given-names>P.</given-names></name></person-group> (<year>1996</year>). <article-title>Intonation and the differentiation of syntactic patterns in English and German</article-title>. <source>Int. J. Appl. Linguist.</source> <volume>6</volume>, <fpage>223</fpage>&#x02013;<lpage>256</lpage>. <pub-id pub-id-type="doi">10.1111/j.1473-4192.1996.tb00096.x</pub-id></citation>
</ref>
<ref id="B70">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Uhrig</surname> <given-names>P.</given-names></name></person-group> (<year>2020</year>). <article-title>Multimodality in language and communication</article-title>. <source>Zeitschrift f&#x000FC;r Anglistik und Amerikanistik</source> <volume>68</volume>:<fpage>4</fpage>. <pub-id pub-id-type="doi">10.1515/zaa-2020-2019</pub-id></citation>
</ref>
<ref id="B71">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Uhrig</surname> <given-names>P.</given-names></name></person-group> (<year>2022</year>). <article-title>Hand gestures with verbs of throwing: collostructions, style and metaphor</article-title>. <source>Yearb. German Cogn. Linguist. Assoc.</source> <volume>10</volume>, <fpage>99</fpage>&#x02013;<lpage>120</lpage>. <pub-id pub-id-type="doi">10.1515/gcla-2022-0006</pub-id></citation>
</ref>
<ref id="B72">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>van Leeuwen</surname> <given-names>T.</given-names></name></person-group> (<year>2014</year>). <article-title>&#x0201C;Critical discourse analysis and multimodality,&#x0201D;</article-title> in <source>Contemporary Critical Discourse Studies</source>, eds. C. Hart and P. Cap (<publisher-loc>London</publisher-loc>: <publisher-name>Bloomsbury</publisher-name>), <fpage>281</fpage>&#x02013;<lpage>296</lpage>.</citation>
</ref>
<ref id="B73">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Vigliocco</surname> <given-names>G.</given-names></name> <name><surname>Perniss</surname> <given-names>P.</given-names></name> <name><surname>Vinson</surname> <given-names>D.</given-names></name></person-group> (<year>2014</year>). <article-title>Language as a multimodal phenomenon: implications for language learning, processing and evolution</article-title>. <source>Philos. Trans. Royal Soc. Lond. Ser. B Biol. Sci.</source> <volume>369</volume>:<fpage>20130292</fpage>. <pub-id pub-id-type="doi">10.1098/rstb.2013.0292</pub-id><pub-id pub-id-type="pmid">25092660</pub-id></citation></ref>
<ref id="B74">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Ward</surname> <given-names>N. G.</given-names></name></person-group> (<year>2019</year>). <source>The Prosodic Patterns of English Conversation</source>. <publisher-loc>Cambridge</publisher-loc>: <publisher-name>Cambridge University Press</publisher-name>.</citation>
</ref>
<ref id="B75">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Wells</surname> <given-names>J. C.</given-names></name></person-group> (<year>2006</year>). <source>English Intonation: An Introduction.</source> <publisher-loc>Cambridge</publisher-loc>: <publisher-name>Cambridge University Press</publisher-name>.</citation>
</ref>
<ref id="B76">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wichmann</surname> <given-names>A.</given-names></name> <name><surname>Blakemore</surname> <given-names>D.</given-names></name></person-group> (<year>2006</year>). <article-title>The prosody-pragmatics interface</article-title>. <source>J. Pragmat.</source> <volume>38</volume>, <fpage>1537</fpage>&#x02013;<lpage>1541</lpage>. <pub-id pub-id-type="doi">10.1016/j.pragma.2006.02.009</pub-id></citation>
</ref>
<ref id="B77">
<citation citation-type="web"><person-group person-group-type="author"><name><surname>Wickham</surname> <given-names>H.</given-names></name> <name><surname>Chang</surname> <given-names>W.</given-names></name> <name><surname>Henry</surname> <given-names>L.</given-names></name> <name><surname>Pedersen</surname> <given-names>T. L.</given-names></name> <name><surname>Takahashi</surname> <given-names>K.</given-names></name> <name><surname>Wilke</surname> <given-names>C.</given-names></name> <etal/></person-group>. (<year>2023</year>). <source>Ggplot2, Create Elegant Data Visualisations Using the Grammar of Graphics</source>. Available online at: <ext-link ext-link-type="uri" xlink:href="https://CRAN.R-project.org/package=ggplot2">https://CRAN.R-project.org/package=ggplot2</ext-link> (accessed November 13, 2023).</citation>
</ref>
<ref id="B78">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ziem</surname> <given-names>A.</given-names></name></person-group> (<year>2017</year>). <article-title>Do we really need a multimodal construction grammar?</article-title> <source>Linguist. Vanguard</source> <volume>3</volume>:<fpage>20160095</fpage>. <pub-id pub-id-type="doi">10.1515/lingvan-2016-0095</pub-id></citation>
</ref>
<ref id="B79">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zima</surname> <given-names>E.</given-names></name></person-group> (<year>2017</year>). <article-title>On the multimodality of [all the way from X PREP Y]</article-title>. <source>Linguist. Vanguard</source> <volume>3</volume>:<fpage>20160055</fpage>. <pub-id pub-id-type="doi">10.1515/lingvan-2016-0055</pub-id></citation>
</ref>
<ref id="B80">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zima</surname> <given-names>E.</given-names></name> <name><surname>Bergs</surname> <given-names>A.</given-names></name></person-group> (<year>2017</year>). <article-title>Towards a multimodal construction grammar</article-title>. <source>Linguist. Vanguard</source> <volume>3</volume>:<fpage>20161006</fpage>. <pub-id pub-id-type="doi">10.1515/lingvan-2016-1006</pub-id></citation>
</ref>
</ref-list>
</back>
</article>
