Locality and Word Order in Active Dependency Formation in Bangla

Chacón, Dustin A.; Imtiaz, Mashrur; Dasgupta, Shirsho; Murshed, Sikder M.; Dan, Mina; Phillips, Colin

doi:10.3389/fpsyg.2016.01235

ORIGINAL RESEARCH article

Front. Psychol., 25 August 2016

Sec. Psychology of Language

Volume 7 - 2016 | https://doi.org/10.3389/fpsyg.2016.01235


Dustin A. Chacón ^1,2^*
Mashrur Imtiaz ^3^
Shirsho Dasgupta ^4^
Sikder M. Murshed ^3^
Mina Dan ^4^
Colin Phillips ^1^

1. Department of Linguistics, University of Maryland, College Park, MD, USA
2. Department of Linguistics, University of Minnesota, Minneapolis, MN, USA
3. Department of Linguistics, University of Dhaka, Dhaka, Bangladesh
4. Department of Linguistics, University of Calcutta, Kolkata, India

Abstract

Research on filler-gap dependencies has revealed that there are constraints on possible gap sites, and that real-time sentence processing is sensitive to these constraints. This work has shown that comprehenders have preferences for potential gap sites, and immediately detect when these preferences are not met. However, neither the mechanisms that select preferred gap sites nor the mechanisms used to detect whether these preferences are met are well-understood. In this paper, we report on three experiments in Bangla, a language in which gaps may occur in either a pre-verbal embedded clause or a post-verbal embedded clause. This word order variation allows us to manipulate whether the first gap linearly available is contained in the same clause as the filler, which allows us to dissociate structural locality from linear locality. In Experiment 1, an untimed ambiguity resolution task, we found a global bias to resolve a filler-gap dependency with the first gap linearly available, regardless of structural hierarchy. In Experiments 2 and 3, which use the filled-gap paradigm, we found sensitivity to disruption only when the blocked gap site is both structurally and linearly local, i.e., the filler and the gap site are contained in the same clause. This suggests that comprehenders may not show sensitivity to the disruption of all preferred gap resolutions.

Introduction

The formation of linguistic dependencies is subject to a wide variety of constraints. Some constraints are conditions on grammatical well-formedness, whereas others define the interpretations that are preferred in real-time sentence processing. Locality constraints on filler-gap dependencies are one particularly well-studied example of both constraint types. Some locality constraints distinguish acceptable filler-gap dependencies from unacceptable filler-gap dependencies, as long recognized by syntacticians (Ross, 1967; Huang, 1982; Rizzi, 1982, 1990, 2013; Chomsky, 1986; Rudin, 1988; Lasnik and Saito, 1992; Manzini, 1992; Szabolcsi and den Dikken, 1999; Boeckx, 2008). For instance, the filler-gap dependency in (1a) between who and the position in which it is interpreted (marked as ___) is judged acceptable, in contrast with the sentence in (1b). This is because filler-gap dependencies may not cross into clauses (marked S′) in the subject position of another clause (this violates the sentential subject constraint and the complex noun phrase constraint, Ross, 1967). Constraints on acceptable filler-gap dependencies are called island constraints.

(1) a. I know who it surprised Dale [_S′ that Sarah saw ___].
b. *I know who [_S′ that Sarah saw ___] surprised Dale.

Other locality constraints determine which gap sites are preferred when multiple possibilities are available. In on-line tasks, this manifests as a preference for early resolution, a process called active dependency formation (Fodor, 1978; Crain and Fodor, 1985; Stowe, 1986; Frazier, 1987; Frazier and Flores d'Arcais, 1989). For instance, Stowe (1986) observed longer reading times at the direct object us in (2a) compared to the control sentence in (2b), which lacks a filler-gap dependency. This increase in reading times, called the filled-gap effect, suggests that readers make an early commitment to resolve who as the direct object of bring before it is clear whether there is a direct object gap. Encountering the direct object pronoun us then triggers a reanalysis process, leading to an increase in processing difficulty.

(2) a. My brother wanted to know who Ruth would bring us home to ___ at Christmas.
b. My brother wanted to know if Ruth would bring us home to somebody at Christmas.

There has been much interest in determining whether these two types of constraints are the same, following from some independently motivated restrictions on linguistic processes, e.g., restrictions on memory capacity (Deane, 1991; Pritchett, 1992; Kluender and Kutas, 1993; Kluender, 1998, 2004; Hofmeister and Sag, 2010; for discussion see Phillips, 2013). Explaining island phenomena as a consequence of resource limitation has the potential to radically simplify grammatical theories.

If island constraints are indeed reducible to constraints on preferred gap sites, then both sets of constraints should be sensitive to the same properties of the linguistic representation being computed. In other words, the notion of “local” that is relevant should be the same. It is relatively uncontroversial that island constraints are defined in terms of formal linguistic structure, either hierarchical syntactic relations (Ross, 1967; Chomsky, 1981, 1986; Huang, 1982; Rizzi, 1990; Lasnik and Saito, 1992; for review, see Rizzi, 2013), or semantic/pragmatic relations (Erteschik-Shir, 1973; Kuno, 1976; Szabolcsi and Zwarts, 1993; Truswell, 2007; Ambridge and Goldberg, 2008; Abrusán, 2011a,b). However, it is unclear what notion of locality is relevant for determining preferred gap sites. For instance, the direct object position of bring in (2a) may be preferred, because fewer nodes separate this gap site from the filler compared to other potential forthcoming gap sites, i.e., there is an additional PP node separating the filler and prepositional object gap site, illustrated in (3). To construct the direct object gap, the comprehender needs to postulate a less articulated structure (a verb phrase and an object position) than in alternative analyses (a verb phrase, plus dependents on this verb phrase, such as a prepositional phrase, and an object position). Alternatively, the direct object position may be preferred because it is the first position that is linearly available. That is, the locality constraints on preferred gap sites may be defined in terms of structural locality or linear locality. If the constraints on preferred gap sites are sensitive to linear locality, then this motivates maintaining a distinction between island constraints and locality constraints on preferred gap sites.

(3) My brother wanted to know [_S′ who [_S Ruth would [_VP bring us home [_PP to ___] at Christmas]]]

Most research on filler-gap dependency processing cannot decide among these hypotheses, because most studies are conducted on languages like English, where structural and linear locality converge, as illustrated above. However, previous work on Japanese, a language with different word order properties than English, suggests that these constraints are dissociated (Aoshima et al., 2004; Yoshida, 2006; Omaki et al., 2013). This is discussed in more detail in Section Locality in Filler-Gap Dependencies.

In this paper, we report on three experiments in Bangla (Bengali) that further investigate locality constraints on preferred gap sites. Bangla is a valuable language for this purpose, because embedded clauses may either precede or follow the embedding verb, as shown in (4) and (5). Additionally, Bangla allows filler-gap dependencies with wh-phrases. These filler-gap dependencies may resolve in either the main clause, or an embedded clause on either side of the main verb, as shown in (6) and (7). This allows us to manipulate whether the first gap site is structurally local or distant within a single language. We can therefore make a within-language comparison of the influence of word order on filler-gap dependency processing, a comparison that has previously been made only across languages (Omaki et al., 2013).

(4) raj bollo [_S′ še ašbe ]
Raj said he come.fut
“Raj said that he will come.”

(5) raj [_S′ še ašbe ] bollo
Raj he come.fut said
“Raj said that he will come.”

(6) raj kɔkhon ___ bollo [_S′ še ___ ašbe ]
Raj when said he come.fut
“When did Raj say ___ that he will come ___ ?”

(7) raj kɔkhon [_S′ še ___ ašbe ] ___ bollo
Raj when he come.fut said
“When did Raj say ___ that he will come ___ ?”

Experiment 1 was a within-language replication of the cross-language findings from Omaki et al. (2013). In Experiment 1, we investigated how ambiguous filler-gap dependencies like (6) and (7) are resolved using an off-line ambiguity resolution task. This task allows us to probe for preferences directly, instead of relying on an indirect measure, such as increased reading times indicating detection of an unexpected parse. We found that filler-gap dependencies are resolved with the first position linearly available across word orders. In main verb first word orders as in (6), the filler-gap dependency was resolved with the main verb. In embedded verb first word orders like (7), it was resolved with the embedded verb.

In Experiment 2, we investigated the preference for linearly local gap sites in an on-line, filled-gap paradigm task. This task provides a more standard measure of disruption in moment-by-moment sentence comprehension, and thus it can be used to determine the time course of active dependency formation. Like Experiment 1, we leveraged the flexible word order of Bangla to manipulate whether the first available gap site was in the same clause as the filler or in an embedded clause. We found a filled-gap effect when resolution with the first gap site was blocked in main verb first word orders like (6), where structural locality and linear locality aligned, but not in embedded verb first word orders like (7). In other words, there was only detection of a blocked filler-gap resolution when the gap site was both structurally local and linearly local, but not when this position was structurally distant. The comparison between Experiments 1 and 2 suggests a contrast between gap site preferences and sensitivity to disruption.

The apparent mismatch in Experiments 1 and 2 may be due to the on-line/off-line contrast between the two experiments, or to the ambiguity resolution/filled-gap paradigm difference. In Experiment 3, we diagnosed the cause of this mismatch. Experiment 3 was an off-line acceptability judgment task, like Experiment 1, that used the filled-gap paradigm, like Experiment 2. We again only found evidence that comprehenders detected a filled-gap when the filler-gap dependency was blocked from resolving with a structurally local and linearly local position, as in Experiment 2. This suggests that the contrast between locality preferences and sensitivity to disruption for embedded verb first word orders in Experiments 1 and 2 was not due to the off-line/on-line contrast, but rather the specific mechanisms underlying filled-gap detection.

Locality in filler-gap dependencies

There is substantial evidence that shorter filler-gap dependencies are preferred to longer filler-gap dependencies. For instance, Frazier and Clifton (1989) found that reading times were increased for sentences containing filler-gap dependencies spanning multiple clauses compared to controls (see also Kluender and Kutas, 1993; Dickey, 1996; Kluender, 1998). This bias against longer filler-gap dependencies is also reflected in offline acceptability judgments, where sentences containing filler-gap dependencies spanning multiple clauses are rated lower than sentences with shorter filler-gap dependencies (Phillips et al., 2005; Alexopoulou and Keller, 2007; Sprouse et al., 2012).

Online studies show that the preference for shorter filler-gap dependencies manifests as a preference for early resolution. For instance, the filled-gap effect discussed in Section Introduction demonstrates that blocking an early filler-gap dependency resolution triggers a costly reanalysis process (Crain and Fodor, 1985; Stowe, 1986; Lee, 2004). Converging evidence comes from the plausibility mismatch paradigm (Garnsey et al., 1989; Traxler and Pickering, 1996). For instance, in a series of eye-tracking experiments, Traxler and Pickering (1996) observed that gaze times increased on the verb wrote in (8b) compared to (8a).

(8) a. We like the book that the author wrote unceasingly and with great dedication about ___ while waiting for a contract.
b. We like the city that the author wrote unceasingly and with great dedication about ___ while waiting for a contract.

This suggests that the city was first interpreted as the object of wrote. Comprehenders could then detect that the early gap commitment yields an implausible interpretation. Then, they rejected this commitment, and searched for a different gap, yielding a reanalysis cost. Thus, as we argued for the filled-gap effect, the plausibility mismatch effect illustrates not only early commitment to a local gap, but also sensitivity to disruption when this position is unavailable. Other converging evidence for active dependency formation comes from EEG studies (Garnsey et al., 1989; Kaan et al., 2000; Phillips et al., 2005), the “stops making sense” task (Tanenhaus et al., 1985; Boland et al., 1995), cross-modal lexical priming (Nicol and Swinney, 1989; Nicol et al., 1994), and “visual world” eye-tracking (Sussman and Sedivy, 2003).

This bias toward early filler-gap dependency resolution in real-time behavior and toward shorter dependencies in offline judgments is commonly attributed to resource limitations. For instance, unintegrated fillers may require memory resources to be actively maintained (Jackendoff and Culicover, 1971; Wanner and Maratsos, 1978). Alternatively, longer dependencies in general may be more costly, leading to a dispreference for longer filler-gap dependencies (Gibson, 1998; Hawkins, 2004). Other analyses contend that longer filler-gap dependencies may cause increased processing difficulty because the filler must be retrieved from memory at the gap site, which may be costly and error-prone in the case of longer dependencies (McElree, 2006; Wagers and Phillips, 2014). Lastly, more local gaps may be preferred because comprehenders attempt to resolve as many grammatical requirements as early as possible (Pritchett, 1992; Weinberg, 1992; Altmann and Kamide, 1999; Sedivy et al., 1999; Aoshima et al., 2004; Wagers and Phillips, 2009). These accounts all imply that the comprehender should minimize filler-gap dependency length in order to optimize resource usage. However, these accounts make no commitment as to whether linear locality or structural locality is relevant in selecting preferred gap sites.

Island constraints, in contrast, are typically described in structural terms. Island constraints are restrictions on possible filler-gap dependencies, with several illustrated in (9).

(9) a. Relative Clause Island:
*Who did Dale comfort [_NP the woman that [_S saw ___ ?]]
b. Whether Island:
*Who did Dale wonder [whether Bob frightened ___ ?]
c. Wh-Island:
*Who did Dale say [who saw ___ behind Laura's bed?]
d. Subject Island:
*Who did [the fact that Sarah saw ___] surprise Dale?
e. Adjunct Island:
*Who did Dale ruminate [while Harry interrogated ___ ?]
f. Coordinate Structure Constraint:
*Who did [Dale suspect ___ and Harry interrogate Leland?]
g. Factive Island:
*Why did Dale remember [that Ben was suspicious ___?]

Island constraints have long been studied in theoretical linguistics, where they typically are characterized as constraints on well-formed linguistic representations, either as formal syntactic constraints (Ross, 1967; Chomsky, 1977, 1981, 1986; Huang, 1982; Rizzi, 1990, 2013; Lasnik and Saito, 1992), or as constraints on well-formed and felicitous semantic/pragmatic forms (Erteschik-Shir, 1973; Kuno, 1976; Szabolcsi and Zwarts, 1993; Truswell, 2007; Ambridge and Goldberg, 2008; Abrusán, 2011a,b). As such, island constraints are typically defined over the hierarchical structure of the sentence, or the formal relations between the words and phrases. This can be demonstrated with pairs like (10), repeated from (1), in which the filler-gap dependency that spans fewer words is dispreferred to a filler-gap dependency that spans more words. This contrast can be characterized as a formal constraint against gaps in subject clauses, but not extraposed clauses (Ross, 1967).

(10) a. I know who it surprised Dale [_S′ that Sarah saw ___].
b. *I know who [_S′ that Sarah saw ___] surprised Dale.

Island constraints are observed to be robust in both off-line and on-line measures. Off-line acceptability judgments show that speakers give low ratings to sentences with island violations (Sobin, 1987; Cowart, 1996, 2003; Alexopoulou and Keller, 2007; Heestand et al., 2011; Sprouse et al., 2012). Additionally, the effects of active dependency formation typically disappear in island constructions. There are no filled-gap effects or plausibility mismatch effects inside island contexts (Stowe, 1986; Bourdages, 1992; Traxler and Pickering, 1996). Similarly, results from EEG studies (Neville et al., 1991; Kluender and Kutas, 1993; McKinnon and Osterhout, 1996) and speed-accuracy tradeoff studies (McElree and Griffith, 1998) suggest that comprehenders immediately detect island boundaries. The rapid application of island constraints can be explained in theories of sentence processing that posit rapid and faithful use of grammatical constraints (e.g., Lewis and Phillips, 2015) or theories that posit that representations with gap sites inside island contexts are too costly to represent (Gibson, 1998; Hawkins, 2004).

Some data suggests the constraints on preferred gaps should be dissociated from island constraints (Phillips, 2006; Wagers and Phillips, 2009; Sprouse et al., 2012; Yoshida et al., 2014). Other findings imply that constraints on preferred gap sites are defined in terms of linear locality, unlike island constraints which are defined in terms of structural locality. These findings come from Japanese, a language in which embedded clauses precede the main verb, meaning that in multi-clause sentences structural positions that are linearly closer may be structurally more distant. This makes it possible to dissociate structural locality and linear locality. Japanese speakers prefer to resolve filler-gap dependencies in embedded clauses, likely because this is the first position linearly available. For instance, Aoshima et al. (2004) found filled-gap effects for sentences like (11), in which the fronted dative phrase dono-syain-ni “which employee-dat” was blocked from resolving with the embedded clause because of the case-matched noun phrase kacyoo-ni “assistant manager-dat” (see also Yoshida, 2006). Similarly, Omaki et al. (2013) showed that speakers of Japanese interpreted an ambiguously fronted wh-phrase, as in (12), with the embedded clause in a Question after Story task, a task that provides an untimed measure of how speakers prefer to interpret ambiguous questions (de Villiers et al., 1990). This shows that in off-line measures of gap location preferences and on-line measures of filled-gap detection, Japanese speakers prefer a linearly local resolution.

(11) Dono-syain-ni senmu-wa
which employee-dat managing director-top
[syacyoo-ga kaigi-de
president-nom meeting-at
kacyoo-ni syookyuu-o yakusoku-sita-to]
assistant manager-dat raise-acc promised-declc
iimasita-ka?
told-q?
“Which employee did the managing director tell ___ that the president promised a raise to the assistant manager at the meeting?”

(12) Doko-de Yukiko-chan-wa [choucho-o
where-at Yukiko-dim-top butterfly-acc
tsukumaeru-to] itteta-no?
catch-declc was telling-q?
“Where did Yukiko say that she will catch butterflies?”

In this paper, we further investigate this generalization in Bangla, a language with variable word order that permits us to manipulate whether the most linearly local potential gap site is within the same clause as the fronted filler (i.e., structurally local), or in an embedded clause (i.e., structurally non-local). In Section Grammatical Properties of Bangla, we describe the relevant properties of Bangla syntax. In Sections Experiment 1–General Discussion we describe the results of three experiments on Bangla filler-gap dependency processing.

Grammatical properties of Bangla

Bangla is a language spoken primarily in Bangladesh and the eastern Indian state of West Bengal, with approximately 180 million speakers worldwide (Lewis et al., 2015). Bangla is in the Eastern Zone of the Indo-Aryan branch of the Indo-European language family. Due to its contact with multiple linguistic areas, Bangla has many properties typical of northern Indo-Aryan, Dravidian, and Southeast Asian languages. For more complete descriptions of the language, see Thompson (2010) and David (2015).

Embedded clauses in Bangla may either precede or follow an embedding verb, shown in (13). Post-verbal embedded clauses may be introduced with the complementizer je, shown in (14a). Pre-verbal embedded clauses may appear with the complementizer bole at the end of the clause, shown in (14b), or with je in a clause-internal position, shown in (14c). Dasgupta (2007) describes the clause-internal je as an “anchor,” which may be a distinct lexical category. Examples are taken from Bayer (1996).

(13) a. še bollo ora ašbe
he said they come.fut
b. še ora ašbe bollo
he they come.fut said
“He said that they will come.”

(14) a. chele-ṭa bollo [_S′ je tar baba ašbe ]
boy-cl said that his father come.fut
b. chele-ṭa [_S′ tar baba ašbe bole ] bollo
boy-cl his father come.fut that said
c. chele-ṭa [_S′ tar baba je ašbe ] bollo
boy-cl his father that come.fut said
“The boy said that his father will come.”

These constructions are used in similar contexts, although there are subtle syntactic and semantic differences that we leave aside (for discussion see Bal, 1990 on related constructions in Oriya, and Bayer, 1996, 1999, 2001; Simpson and Bhattacharya, 2000, 2003).

Case-marking is often an important cue in detecting clause boundaries in head-final languages. For example, Japanese speakers use nominative-marked noun phrases to detect the beginning of embedded clauses (Miyamoto, 2002). We assume that Bangla speakers do the same, although we have not directly tested this. Bangla has four cases—nominative, accusative, genitive, and oblique. The first three cases are clearly marked in the pronoun system, e.g., še “3sg.nom,” take “3sg.acc,” and tar “3sg.gen.” Thus, in (13b), the comprehender can detect the embedded clause, because ora “3pl.nom” is a clearly nominative-marked pronoun, as is še “3sg.nom.” For other noun phrases, nominative case is left unmarked, and the accusative case morpheme (-ke) is reserved for animate objects or specific inanimate objects. In (14b–14c), a comprehender can detect the embedded clause at baba, “father.” This is because baba “father” is an animate noun that is not marked with an overt accusative, genitive, or oblique morpheme. Thus, it must be nominative. Given that there was a previous nominative noun phrase (chele-ṭa “the boy”), the comprehender should postulate an embedded clause here, as well.
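The boundary-detection reasoning above can be sketched in code. This is a minimal illustration only, not a parsing model proposed or tested in this paper. It assumes each word already carries a case label; the tag names, the function name `detect_embedded_clause`, and the hand-tagged token lists are all hypothetical. The sketch flags the second nominative noun phrase as the point at which an embedded clause can be postulated.

```python
def detect_embedded_clause(tokens):
    """Return the index of the second nominative NP, or None.

    tokens: list of (word, tag) pairs. The tags are hypothetical labels
    supplied by hand; unmarked animate nouns are tagged "nom" here,
    following the reasoning in the text.
    """
    seen_nominative = False
    for index, (_word, tag) in enumerate(tokens):
        if tag == "nom":
            if seen_nominative:
                # A second nominative NP cannot belong to the same clause,
                # so an embedded clause is postulated at this point.
                return index
            seen_nominative = True
    return None


# Example (13b): še ora ašbe bollo
example_13b = [("še", "nom"), ("ora", "nom"), ("ašbe", "verb"), ("bollo", "verb")]

# Example (14b): chele-ṭa tar baba ašbe bole bollo
example_14b = [
    ("chele-ṭa", "nom"),
    ("tar", "gen"),
    ("baba", "nom"),
    ("ašbe", "verb"),
    ("bole", "comp"),
    ("bollo", "verb"),
]

boundary_13b = detect_embedded_clause(example_13b)  # index of ora
boundary_14b = detect_embedded_clause(example_14b)  # index of baba
```

The sketch deliberately ignores everything except case, so it marks the point of detection (ora, baba) rather than the true left edge of the embedded clause, which in (14b) is the genitive tar.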

Like English and Japanese, Bangla also permits unbounded filler-gap dependencies. Gaps may occur in either pre-verbal or post-verbal embedded clauses. Extraction from a post-verbal clause is shown in (15), adapted from Simpson and Bhattacharya (2003). In (15a), the noun phrase hæmleṭ “Hamlet” is interpreted as the direct object of the verb poṛeche “read.” In (15b) and (15c), hæmleṭ “Hamlet” appears either one or two clauses away from the embedded clause, but is still interpreted as the direct object of poṛeche “read.” The filler may appear either after the subject or before the subject, as in (15d).

(15) a. jɔn bhablo [_S′ meri bollo [_S′ su hæmleṭ poṛeche ]]
John thought Mary said Sue Hamlet read
b. jɔn bhablo [_S′ meri hæmleṭ bollo [_S′ su ___ poṛeche ]]
John thought Mary Hamlet said Sue read
c. jɔn hæmleṭ bhablo [_S′ meri bollo [_S′ su ___ poṛeche ]]
John Hamlet thought Mary said Sue read
“John thought that Mary said that Sue has read Hamlet.”
d. hæmleṭ jɔn bhablo [_S′ meri bollo [_S′ su ___ poṛeche ]]
Hamlet John thought Mary said Sue read
“John thought that Mary said that Sue has read Hamlet.”

Extraction from pre-verbal clauses is shown in (16). In (16a), the noun phrase tomar beṛal-ke “your cat-acc” is interpreted as the object of the embedded verb kamṛeche “bit,” but it appears in the left edge position of the main clause. Similarly, in (16b), the prepositional phrase bas theke “bus from” appears in the left edge position of the main clause, but is interpreted as a modifier of the embedded clause. This contrasts with other languages that have both pre-verbal and post-verbal clauses, such as Basque, which disallows gap sites in pre-verbal clauses (Uriagereka, 1992), and Malayalam, which allows direct object gaps in pre-verbal clauses but not gaps for adjunct phrases like bas theke “bus from” (Srikumar, 2007). The filler may again either appear before the subject or after the subject, as in (16c).

(16) a. tomar beṛal-ke amra šɔbai [_S′ paš-er baṛi-r kukur ___ kamṛeche bole ] šunechilam
your cat-acc we everyone neighbor-gen dog bit that heard
“We had all heard that the neighbor’s dog has bitten your cat.”
b. bas theke amar didi [_S′ ɔtogulo duronto bacca laphiye nambe bole ] bhabe ni
bus from my sister so many uncontrollable child jumping descend.fut that think pst.neg
“My sister hasn’t thought that so many children could jump down from a bus.”
c. amar didi bas theke [_S′ ɔtogulo duronto bacca laphiye nambe bole ] bhabe ni
my sister bus from so many uncontrollable child jumping descend.fut that think pst.neg
“My sister hasn’t thought that so many children could jump down from a bus.”

To summarize, Bangla permits embedded clauses to precede or follow the embedding verb. Additionally, fillers in the main clause may resolve with gap sites in the main clause or in an embedded clause on either side of the embedding verb. This means the schematic representations in (17) are all permissible, making Bangla an excellent language for testing locality biases.

(17) a. Post-verbal embedded clause, main clause resolution:
…filler …___ …V …[_S′…] …
b. Post-verbal embedded clause, embedded clause resolution:
…filler …V …[_S′…___ …] …
c. Pre-verbal embedded clause, embedded clause resolution:
…filler …[_S′…___ …] …V …
d. Pre-verbal embedded clause, main clause resolution:
…filler …[_S′…] …___ …V …

If the locality constraints on preferred gap sites are sensitive to linear order, as suggested by findings in Japanese, then the dependencies schematized in (17a) and (17c) should be preferred to those in (17b) and (17d). However, if locality constraints on preferred gap sites are sensitive to structural locality, then the representations in (17a) and (17d) should be preferred, since in these the gap site is in the same clause as the filler and is therefore structurally more local to it. We test these predictions in Experiments 1–3.
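The two sets of predictions can be made explicit with a small sketch. This is an illustrative encoding of the schemata in (17) under stated assumptions, not an analysis tool used in the study. The class name `Schema`, the string labels, and the helper functions are hypothetical. Linear locality is approximated as "the gap is in whichever clause offers the first gap position after the filler," and structural locality as "the gap is in the main clause, i.e., the same clause as the filler."

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Schema:
    label: str         # example number in (17)
    clause_order: str  # "post-verbal" or "pre-verbal" embedded clause
    gap_clause: str    # "main" or "embedded"


SCHEMATA = [
    Schema("17a", "post-verbal", "main"),
    Schema("17b", "post-verbal", "embedded"),
    Schema("17c", "pre-verbal", "embedded"),
    Schema("17d", "pre-verbal", "main"),
]


def first_linear_gap_clause(clause_order):
    # With a post-verbal embedded clause, the main-clause gap position comes
    # first; with a pre-verbal embedded clause, the embedded gap comes first.
    return "main" if clause_order == "post-verbal" else "embedded"


def linearly_local(schema):
    return schema.gap_clause == first_linear_gap_clause(schema.clause_order)


def structurally_local(schema):
    # The filler sits in the main clause, so a main-clause gap is clause-mate.
    return schema.gap_clause == "main"


linear_preferred = [s.label for s in SCHEMATA if linearly_local(s)]
structural_preferred = [s.label for s in SCHEMATA if structurally_local(s)]

print(linear_preferred)      # ['17a', '17c']
print(structural_preferred)  # ['17a', '17d']
```

The two hypotheses agree on (17a) and diverge only for pre-verbal embedded clauses, which is why the embedded verb first word order is the critical test case.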

Experiment 1

Rationale

In Experiment 1, we used the Question after Story task (de Villiers et al., 1990) to determine whether Bangla speakers prefer linearly local gap sites across word orders. We adapted the design used by Omaki et al. (2013), which probed for word order effects on filler-gap dependency resolution using a between-language comparison. In their study, participants viewed a series of vignettes in which a character acted out an event in one location and reported on it in another location. Afterwards, participants were asked to respond to a question that contained a fronted wh-filler that could resolve in either the embedded clause or main clause. Participants' responses revealed in which clause they preferred to resolve the filler-gap dependency. In English, a language that conflates linear and structural locality, the ambiguous filler-gap dependency was most commonly resolved with the main clause in Omaki and colleagues' studies. Conversely, in Japanese, the filler-gap dependency was preferentially resolved in the embedded clause. They took this as evidence for a universal preference to resolve filler-gap dependencies with the first position linearly available.

Our study took advantage of the flexible word order in Bangla to further test this hypothesis. The study had two main conditions: a main verb first condition, shown in (18a), and an embedded verb first condition, shown in (18b). For both sentences, the fronted wh-filler kothae “where” could be resolved in the embedded clause, modifying the catching event, or the main clause, modifying the telling event. If gaps are preferentially constructed in the first position linearly available, as suggested by Omaki and colleagues' cross-language contrast, then we expected kothae “where” to be resolved with the main verb in word orders like (18a), and with the embedded verb in word orders like (18b).

(18) a. Main Verb First Condition:
šumi kothae ækjɔn-ke boleche [_S′ je še prɔjapoti dhorbe]?
Shumi where someone-acc told that she butterfly catch.fut
b. Embedded Verb First Condition:
šumi kothae [_S′ še prɔjapoti dhorbe bole] ækjɔn-ke boleche?
Shumi where she butterfly catch.fut that someone-acc told
“Where did Shumi tell someone that she will catch butterflies?”

Participants

Ninety-six participants were recruited for Experiment 1. Forty-eight adult native speakers of Bangla were recruited from the student population at the University of Dhaka in Dhaka, Bangladesh, and 48 were recruited from the student population at the University of Calcutta in Kolkata, India. Bangladeshi participants were compensated 500 Bangladeshi Taka (BDT), and Indian participants were compensated 200 Indian Rupees (INR). The session took approximately 15 min. Participants completed Experiment 1 after either Experiment 2 or another experiment unrelated to the current study. These populations were each split into two groups, a “within-subjects” and a “between-subjects” group, as discussed in Section Materials. We tested participants in both India and Bangladesh to probe for any potential influence of dialect difference, especially given that Indian Bangla speakers are likely to be competent in Hindi, which uses different wh-scope marking strategies (e.g., Dayal, 1996; Manetta, 2012). Additionally, we included a within-subjects and between-subjects manipulation to check for any effect of self-priming in the experiment. This was important for comparing our within-language findings to results from previous between-language comparisons, where participants in each language, e.g., Japanese and English, saw only one of the word orders tested in Bangla.

Materials

The materials were adapted from Omaki et al. (2013). The stories and audio were translated by three of the authors into standard colloquial Dhakaiya Bangla. Some lexical material was changed to better suit the different cultural context, including names. The questions were presented on a paper questionnaire. Participants were instructed to respond to a question printed on the questionnaire immediately after each vignette, before moving on to the next vignette. Across all questionnaires, we rigidly alternated between a target item and a filler item, in order to reduce priming or perseveration effects. The target items were two-clause sentences with an ambiguous wh-dependency, presented in (18). The fillers were one-clause sentences with an unambiguous kæno “why” question.

Participants were split into two groups: the “between participants” group and the “within participants” group. The “between participants” group was included to make a closer comparison to the existing literature comparing English and Japanese. The division of participants is illustrated in Table 1. Questionnaires were prepared for each group. For the “between participants” questionnaires, the target items all had either main verb first word orders or embedded verb first word orders, i.e., participants saw 4 target items in one of the two conditions. The remaining participants received a “within participants” questionnaire, in which the target items contained both main verb first and embedded verb first word orders, i.e., 2 target items per condition. In the within participants questionnaire, the two conditions alternated, such that there were two questions of each word order in each questionnaire.
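The questionnaire layout described above can be sketched in Python. This is a minimal illustration, not the authors' materials: the function name `build_questionnaire`, the condition labels, and the choice to open each questionnaire with a target item are assumptions. The source states only that targets and fillers alternated rigidly and that the two word orders alternated in the within participants lists.

```python
MAIN_FIRST = "main verb first"
EMBEDDED_FIRST = "embedded verb first"


def build_questionnaire(list_type, first_order=MAIN_FIRST, n_targets=4):
    """Return an ordered list of (kind, condition) pairs for one questionnaire.

    Assumption: each questionnaire opens with a target item; the source only
    says that target and filler items alternate rigidly.
    """
    other = EMBEDDED_FIRST if first_order == MAIN_FIRST else MAIN_FIRST
    if list_type == "between":
        # Every target item appears in the same word order.
        orders = [first_order] * n_targets
    elif list_type == "within":
        # The two word orders alternate across successive target items.
        orders = [first_order if i % 2 == 0 else other for i in range(n_targets)]
    else:
        raise ValueError(f"unknown list type: {list_type!r}")

    items = []
    for order in orders:
        items.append(("target", order))
        items.append(("filler", "why question"))
    return items


within_list = build_questionnaire("within")
between_list = build_questionnaire("between", first_order=EMBEDDED_FIRST)
```

Each call yields 8 items, matching the 8 vignettes per session: 4 targets interleaved with 4 fillers, with 2 targets per word order in the within participants case.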

Table 1

Total: 96
Dhaka: 48			Kolkata: 48
Within participants: 24	Between participants: 24		Within participants: 24	Between participants: 24
	Main verb first: 12	Embedded verb first: 12		Main verb first: 12	Embedded verb first: 12

Distribution of participants in Experiment 1.

There were 96 participants in Experiment 1, 48 for each city, Dhaka and Kolkata. Each city was split into two groups. One group of 24 in each city, the within-participants group, saw both conditions in the same questionnaire. Another group of 24 in each city was further divided into two groups of 12, one seeing only lists with main verb first word orders and the other seeing only lists with embedded verb first word orders.

The stories were animated vignettes made from clipart images. In each vignette, a character went to four different locations, and performed an action in each. A sample story from the English study in Omaki et al. (2013) is presented in (19). The videos are included as Supplementary Material.

(19) Sample story:
- [Introduction]
  It was a beautiful day in spring so Lizzie decided she was going to go catch butterflies in the park.
  [1st Location]
  Her Mom and Dad weren't home, so Lizzie thought she should tell her brother or sister about going to the park, so that Mom and Dad would know where she was when they got back. She first went to her brother's room, but he was taking a nap and she couldn't tell him about catching butterflies.
  [2nd Location]
  Instead, Lizzie looked for her sister. She looked all over the house but didn't see her sister anywhere! When she was about to give up, Lizzie heard her sister's voice in the basement! She went to the basement and said to her sister: “I'm gonna catch butterflies in the park!”
  [3rd Location]
  Then, on her way to the park, Lizzie passed by a parking lot and saw a butterfly near it. She walked slowly toward the butterfly, but before Lizzie could get there, another girl came along and caught the butterfly! Lizzie didn't see any more butterflies there, so she kept walking toward the park.
  [4th Location]
  There were lots and lots of butterflies in the park, and she caught one in a jar and took it home with her. She liked the one that she caught, but she wished she could have caught more butterflies.

Each vignette consisted of six phases. The first phase introduced the protagonist, displayed in the center of the screen. The following four phases depicted him or her at each of the four locations. At each location, the protagonist either succeeded or failed at performing the intended action announced in the introductory phase, or succeeded or failed at reporting on it. The contrast between successes and failures was intended to make the event-location pairings more memorable, and to ensure that the “where” test questions were felicitous. In locations where the protagonist succeeded in performing his or her stated action or in reporting on it, a visual trace was left behind (e.g., a butterfly in a bottle, or a word balloon). The first two and last two locations were relevant for either the main clause event (i.e., the reporting event), or the embedded clause event (i.e., the intended action). In the sixth and final phase, the protagonist returned to the center of the screen, and then the story concluded. A sample image from the vignette is given in Figure 1.

Figure 1

To avoid any potential recency bias, the ordering of the events within each story was counterbalanced, such that the first pair of events pertained to the reporting event in half of the stories, and to the embedded clause event in the other half of the stories. In each case, the story provided motivation for continuing to the next series of events. For instance, in (19), the reporting events are motivated by the character's need to tell her siblings where she was going. The pairings of quadrant position and event were randomized across stories so that participants could not predict which locations would correspond to which actions.

Methods

Experiment 1 was an adaptation of Omaki et al.'s (2013) question-after-story task (de Villiers et al., 1990). Participants were instructed in Bangla to watch a sequence of 8 vignettes. At the end of each vignette, the screen displayed “write your answer now” in Bangla. At this point, the experimenter paused the video and instructed the participant to read a question printed on a paper questionnaire. Participants were instructed to write a brief response. We asked that the responses be brief because in pilot studies, participants attempted to recapitulate large portions of the story, which complicated coding the results. After the participant responded, the experimenter resumed the video, which progressed to the next vignette.

Results

We coded each response as either a main clause response or an embedded clause response, depending on which location the participant named. Responses that either failed to answer the question or that provided both possible answers were excluded. Most of the excluded responses named both possible locations, implying that Bangla speakers were often aware of the ambiguity. The proportions of excluded observations are given in Table 2.

Table 2

	Dhaka		Kolkata
	Between participants (%)	Within participants (%)	Between participants (%)	Within participants (%)
Main verb first	25	29	31	21
Embedded verb first	15	33	8	23

Proportion of removed responses in Experiment 1.

There were fewer exclusions for the embedded verb first conditions in the between-participants lists than in the other conditions. This is the only list in which participants saw only the canonical, verb-final word order: the fillers across all lists used this word order, and all target items in this list also used the embedded verb first word order. The presence of non-canonical word orders in other lists may have made the ambiguity more salient, leading to a higher number of exclusions. After excluding these observations, participants responded with the main verb location in 77% of the main verb first word orders, but only 23% of the embedded verb first word orders.
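The coding and exclusion steps described above can be sketched as follows. This is a minimal Python illustration under stated assumptions: the function names and the toy answers are hypothetical, and "failed to answer the question" is operationalized here as naming neither of the two target locations, which may be narrower than the criterion actually applied.

```python
def code_response(named_locations, main_location, embedded_location):
    """Code one written answer as 'main', 'embedded', or None (excluded)."""
    named_main = main_location in named_locations
    named_embedded = embedded_location in named_locations
    if named_main == named_embedded:
        # Both target locations were named, or neither was: exclude.
        return None
    return "main" if named_main else "embedded"


def main_clause_proportion(codes):
    """Proportion of 'main' codes among the responses that were kept."""
    kept = [c for c in codes if c is not None]
    if not kept:
        raise ValueError("no codable responses")
    return sum(c == "main" for c in kept) / len(kept)


# Hypothetical answers to the story in (19): Lizzie told her sister in the
# basement (main clause event) about catching butterflies in the park
# (embedded clause event).
answers = [
    {"basement"},
    {"park"},
    {"basement", "park"},  # names both locations: excluded
    {"parking lot"},       # names neither target location: excluded
    {"basement"},
]
codes = [code_response(a, "basement", "park") for a in answers]
proportion = main_clause_proportion(codes)
```

In this toy data set two of the five answers are excluded, and two of the remaining three are main clause responses.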

Using the lme4 package in R (Bates et al., 2015), we submitted the results to a logit mixed effects model with a bobyqa optimizer. The predicted variable was main clause response, coded as 1. For fixed effects, we included word order (main verb first or embedded verb first), city (Dhaka or Kolkata), and list type (within participants or between participants), with their interaction terms. We included these factors in order to fit a maximal model that tested for all potential variables of interest. For random effects, we included participants and items. Afterwards, we used the backward elimination method to eliminate factors from the model one by one to minimize the AIC (Akaike Information Criterion) of the model, as described by Faraway (2002). The results of the best-fit model are given in Table 3. The p-values in Table 3 were generated using the lmerTest package (Kuznetsova et al., 2015). The mean proportion of main verb responses is given in Figure 2.
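For reference, a model with the fixed effects retained in Table 3 can be written out as below. This is a sketch under assumptions, not the authors' exact specification: the source does not state how the predictors were coded or whether the random effects were intercepts only, so the predictor variables and the random intercepts for participant p and item i are illustrative. In the AIC definition, k is the number of estimated parameters and L-hat is the maximized likelihood.

```latex
\begin{aligned}
\operatorname{logit} P(y_{pi} = 1) &= \beta_0 + \beta_1\,\mathrm{WordOrder}_{pi} + \beta_2\,\mathrm{City}_{p} + \beta_3\,\mathrm{ListType}_{p} \\
&\quad + \beta_4\,(\mathrm{City}_{p} \times \mathrm{ListType}_{p}) + u_p + w_i, \\
u_p &\sim \mathcal{N}(0, \sigma_u^2), \qquad w_i \sim \mathcal{N}(0, \sigma_w^2), \\
\mathrm{AIC} &= 2k - 2\ln\hat{L}.
\end{aligned}
```

Backward elimination then compares the AIC of the current model with that of each model obtained by dropping one term, and keeps the candidate with the lowest value.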

Table 3

Fixed effects	Estimate	SE	z	p
(Intercept)	−2.03	0.97	−2.08	0.04*
Word order	5.16	0.95	5.41	<0.001*
City	−1.12	0.92	−1.22	0.22
List type	−0.96	0.91	−1.05	0.29
City × list type	2.62	1.34	1.96	0.05

Results of best-fit logistic regression model for Experiment 1.

P-values lower than 0.05 are marked with an asterisk.
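The z and p columns of Table 3 can be approximately recovered from the estimates and standard errors. The Python sketch below assumes that z is the Wald statistic (estimate divided by standard error) and that p is a two-sided p-value under a standard normal reference distribution; the source says only that the p-values came from lmerTest, so this is an illustration rather than a reproduction of that computation. Because the inputs are the rounded table entries, recomputed values can differ from the table in the last digit.

```python
import math

# (estimate, standard error) pairs copied from Table 3.
TABLE_3 = {
    "(Intercept)": (-2.03, 0.97),
    "Word order": (5.16, 0.95),
    "City": (-1.12, 0.92),
    "List type": (-0.96, 0.91),
    "City x list type": (2.62, 1.34),
}


def wald_z(estimate, se):
    """Wald statistic: the estimate in units of its standard error."""
    return estimate / se


def two_sided_p(z):
    """Two-sided p-value under a standard normal reference distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


recomputed = {
    term: (wald_z(est, se), two_sided_p(wald_z(est, se)))
    for term, (est, se) in TABLE_3.items()
}
```

Under these assumptions only the intercept and word order fall below 0.05, while the recomputed city by list type interaction sits right at that threshold.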

Figure 2

We found a significant effect of word order on the proportion of main clause responses. The effect was as predicted: for the main verb first word order, participants showed a strong bias to answer with main verb locations. With embedded verb first word orders, there was a strong bias to answer with embedded verb locations. There was no significant effect of city, implying that there were no systematic dialect differences detected in Experiment 1. Additionally, there was no significant effect of list type, i.e., participants typically responded with the event denoted by the first verb linearly available regardless of whether they saw lists with only one word order or lists with mixed word order. However, there was a marginal interaction of city and list type, due to an increase in main clause responses for the Kolkata participants in the within-participants list (β = 2.61, SE = 1.33, z = 1.96, p = 0.0504). This suggests that participants from Kolkata may have a main clause preference when exposed to both word orders, although the effect of interest persists even in this population.

For the main verb first word order, participants responded with the location denoted by the main verb in 72% of the trials in the within-participants list, and 81% of the trials in the between-participants list. For the embedded verb first word order, participants responded with the location denoted by the main verb in 28% of the within-participants trials, and 19% of the between-participants trials. Thus, we replicated Omaki and colleagues' cross-language findings in the between-participants group, and showed a robust bias to resolve the filler-gap dependency with the first verb across word orders in the within-participants group as well.

Discussion

In Experiment 1, we showed that Bangla speakers preferentially resolved a filler-gap dependency with the first position linearly available, regardless of whether this position was in the same clause as the filler or in a more deeply embedded clause. This suggests that the locality constraints determining preferred gap sites are primarily sensitive to linear distance, as previously shown in a between-language comparison by Omaki et al. (2013). Importantly, this contrasts with observations about island constraints, which appear to be defined in terms of hierarchical structure.

This within-language demonstration of sensitivity to linear order is also important because it helps keep constant all other grammatical properties between the word order comparisons. The results found by Omaki and colleagues may be due to some other grammatical distinction between English and Japanese apart from word order. For instance, obligatory long-distance wh-dependencies as observed in English have different properties than the optional wh-dependencies observed in Japanese (“scrambling,” Saito, 1985; Mahajan, 1990), which might indirectly bias the filler-gap dependency resolution preferences in these languages. These concerns are less likely to impact the results of Experiment 1, particularly because the effect is robust in the within participant questionnaires. We cannot exclude the possibility that there are subtle formal differences between the pre-verbal and post-verbal filler-gap dependencies. But even if there are such differences, extant accounts of filler-gap dependency processing do not predict that such fine-grained differences should have a large effect on locality biases. We therefore take our findings to lend support to the notion of a general linear locality bias in filler-gap dependency processing.

One potential concern is that the sentences in the embedded verb first condition may have been parsed as unambiguous. Since the question word kothae “where” in (18b) appeared adjacent to the embedded subject, it may have been parsed as having a surface position inside the embedded clause. That is, the filler may have been entirely contained in the embedded clause, requiring an embedded clause interpretation. If so, embedded clause responses would have been the only option. However, we consider this unlikely, since these conditions elicited 23% main verb responses, plus additional (excluded) responses in which participants mentioned both possible answers. We therefore think it unlikely that these sentences were surface unambiguous for our participants.

An advantage of the question-after-story task in Experiment 1 is that it directly probed participants' preferred resolution sites instead of measuring whether they detect an unexpected parse, as in the filled-gap effect. However, the question-after-story task does not reveal the time course of dependency formation. We cannot infer from these data that there is early commitment to the linearly first gap site. For this reason, in Experiment 2, we used a filled-gap paradigm in a self-paced reading task to probe for detection of an unsubstantiated gap expectation across word orders.