Hyper-active gap filling

Omaki, Akira; Lau, Ellen F.; Davidson White, Imogen; Dakan, Myles L.; Apple, Aaron; Phillips, Colin

doi:10.3389/fpsyg.2015.00384

ORIGINAL RESEARCH article

Front. Psychol., 10 April 2015

Sec. Psychology of Language

Volume 6 - 2015 | https://doi.org/10.3389/fpsyg.2015.00384

This article is part of the Research Topic: Encoding and Navigating Linguistic Representations in Memory

Akira Omaki¹*

Ellen F. Lau²

Imogen Davidson White²

Myles L. Dakan²

Aaron Apple¹

Colin Phillips²

¹Department of Cognitive Science, Johns Hopkins University, Baltimore, MD, USA
²Department of Linguistics, University of Maryland, College Park, MD, USA

Much work has demonstrated that speakers of verb-final languages are able to construct rich syntactic representations in advance of verb information. This may reflect general architectural properties of the language processor, or it may only reflect a language-specific adaptation to the demands of verb-finality. The present study addresses this issue by examining whether speakers of a verb-medial language (English) wait to consult verb transitivity information before constructing filler-gap dependencies, where internal arguments are fronted and hence precede the verb. This configuration makes it possible to investigate whether the parser actively makes representational commitments on the gap position before verb transitivity information becomes available. A key prediction of the view that rich pre-verbal structure building is a general architectural property is that speakers of verb-medial languages should predictively construct dependencies in advance of verb transitivity information, and therefore that disruption should be observed when the verb has intransitive subcategorization frames that are incompatible with the predicted structure. In three reading experiments (self-paced and eye-tracking) that manipulated verb transitivity, we found evidence for reading disruption when the verb was intransitive, although no such reading difficulty was observed when the critical verb was embedded inside a syntactic island structure, which blocks filler-gap dependency completion. These results are consistent with the hypothesis that in English, as in verb-final languages, information from preverbal noun phrases is sufficient to trigger active dependency completion without having access to verb transitivity information.

Introduction

A leading goal of sentence processing research is to understand how the parser adapts to a multitude of linguistic differences across languages to enable successful comprehension. In this regard, comparisons of verb-medial and verb-final languages have provided a valuable source of evidence (Mazuka and Lust, 1990; Inoue and Fodor, 1995). The main verb contains rich information such as subcategorization and thematic role information that is critical for constructing structural analyses and interpretations (e.g., Chomsky, 1965; Grimshaw, 1990; Pollard and Sag, 1994; Levin and Rappaport Hovav, 1995). Much experimental evidence shows that the verb is a valuable source of information for parsing (e.g., Ford et al., 1982; Tanenhaus and Carlson, 1989; Boland et al., 1990; MacDonald et al., 1994; Spivey-Knowlton and Sedivy, 1995; Garnsey et al., 1997; Mauner and Koenig, 2000; Traxler et al., 2002; Blodgett and Boland, 2004; Snedeker and Trueswell, 2004). The importance of the information from the verb head has engendered theoretical claims that structure building processes do not even start until the parser encounters the head of a phrase (e.g., verbal head) to be constructed, even in verb-final languages where this would be significantly delayed (Abney, 1989; Pritchett, 1992).

However, subsequent empirical research on verb-final languages like Japanese or German has generated evidence against such head-driven parsing theories in their strongest form, demonstrating that the parser uses various morphological and syntactic cues to incrementally build structures and interpretations in verb-final languages (Bader and Lasser, 1994; Koh, 1997; Clahsen and Featherston, 1999; Kamide and Mitchell, 1999; Konieczny, 2000; Bornkessel et al., 2002; Felser et al., 2003; Kamide et al., 2003; Aoshima et al., 2009; Yoshida, unpublished doctoral dissertation). Thus, although verb information strongly influences parsing decisions when available, speakers of verb-final languages often begin building syntactic and semantic structure in advance of the verb.

These findings raise the question of whether pre-verbal structure building reflects a language-specific adaptation to the processing demands of verb-finality, or rather a property of a general parsing architecture that speakers of all languages use. For example, consider less frequent cases in verb-medial languages where multiple arguments precede the verb. A classic example of this comes from processing of ‘filler-gap’ dependencies as illustrated by the relative clause construction shown in (1), where the object noun phrase (NP) the city (called the filler) is dislocated from the post-verbal thematic position (called the gap¹), and the parser needs to associate the filler and the gap in order to assign a thematic interpretation.

(1) The city that the author visited ____ was named for an explorer.

It has been reported that speakers of verb-final languages complete filler-gap dependencies in advance of verb information, associating the filler with the earliest structural position where a thematic role could be assigned (pre-verbal object gap creation: Nakano et al., 2002; Aoshima et al., 2004). The current study examines whether this may also be the case in a verb-medial language like English, and whether pre-verbal gap creation is a language-general parsing procedure rather than an adaptation specific to verb-final languages. Under this hypothesis, we predict that English speakers should posit a gap irrespective of whether the verb ultimately licenses a direct object gap position, and that signs of reading disruption should be observed in cases where the verb does not accommodate a direct object.

We report the results of three on-line reading experiments in English that tested this prediction by examining the effect of verb transitivity on reading times in filler-gap configurations. The results are consistent with the hypothesis that the parser actively associates the filler with the verb in advance of the verb across languages, regardless of differences in verb positions. These results suggest that the procedure for filler-gap dependency completion may be uniform across languages, and are consistent with the view that the parser predictively constructs rich representations at the earliest possible moment in advance of critical bottom–up evidence.

Background on Active Filler-Gap Dependency Processing

Past research on filler-gap dependency processing has established that the parser postulates a gap before there is sufficient bottom–up evidence to confirm that analysis (Active gap filling: Fodor, 1978; Crain and Fodor, 1985; Stowe, 1986; Frazier and Flores d’Arcais, 1989). For example, Stowe (1986) observed the so-called Filled gap effect in (2), i.e., slower reading times at the direct object position us in the wh-fronting condition (2a) than in a control condition that did not involve wh-fronting (2b). This pattern of reading times suggests that the parser had already posited a gap following the transitive verb, before checking whether the direct object position was occupied.

(2) a. My brother wanted to know who Ruth will bring us home to ____ at Christmas.

b. My brother wanted to know if Ruth will bring us home to Mom at Christmas.

Converging evidence comes from an eye-tracking experiment by Traxler and Pickering (1996), who manipulated the thematic fit between the filler and the potential verb host, as in (3).

(3) We like the city/book that the author wrote unceasingly and with great dedication about _____ while waiting for a contract.

Traxler and Pickering found a plausibility mismatch effect at the critical verb in (3), i.e., the first fixation time at the optionally transitive verb wrote increased when the filler was an implausible object of the verb (i.e., the city), compared to when the filler was a plausible object of the verb (i.e., the book). This suggests that at least as early as the verb position, the parser postulates a gap and analyzes the filler as the object of the verb, even when the filler is a poor semantic fit to that role. In fact, there is ample time course evidence for active object gap creation, using a variety of dependent measures such as reading time and gaze duration measures (Crain and Fodor, 1985; Frazier, 1987; Frazier and Clifton, 1989; de Vincenzi, 1991; Pickering and Traxler, 2001, 2003; Aoshima et al., 2004; Phillips, 2006; Wagers and Phillips, 2009), cross-modal priming (Nicol and Swinney, 1989; Nicol, 1993; Nakano et al., 2002), visual world eye-tracking (Sussman and Sedivy, 2003) as well as event-related potentials (Garnsey et al., 1989; Featherston et al., 2000; Kaan et al., 2000; Felser et al., 2003; Phillips et al., 2005; Gouvea et al., 2010).

The work summarized above may suggest that filler-gap dependency completion is triggered only after the parser gains access to the verb and confirms that the verb is transitive and is able to syntactically accommodate an object. However, evidence that active dependency completion does not depend on verb information has been presented by studies that investigated (i) subject gap creation in English, as well as (ii) object gap creation in verb-final languages. For example, Lee (2004) used sentences like (4) to reveal a filled gap effect in the subject NP position.

(4) a. That is the laboratory which, on two different occasions, Irene used a courier to deliver the samples to ___.

b. That is the laboratory to which, on two different occasions, Irene used a courier to deliver the samples ___.

Here, the content of the wh-filler is manipulated in such a way that the wh-filler can plausibly be a subject (4a) or not (4b). The results showed a longer reading time at the subject NP Irene in (4a) than in (4b), suggesting that the parser had postulated a subject gap before encountering the actual subject NP. Although this interpretation has been challenged (Staub, 2010), it would in any case not be surprising that the parser actively creates a subject gap without having access to verb information, given that a subject is present in any sentence, regardless of verb properties. In this sense, if verb information were to play a role in the parser’s attempt to posit a gap, the critical empirical evidence should come from dependency completion at the object position, where the presence or absence of an object gap relies on properties of the verb.

Evidence for pre-verbal object gap creation has been reported for verb-final languages like Japanese in which the object gap position linearly precedes the verb. For example, Aoshima et al. (2004) examined processing of scrambling sentences in which a dative object NP was dislocated to the sentence initial position, and found a filled gap effect at a pre-verbal dative object position for the first verb phrase (VP) in the sentence (see also Omaki et al., 2014). Using similar sentences, Nakano et al. (2002) reported evidence for an antecedent priming effect for the scrambled NP at a pre-verbal gap position, although the priming effect was only found in the high working memory span group. These data indicate that the parser can in principle complete filler-gap dependencies before accessing verb information.

In verb-medial languages, no such evidence for pre-verbal object gap creation has been reported to date. This may reflect a real difference between languages in processing strategy, and pre-verbal object gap creation in verb-final languages may reflect the parser’s adaptation to the demands of processing these languages. Maintaining a structurally unintegrated filler in memory has been argued to impose a burden on working memory (King and Just, 1991; Gibson, 1998; Gordon et al., 2002; Haarmann and Cameron, 2005). Alternatively, the parser may be architecturally constrained to assign a thematic interpretation to the filler as soon as possible (Pickering and Barry, 1991; Aoshima et al., 2004). On this view, the parser should prioritize integrating the filler into the first grammatically permissible structural position that can potentially receive a thematic role. Given that filler-gap dependencies are potentially unbounded, waiting for the verb before constructing the ultimate object gap position could impose a large processing burden on speakers of verb-final languages.

In verb-medial languages like English, verbs become available relatively early in the sentence, such that the average working memory cost of waiting for the verb would be lower than in verb-final languages. The advantage of waiting for the verb information is that the parser can reduce the likelihood of making risky commitments, because the verb may turn out to be intransitive and disallow an object NP analysis for the filler. In English, therefore, the parser may create an object gap position only after the verb is confirmed to be transitive. This still constitutes active gap filling, in the sense that the ultimate gap position may turn out to be somewhere later than the object position [e.g., after a late-arriving preposition, as in (2) and (3)]. Let us call this a conservative active gap filling mechanism, since the bottom–up subcategorization information from the verb still plays a critical role in the parser’s decision on whether to postulate an object gap or not. This view of active gap filling is rather standard for explaining filler-gap dependency completion in verb-medial languages like English. For example, McElree and Griffith (1998) and McElree et al. (2003) have argued that the dependency completion process is triggered when the parser accesses information from the verb and initiates the retrieval process for the filler that is stored in working memory (see also Pickering and Barry, 1991; Lewis and Vasishth, 2005).

On the other hand, pre-verbal object gap creation in verb-final languages may reflect a language-general property of the processing architecture, although evidence for such mechanisms may simply be more difficult to obtain in verb-medial languages. In the English filler-gap case, for example, in any parser that adopts some form of left-corner strategy (Kimball, 1975; Abney and Johnson, 1991; Resnik, 1992; Shieber and Johnson, 1993; Stabler, 1994; Crocker, 1996; Lewis and Vasishth, 2005; Gibson, unpublished doctoral dissertation), the presence of the subject NP allows the parser to predict the presence of a VP. Given that a VP can contain an object NP position, the parser could project a VP with an object NP slot and assign the filler to this object position before confirming whether the upcoming verb is a transitive verb or not. Let us call this a hyper-active gap filling mechanism, because this involves a riskier predictive structure building process than is standardly assumed for active object gap creation in English. Filler retrieval and structural integration are still integral to the hyper-active gap filling mechanism, but the crucial difference is in what information triggers retrieval and integration, and consequently, at what point in the sentence this process is executed.

It is important to note that either of these two active gap filling mechanisms is compatible with the existing data on active object gap creation reviewed above. A filled gap effect only indicates that the gap had been created before the actual object NP is processed, and this result is compatible with both accounts, given that both hyper-active gap filling and conservative active gap filling mechanisms assume that object NP gap creation happens before or on the verb. A plausibility mismatch effect indicates that when the verb is potentially transitive, then the semantic fit between the filler and the verb is immediately assessed. This is also predicted by both accounts. The assessment of the semantic relation between the filler and the verb requires the parser to access the content of the verb, by which point the object gap position should have been created on either account. Thus, neither paradigm allows us to tease apart the two hypotheses on what kind of information is sufficient for triggering object gap creation.

In the current study we aim to tease apart the predictions of two hypothesized mechanisms for active object gap creation processes. If English speakers construct the gap site before encountering the verb, just like speakers of verb-final languages, then disruption should be observed in filler-gap configurations when the verb turns out to be intransitive, relative to transitive verbs (e.g., The party that the student arrived/planned…). According to the conservative active gap filling mechanism outlined above, the parser waits for a transitive verb before postulating the corresponding gap structure. Here, no disruption is expected at an intransitive verb, since the parser has not postulated a gap that would require a transitive verb.

Two previous studies are relevant to the two hypotheses about active object gap creation in English. Pickering and Traxler (2003) examined the effect of subcategorization frequency in optionally transitive verbs (e.g., Those are the lines/props that the author spoke [about]…). They found that readers did not take subcategorization frequency into account in deciding where to posit a gap, as there was a strong preference to posit a gap in the verb object position (NP complement) even with verbs that more frequently take a PP complement. The absence of a subcategorization frequency effect in active object gap creation could be taken to indicate that verb information is not relevant for object gap creation processes. However, all of the verbs in Pickering and Traxler’s study could grammatically accommodate an NP complement, and the parser may therefore have relied on the transitivity information of the verb to create an object gap. Thus, this finding does not distinguish the predictions of the two proposed mechanisms for active object gap creation.

To our knowledge, the only previous test of these two active object gap creation hypotheses is Experiment 3 of Staub (2007). The test sentences in this experiment (5a–d) manipulated the transitivity of the verb (called vs. arrived) and sentence structure (relative clause with a gap vs. simple declarative with no gap). The filler was manipulated to be an implausible object of the transitive verb (gadget-called). Under the hyper-active gap filling hypothesis, the parser in effect predicts the presence of a transitive verb, and therefore reading in the gap conditions should be disrupted in both the intransitive and the transitive conditions, but for different reasons: in the intransitive condition because the verb cannot host the predicted object gap, and in the transitive condition because of the plausibility mismatch effect. On the other hand, the conservative active gap filling mechanism postulates a gap only after checking whether the verb is capable of hosting an object NP, and therefore reading disruption is predicted only in the transitive gap condition due to the plausibility mismatch effect.

(5) a. The gadget that the manager called occasionally about…

b. The manager called occasionally about the gadget …

c. The party that the student arrived promptly for …

d. The student arrived promptly for the party …

Staub (2007) found longer first-fixation durations in the transitive gap condition (5a) than in the transitive no-gap condition (5b), but no such difference was observed between the intransitive gap and no-gap conditions (5c) and (5d). This pattern of data supports the prediction of the conservative active gap filling hypothesis, suggesting that the parser does not create an object gap until it checks the transitivity information of the verb. One concern about this design, however, is whether the no-gap condition was truly a neutral baseline against which a transitivity mismatch could be measured, as the gap and no-gap conditions differed substantially in both the linear and structural position of the verb. As Staub (2007) points out, one piece of data suggesting that the control may not have been completely neutral is the fact that reading times on the intransitives were numerically (but non-significantly) shorter in the gap condition than in the no-gap condition. It is important to note here that the gap conditions (5a) and (5c) contain an extra NP (i.e., the head of the relative clause) prior to the critical verb region in comparison to the no-gap conditions (5b) and (5d). This may have led to a difference in the amount of contextual information available prior to the verb. Increased contextual information can facilitate processing for subsequent lexical items (Stanovich and West, 1983; Van Petten and Kutas, 1990; Kutas and Federmeier, 2000), and for this reason, lexical access for the intransitive verb in the gap condition may have become faster and masked the potential reading time slowdown associated with the structural manipulation. In an attempt to provide a better test of the predictions of the hyper-active and conservative active gap filling accounts, the current study used relative clause islands as a control condition, which allowed the target sentences to more closely match in informational content and word position.

Experiment 1

Experiment 1 was a self-paced reading study that was designed to test the predictions of the hyper-active and conservative active gap filling hypotheses, while addressing methodological concerns about previous work. We employed the transitivity mismatch paradigm used in Staub (2007) in order to test whether a verb transitivity manipulation affects reading time at the verb. Critically, in the baseline conditions the critical verb was embedded inside a relative clause structure, a syntactic ‘island’ domain that prohibits filler-gap dependency formation (Ross, unpublished doctoral dissertation; for a review, see Szabolcsi and den Dikken, 2003). A sample set of stimuli is shown in Table 1.

TABLE 1. Sample materials and conditions for Experiment 1.

A number of previous studies have shown that the parser respects island constraints in real-time syntactic processing, such that it avoids actively constructing filler-gap dependencies that span syntactic island boundaries (Stowe, 1986; Kluender and Kutas, 1993; McKinnon and Osterhout, 1996; Traxler and Pickering, 1996; McElree and Griffith, 1998; Wagers and Phillips, 2009; Omaki and Schulz, 2011; Yoshida, unpublished doctoral dissertation). The relative clause island condition thus provided a baseline measure of reading times for the critical transitive and intransitive verbs, independent of processes of filler-gap dependency completion. The use of island configurations allowed us to address the methodological concerns with previous work.

First, this design allowed the baseline condition to present a filler NP prior to the critical region, such that the same amount of contextual information from the lexical items was present in advance of the critical verb region across the four conditions. Second, the word position for the critical regions (Regions 7 and 8 in Table 1) was closely matched across conditions (word positions 6 and 7 in the non-island conditions, word positions 7 and 8 in the island conditions), and it was also placed away from the early portion of the sentence.

Furthermore, following Staub’s design, we selected transitive verbs that are implausible hosts for the filler. Under this design, the hyper-active gap filling hypothesis predicted a reading time slowdown in both the non-island transitive and the non-island intransitive conditions relative to their island counterparts, but for a different reason in the two cases. In the transitive condition, the slowdown would reflect a plausibility mismatch effect triggered by the poor semantic fit between the filler and the verb. In the intransitive condition, the slowdown would result from a transitivity mismatch effect due to the mismatch between the expected subcategorization property of the verb (i.e., transitive) and the actual subcategorization property of the verb. On the other hand, the conservative active gap filling hypothesis predicted an interaction. A reading time contrast should be observed between the non-island transitive condition and the island transitive condition due to the plausibility mismatch effect, but no corresponding contrast should be observed between the two intransitive conditions, given that the parser should not actively create an object gap in either condition. Note that the lexical difference in the critical verb region across conditions was not problematic, since the critical contrast was between non-island and island conditions within each verb type.
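The contrasting predictions described above can be summarized as a small lookup. This is only an illustrative sketch: the function name `predicted_slowdown` and the string labels are hypothetical conveniences and are not part of the study's materials or analysis.

```python
def predicted_slowdown(hypothesis, verb_type):
    """Return the predicted source of a non-island vs. island slowdown, or None.

    hypothesis: "hyper-active" or "conservative" (labels chosen for this sketch)
    verb_type:  "transitive" or "intransitive"
    """
    if hypothesis not in ("hyper-active", "conservative"):
        raise ValueError(f"unknown hypothesis: {hypothesis}")
    if verb_type == "transitive":
        # Both accounts posit an object gap at an implausible transitive verb.
        return "plausibility mismatch"
    if verb_type == "intransitive":
        # Only a gap posited before the verb can clash with an intransitive verb.
        return "transitivity mismatch" if hypothesis == "hyper-active" else None
    raise ValueError(f"unknown verb type: {verb_type}")


for hypothesis in ("hyper-active", "conservative"):
    for verb_type in ("transitive", "intransitive"):
        print(hypothesis, verb_type, predicted_slowdown(hypothesis, verb_type))
```

The two accounts diverge only in the intransitive cell, which is why the contrast between the intransitive non-island and island conditions carries the critical test.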

Method

Participants

We recruited 32 native speakers of American English from the University of Maryland community. They received course credit or were paid $10 for their participation and were naïve to the purpose of the experiment.

Materials

We used 28 sets of four sentences like those shown in Table 1. All of the stimuli from the experiments reported in this paper are available in the Supplementary Materials. The transitive non-island and island conditions were taken from the implausible semantic fit conditions in Omaki and Schulz (2011), who used a modified version of the plausibility manipulation materials from Traxler and Pickering (1996). Omaki and Schulz replicated Traxler and Pickering’s plausibility mismatch effect with native and non-native speakers alike, confirming that the semantic fit between the filler and the verb affects the reading time for the verb when the verb is in a gap filling (i.e., non-island) environment, but not when the verb is inside a relative clause island. Critically, it was also found that the implausible verb-filler combination in a non-island environment (e.g., city-wrote) led to a significant slowdown at the verb compared to its island counterpart with the same implausible verb-filler combination. Thus, even though the current experiment did not include a plausible counterpart of the implausible transitive verb condition, we could be confident that a reading time contrast between the transitive non-island and island conditions results from the semantic misfit between the filler and the verb. In other words, the finding in Omaki and Schulz’s study supports the notion that island conditions in general can be used as baseline conditions for a reading disruption associated with active object gap creation. The intransitive conditions were modeled after the transitive conditions by replacing the optionally transitive verb with unergative or unaccusative intransitive verbs (Levin and Rappaport Hovav, 1995).

The non-island and island conditions differed in the number of relative clauses. The non-island condition had only one relative clause (the city that the author wrote/chatted regularly about), such that the object position of the verb wrote/chatted was the first potential gap position after the embedded subject was encountered. In the island conditions, the critical verb was embedded inside another relative clause the author who wrote/chatted regularly, such that linearly this was still the first verb but grammatically the filler should not be accessible to the verb due to the relative clause island constraint. Thus, the first verb served as the critical region for testing the plausibility and transitivity mismatch effects. All the transitive verbs were optionally transitive, such that the sentences in the island conditions were all ultimately grammatical. The subcategorization frequency of the optionally transitive verbs was not controlled, since Pickering and Traxler (2003) have demonstrated that plausibility mismatch effects are attested for optionally transitive verbs regardless of subcategorization frequency. In all four conditions the same adverb immediately followed the verb, making it possible to observe potential spill-over effects. The 28 sentence sets were counter-balanced across four lists so that each participant saw only one version of the target items and consequently read seven tokens of each condition. In addition, 72 fillers of similar length and complexity were constructed and added to each list.
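The counterbalancing described above can be illustrated with a Latin-square rotation. This is a minimal sketch under an assumption: the text states only that the 28 sets were counterbalanced across four lists, so the specific rotation scheme and the name `latin_square_lists` are hypothetical.

```python
CONDITIONS = [
    ("non-island", "transitive"),
    ("non-island", "intransitive"),
    ("island", "transitive"),
    ("island", "intransitive"),
]


def latin_square_lists(n_items=28, conditions=CONDITIONS):
    """Build one item-to-condition assignment per list by rotating conditions."""
    n_lists = len(conditions)
    lists = []
    for list_index in range(n_lists):
        assignment = {}
        for item in range(1, n_items + 1):
            # Shift the condition by one step for each successive list, so every
            # item appears in a different condition on every list.
            assignment[item] = conditions[(item - 1 + list_index) % n_lists]
        lists.append(assignment)
    return lists


lists = latin_square_lists()
for number, assignment in enumerate(lists, start=1):
    tokens = [list(assignment.values()).count(c) for c in CONDITIONS]
    print(f"list {number}: tokens per condition = {tokens}")
```

With 28 items and four conditions, any such rotation gives each list seven tokens of each condition, which matches the count reported in the text.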

Procedure

The self-paced reading task was implemented in the Linger software developed by Doug Rohde (http://tedlab.mit.edu/~dr/Linger/). We used a word-by-word, non-cumulative moving window presentation (Just et al., 1982). In this design, each sentence initially appears as a series of dashes, and these dashes are replaced by a word from left to right every time the participant presses the space bar. In order to ensure that the participants were paying attention while reading the sentences, all sentences were followed by yes-no comprehension questions, and feedback was provided if the questions were answered incorrectly. Comprehension questions never addressed the critical filler-gap portion of the sentence. At the beginning of the experiment, participants were instructed to read at a natural pace and to answer the questions as accurately as possible. Seven practice items preceded the self-paced reading experiment, and the order of presentation was randomized for each participant. The experiment took ∼30 min. The experiment protocol for this study was approved by the Institutional Review Board at the University of Maryland.
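The non-cumulative moving window display can be sketched as a sequence of frames, one per key press. This is not Linger's implementation: the function name `moving_window_frames` is hypothetical, and masking each word with one dash per character is an assumption made for illustration. The example sentence is adapted from the non-island materials described above.

```python
def moving_window_frames(sentence):
    """Yield one display string per key press, showing only the current word."""
    words = sentence.split()
    # Assumption for this sketch: each word is masked by one dash per character.
    masks = ["-" * len(word) for word in words]
    # Initial display: every word is masked.
    yield " ".join(masks)
    for position, word in enumerate(words):
        # Reveal the current word; earlier words revert to dashes (non-cumulative).
        frame = masks[:position] + [word] + masks[position + 1:]
        yield " ".join(frame)


for frame in moving_window_frames("The city that the author wrote regularly about"):
    print(frame)
```

The interval between successive key presses is the reading time recorded for the word that was visible during that interval.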

Data Analysis

The data from two items were excluded from analyses due to coding errors. Only trials in which the comprehension question was answered accurately were included in the analysis, which affected 5.7% of the trials. We also analyzed the data without excluding the trials based on comprehension accuracy, but the overall pattern of results did not change.

Self-paced reading times for the target sentences were examined for each successive region, although the words after the auxiliary ‘was’ were combined into a single region because these lay beyond the critical regions and were unlikely to show effects relevant for the critical manipulation. The critical regions where a potential plausibility or transitivity mismatch effect was expected consisted of Region 7 (i.e., the verb wrote/chatted) and the following Region 8 (i.e., the adverb regularly), in which spill-over effects could be observed. Regions 1 through 6 were predicted to show no difference across conditions, since they were lexically matched. Regions 9 through 11 could reveal reading time differences after the filler-gap dependency is completed (Region 9 hosts the true gap site), with a possible additional difference in the island conditions due to the structural complexity associated with the extra relative clause in these conditions.

Reading time data that exceeded three standard deviations from the group mean at each region and in each condition were excluded, affecting 1.7% of the data. The remaining reading time data were analyzed using linear mixed effects models (Baayen et al., 2008). These analyses were conducted in the R environment (R Development Core Team, 2011), using the lme4 package for R (Bates et al., 2014). The fixed effects of island structure type (non-island vs. island) and verb transitivity (transitive vs. intransitive) were coded using sum contrasts, with one level of the factor coded as -0.5, and the other as 0.5. This sum contrast coding makes the mixed effect model estimates roughly comparable to the actual average reading time contrasts. The model included random intercepts for participants and items. For random slopes, we used the following procedure to determine the optimal random effect structure (for discussions: Jaeger, 2011; Barr et al., 2013). First, we constructed a fully crossed model that included the fixed effects and an interaction term as random slopes for both participants and items. This fully specified model failed to converge, plausibly due to the complexity of the model and missing data points in some of the trials (Barr et al., 2013). Next, we simplified the random effect structure by only keeping the verb transitivity factor as a random slope for participants and items. In our experimental design, the island structure is invariant across all items, and it is also known to be robust across individuals, regardless of working memory capacity (see Sprouse et al., 2012). On the other hand, the verbs differed across items, and it is possible that the subcategorization bias differs across participants. This mixed effects model converged for all regions. We computed p values for linear mixed effects models using the lmerTest R package (Kuznetsova et al., 2014).
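Two steps of this analysis, the three-standard-deviation trimming and the ±0.5 sum coding, can be illustrated with a minimal sketch. It is not the authors' R code: it uses a plain least-squares slope with no random effects, the reading times are invented for illustration, all function names are hypothetical, and assigning -0.5 to the non-island level is an assumption, since the text does not state which level received which code.

```python
from statistics import mean, stdev


def trim_outliers(reading_times, n_sd=3.0):
    """Drop values more than n_sd standard deviations from the cell mean.

    Intended to be applied separately to each region-by-condition cell.
    """
    if len(reading_times) < 2:
        return list(reading_times)
    center = mean(reading_times)
    spread = stdev(reading_times)
    return [rt for rt in reading_times if abs(rt - center) <= n_sd * spread]


def sum_code(level, first_level):
    """Code a two-level factor as -0.5 (first_level) or 0.5 (the other level)."""
    return -0.5 if level == first_level else 0.5


def slope(xs, ys):
    """Ordinary least-squares slope of ys on a single predictor xs."""
    x_bar, y_bar = mean(xs), mean(ys)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    denominator = sum((x - x_bar) ** 2 for x in xs)
    return numerator / denominator


# Invented reading times (ms), for illustration only.
non_island = [520, 540, 530]
island = [430, 440, 435]
xs = ([sum_code("non-island", "non-island")] * len(non_island)
      + [sum_code("island", "non-island")] * len(island))
ys = non_island + island

# With -0.5/0.5 coding the slope equals the difference between the two
# condition means, here mean(island) - mean(non_island).
print(slope(xs, ys), mean(island) - mean(non_island))
```

Because the two codes are one unit apart, the fitted coefficient is on the same scale as the raw difference in mean reading time, which is the sense in which the estimates are comparable to the average contrasts.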

Results

Comprehension accuracy

The mean comprehension question accuracy for experimental items across participants and items was 93.0%. For the non-island conditions, the transitive items were answered with an accuracy of 93.7% (SE = 1.9), and the intransitive items with an accuracy of 94.6% (SE = 1.4). For the island conditions, the transitive items were answered with an accuracy of 91.5% (SE = 1.7), and the intransitive items with an accuracy of 92.0% (SE = 2.2). The mean accuracy did not differ reliably across conditions, although the fact that the mean accuracy for island conditions was numerically lower may reflect the complexity difference between non-island and island conditions.

Reading time data

Region-by-region mean reading times are presented in Figure 1 for the transitive conditions and in Figure 2 for the intransitive conditions.

FIGURE 1. Mean reading time (ms) for the transitive non-island and island conditions. Error bars indicate standard error of the mean.

FIGURE 2. Mean reading time (ms) for the intransitive non-island and island conditions. Error bars indicate standard error of the mean.

In the non-critical Regions 1–6, there were no significant differences in Regions 1, 2, and 4–6 (ps > 0.06). In Region 3 there was a main effect of verb type (Estimate = -17.3, SE = 7.6, t = -2.27, p < 0.05), due to slower reading times in the transitive conditions than in the intransitive conditions (381 vs. 358 ms). Since this region was lexically matched across conditions, we take this to be a spurious effect. Because the effect was small and occurred well ahead of the critical regions, it is unlikely to have affected the observations in the critical regions.

At the critical verb in Region 7 there were no significant differences (ps > 0.1). The following spill-over region (Region 8) revealed no main effect of verb type, but there was a main effect of structure type (Estimate = -92.0, SE = 16.4, t = -5.61, p < 0.001), reflecting the fact that the non-island conditions produced significantly slower reading times than the island conditions (529 vs. 435 ms). There was no significant interaction of verb type and structure type (p > 0.1).

Region 9 consisted of a second verb in the island conditions and a preposition in the non-island conditions. We observed a main effect of structure type in Region 9 (Estimate = 63.7, SE = 15.9, t = 4.01, p < 0.001), as well as in Region 10 (Estimate = 46.1, SE = 11.5, t = 4.0, p < 0.001), in both cases due to slower reading times in the island conditions (Region 9: 519 vs. 451 ms, Region 10: 451 vs. 406 ms). Region 11 revealed no significant differences (ps > 0.09).
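As a reading aid for the statistics above, the sketch below recomputes each t value as the ratio of the estimate to its standard error, which is how t is obtained for a fixed effect in a linear mixed effects model summary, and sets each estimate beside the corresponding difference in condition means. The numbers are the rounded values reported in the text. The direction of each difference (island minus non-island, intransitive minus transitive) is an assumption inferred from the signs of the estimates, and the variable and function names are hypothetical.

```python
# (label, estimate, standard error, reported t, difference in condition means in ms)
reported_effects = [
    ("Region 3, verb type", -17.3, 7.6, -2.27, 358 - 381),
    ("Region 8, structure type", -92.0, 16.4, -5.61, 435 - 529),
    ("Region 9, structure type", 63.7, 15.9, 4.01, 519 - 451),
    ("Region 10, structure type", 46.1, 11.5, 4.0, 451 - 406),
]


def t_ratio(estimate, standard_error):
    """t statistic of a fixed effect: estimate divided by its standard error."""
    return estimate / standard_error


for label, estimate, se, reported_t, mean_diff in reported_effects:
    # Small discrepancies are expected because all inputs are rounded.
    print(
        f"{label}: recomputed t = {t_ratio(estimate, se):.2f} "
        f"(reported {reported_t}), estimate = {estimate} ms, "
        f"raw mean difference = {mean_diff} ms"
    )
```

The recomputed ratios agree with the reported t values to within rounding, and each estimate has the same sign as, and lies within a few milliseconds of, the raw difference in means. This is what one would expect from the -0.5/0.5 coding, under which the estimates are roughly comparable to the observed reading time contrasts.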

Discussion

In Experiment 1, we tested the predictions of two hypotheses about active object gap creation. The hyper-active gap filling hypothesis predicted the presence of reading disruption at intransitive verbs, because encountering an intransitive verb in a filler-gap context would be incompatible with the object gap structure constructed earlier. On the other hand, the conservative active gap filling hypothesis predicted no such reading disruption, because the parser should first consult the transitivity information of the verb to decide whether to posit an object gap or not. As a baseline for estimating the degree of disruption at the verb, we used relative clause island constructions, which block the association of the filler with the critical verb. The results were consistent with the predictions of the hyper-active account: in the region following the verb, we observed slower reading times for intransitive verbs in non-island conditions than in corresponding island conditions.

Previous work has shown a filler-gap plausibility mismatch effect at the verb such that mismatched transitive verbs in a non-island environment elicit longer reading times than their plausible non-island or plausible/implausible island counterparts (Traxler and Pickering, 1996; Omaki and Schulz, 2011), and here we replicated this finding. This effect can be interpreted as the result of active association of the filler with the transitive verb, which in these stimuli resulted in a verb–object plausibility mismatch. On the other hand, the slowdown observed in the intransitive non-island condition relative to the intransitive island condition can be interpreted as a transitivity mismatch. This suggests that the parser does not wait for bottom–up evidence from the verb that the verb can syntactically license a gap, but rather attempts to construct the dependency before this information is available. This slowdown cannot reflect the cost of maintaining the filler in working memory, because a filler is also being maintained at this position in the baseline island condition.

It is also important to note that the shorter reading times in the critical regions of the island conditions are theoretically informative. These findings suggest that the reading time increase in the non-island conditions is specifically due to an expectation violation following premature gap creation. A plausible alternative explanation of the reading disruption in the non-island conditions is that it reflects a more general cost associated with delaying gap creation decisions. Under this alternative account, we should expect to observe reading disruption in the island conditions as well, because gap creation must wait until the verb that follows the relative clause island region (e.g., saw in Region 9). However, this prediction is not supported by the data, as the reading time in the adverb region (Region 8) of the island conditions was reliably shorter than in non-island conditions.

In Regions 9 and 10, the island conditions were read more slowly for both levels of verb type. Region 9 corresponds to the word that licensed the true gap site across all conditions, and hence this slowdown could reflect a difference in the so-called integration cost (Gibson, 1998, 2000) between non-island and island conditions. Previous work on filler-gap dependency processing has demonstrated that increased complexity and length differences result in increased processing difficulties at the gap site, as measured by reading time (Gibson and Warren, 2004; Wagers and Phillips, 2014) and reduced accuracy in speeded acceptability judgment tasks (McElree et al., 2003). However, the reading time difference in Region 9 may simply be due to lexical differences (prepositions in the non-island conditions vs. verbs in the island conditions), so the reading time contrast between the island and non-island conditions may not reflect an integration cost difference.

Note that it is unlikely that the reading time contrast between non-island and island conditions in Region 8 is related to the overall complexity of the constructions used in our stimuli, given that on all accounts that we are aware of, island domains have been argued to be syntactically more complex and more taxing for working memory resources (Deane, 1991; Kluender and Kutas, 1993; Kluender, 1998, 2004; Hofmeister and Sag, 2010). The fact that the putatively less complex non-island conditions were read more slowly allows us to attribute the slowdown to processes that uniquely occur in the non-island conditions, namely filler-verb association.

In summary, the presence of both a plausibility mismatch effect and a transitivity mismatch effect lends support to the hyper-active gap filling hypothesis, and argues against a conservative active gap filling hypothesis under which transitivity information is consulted before attempting to create an object gap. This finding directly contrasts with that of Staub (2007), who did not find evidence for a transitivity mismatch effect.

However, this conclusion is not warranted until two methodological concerns are addressed. First, the design of Experiment 1 was modeled after Staub (2007), who used a plausibility mismatch design for the transitive verb conditions and a transitivity mismatch design for the intransitive verb conditions. Our findings differed from Staub’s in that we found mismatch effects for both the transitive and the intransitive non-island conditions, but it is possible that some nuisance factor common to both non-island conditions led to a slowdown across the board. Stronger evidence for the hyper-active gap filling hypothesis can be obtained if we replicate the transitivity mismatch slowdown in the intransitive non-island condition, while at the same time observing no reading disruption in the transitive non-island condition. Experiment 2 accomplished this by making the filler and the verb semantically compatible in the transitive conditions. The absence of reading disruption in the transitive conditions would suggest that the disruption in the non-island, intransitive condition is due to the intransitivity of the verb.

Second, it is important to note that our evidence for reading disruption for transitive and intransitive verbs (i.e., the slowdown in non-island conditions compared to island conditions) was not observed until the spill-over adverb region. Spill-over effects are widely observed in self-paced reading experiments, and it is thus common to attribute spill-over effects to processes triggered in a preceding region. However, in our experiment there is an alternative explanation for the effect in the adverb region that would not require hyper-active gap filling. For the intransitive condition, the slowdown in the adverb region could indicate that the parser had expected the presence of a preposition, which would allow structural integration of the filler. Under this alternative account, the slowdown is not due to a transitivity mismatch on the verb, but rather to a word category expectation mismatch in the adverb region that was triggered by the verb itself. This account is consistent with the conservative active gap filling hypothesis, since the parser’s expectation regarding filler-gap dependency completion is based on the information from the verb. Incidentally, the reading disruption observed in the transitive conditions of Staub (2007) was at the verb region. One possible reason for this discrepancy is the difference in the dependent measure: Staub (2007) used an eye-tracking during reading method while we used self-paced reading in Experiment 1. An eye-tracking during reading method generally provides better temporal precision than the self-paced reading method (Rayner, 1998; Rayner and Pollatsek, 2006). Thus, an eye-tracking replication of Experiment 1 may yield a transitivity mismatch effect on the verb region, and provide stronger evidence for the hyper-active gap filling hypothesis. This is addressed in Experiment 2.

Experiment 2

Experiment 2 addressed the two methodological concerns raised by Experiment 1 by removing sources of slowdown in the transitive conditions and by using the eye-tracking during reading method.