- Sungshin Women's University, Seoul, Republic of Korea
This study investigates Korean L2 speakers' sensitivity to English island constraints, focusing on the widely reported—but theoretically puzzling—asymmetry between wh/whether-islands and adjunct-islands. Using a factorial definition of island effects, we quantified island penalties while independently estimating the costs of dependency-length and structural-complexity, two factors commonly assumed to interact with the unacceptability of island violations. Acceptability judgment results showed that native speakers displayed robust island effects across all types, whereas L2 speakers displayed significant effects for adjuncts (because- and when-clauses) and whether-islands but no statistically reliable island effect for wh-islands. The overall magnitude of island effects was smaller in L2 speakers and decreased systematically with later Age of Arrival (AoA). Despite these differences, both groups exhibited the same gradient hierarchy of island strength (wh < whether < adjuncts) and comparable structural-complexity costs (wh > whether > adjuncts), indicating fundamentally similar island representations and processing demands across groups. L2 speakers, however, showed greater dependency-length costs—indicating greater difficulty with long-distance dependencies, especially with later AoA—which appear to reduce the contrast between non-island and island configurations, yielding smaller observable island effects. This pattern was most pronounced for wh-islands, which combine high structural complexity with the weakest island effects, creating the appearance of an L2-specific asymmetry despite otherwise native-like sensitivity. Overall, the findings suggest that L2 speakers' island sensitivity is native-like in kind but reduced in degree, reflecting quantitative rather than qualitative differences between groups.
1 Introduction
A central question in second language (L2) acquisition is what is easier or harder to acquire, and why. Island phenomena (Ross, 1967)—cases where extracting an element from certain embedded clauses renders the sentence unacceptable—offer a particularly revealing test case. One reason for their enduring interest is that, at least at first glance, island constraints appear unlearnable, presenting the classic learnability puzzle. For example, English-speaking children encounter sentences showing that the language allows wh-movement (e.g., Who does Mary love __?), embedded clauses (e.g., You think that Mary loves somebody.), and wh-extraction from embedded clauses (e.g., Who do you think that Mary loves __?). Given this exposure, they might reasonably generalize that wh-extraction is possible from any embedded clause, including wh-clauses (e.g., *Who do you wonder who __ loves __?) or adjunct clauses (e.g., *Who did you cry because Mary loves __?). Nothing in the input explicitly signals that wh-movement out of these domains is impossible, yet children nonetheless do not violate this restriction. The standard explanation is that island effects arise from inherent properties of the language system, either grammatical (e.g., Rizzi, 2013) or processing-based (e.g., Kluender, 2004). Accordingly, island effects are not learned—nor could they be, given the absence of negative evidence—but emerge naturally in all speakers.
This learnability challenge extends to L2 acquisition, where learners likewise receive no direct input or explicit instruction on island constraints. Given the common view that island effects reflect fundamental principles of how language operates, L2 speakers should show comparable island effects, once the relevant properties of the target language are acquired (e.g., wh-movement; clausal embedding in English). Empirical findings, however, present a more complex picture. Across diverse methodologies—including acceptability judgment tasks (e.g., Aldosari et al., 2024; Kim and Goodall, 2021, 2022; Rothman and Iverson, 2013), grammaticality-judgment tasks (e.g., Martohardjono, 1993; Perpiñán, 2020; White and Juffs, 1998), and online processing paradigms (e.g., Aldwayan et al., 2010; Cunnings et al., 2010; Kim et al., 2015; Omaki and Schulz, 2011)—many studies report native-like island effects in L2 speakers. Others, particularly traditional grammaticality-judgment studies (e.g., Bley-Vroman et al., 1988; Hawkins and Chan, 1997; Johnson and Newport, 1991; Schachter, 1989) and some recent acceptability judgment work (e.g., Kush and Dahl, 2022), have found inconsistent or non-native-like performance.
Crucially, these differences do not imply a complete absence of island sensitivity. Rather, L2 speakers often show a cross-island asymmetry: they tend to be target-like with adjunct (e.g., *Who did you cry because Mary loves?) and relative-clause islands (e.g., *Who did you meet the man who loves?) but less consistent with wh-islands (e.g., *Who do you wonder who loves?), often accepting such violations as grammatical (e.g., Bley-Vroman et al., 1988; Johnson and Newport, 1991; Li, 1998; Martohardjono, 1993; Schachter, 1989, 1990; see also Belikova and White, 2009 for a review). Age of Arrival (AoA) further modulates this pattern—earlier learners approach native performance, whereas later learners show reduced sensitivity (e.g., Johnson and Newport, 1991).
A classic illustration comes from Johnson and Newport (1991), who asked whether L2 sensitivity to island constraints declines with later AoA. They examined three island types subsumed under Subjacency—complex noun phrases (CNPs), relative clauses (RCs), and wh-complements (wh-islands)—using a yes/no grammaticality-judgment task. Native speakers almost never accepted island violations as grammatical, whereas Chinese learners of English (AoA = 4–38; ≥5 years residence in the U.S.) often did, and this tendency increased with later AoA. Sensitivity also varied across island types: learners showed above-chance rejection of RC violations, but were substantially more permissive with CNP and wh-islands, often judging them as grammatical, with this asymmetry most pronounced among later arrivals.
However, this selective sensitivity—L2 learners showing differential performance across island types while native speakers show uniform sensitivity—is hard to reconcile with standard accounts, grammatical or processing-based. Under classical generative approaches (pre-Minimalist, Chomsky, 1995), island effects follow from constraints encoded in Universal Grammar (UG), such as Subjacency (Chomsky, 1973, 1986). If L2 learners have access to UG, they should be sensitive to all constraints governed by it; if not, they should lack them altogether. The observed pattern—target-like for some island types but not others, despite all being governed by the same constraint—therefore remains unexpected on this view. On Minimalist approaches, which derive island effects from general properties of the computation that builds syntactic structure (e.g., Chomsky, 2005, 2008; Nunes and Uriagereka, 2000), once the relevant operations (e.g., wh-movement) are acquired, island effects should follow naturally from computational efficiency. A selective absence of wh-island effects would therefore imply a special computational advantage allowing L2 speakers—but not native speakers—to bypass such constraints, which seems implausible.
Processing-based accounts attribute island effects to resource limitations during sentence processing (e.g., Pritchett, 1992; Kluender and Kutas, 1993; Kluender, 1998, 2004; Hofmeister and Sag, 2010). Establishing long-distance filler–gap dependencies taxes working memory, and this cost increases when the dependency crosses an island boundary, lowering acceptability (recognized as island effects). On this view, any speaker with intact processing capacities and basic syntax (e.g., wh-movement) should exhibit island effects. The L2 asymmetry would therefore require that, for L2 speakers, wh-islands somehow fail to reach the processing-cost threshold that triggers island effects, whereas adjunct-islands do—an outcome that does not arise in native speakers, who show sensitivity to all island types. Although the precise nature of L2 processing differences remains debated (e.g., Clahsen and Felser, 2006, 2018; Cunnings, 2017; Kaan and Grüter, 2021), it is not straightforward to conceive of processing as native-like for one island type but not another, given that all island types share fundamental properties—long-distance dependencies and complex syntactic structure.
Note that this reasoning does not deny cross-linguistic variation. We focus here on English, which exhibits both wh- and adjunct-islands and where native speakers uniformly judge extraction from these domains as unacceptable. Such patterns—whether grounded in grammar, processing, or both—reflect inherent limitations that even native speakers cannot override. Given that L2 speakers tend to face heavier processing demands and less stable representations, it is difficult to imagine that they would outperform native speakers in this respect. Once the relevant English properties are acquired (e.g., wh-movement; clausal embedding), island effects should therefore arise naturally.
The apparent asymmetry may instead reflect differences in how L2 speakers represent or process complex island configurations rather than a true absence of island effects. Two structural factors are known to contribute to island unacceptability: (1) long-distance dependency and (2) intervening structural complexity. Grammar-based accounts attribute island effects to interactions between these factors and an inherent constraint, whereas processing-based accounts link them to cognitive load exceeding the sum of these factors alone. If L2 speakers differ from native speakers in either component, their island effects may diverge despite shared underlying mechanisms. To accurately estimate island sensitivity, these components must therefore be measured independently. Yet most prior L2 studies tested island violations in isolation, conflating multiple sources of unacceptability (e.g., dependency-length, structural-complexity, and the island constraint itself) and thus obscuring their separate contributions.
A further possibility is that the asymmetry reflects the weak–strong distinction among island types: wh/whether-islands are typically weak and adjunct-islands strong (Cinque, 1990; Rizzi, 1990). As noted by Martohardjono (1993) and Schwartz and Sprouse (2000), L2 learners' reduced sensitivity to wh-islands may thus reflect this weak–strong gradience rather than a categorical absence of island effects. However, most prior L2 research relied on categorical grammaticality judgments that collapse gradient differences into binary “grammatical/ungrammatical” outcomes. Real-time processing studies, meanwhile, have typically targeted only a single strong island type (most often subject- or relative-clause islands), providing valuable evidence about online sensitivity but not about cross-island contrasts or gradient magnitudes.
To address these gaps, the present study employs a fine-grained acceptability judgment task combined with a factorial definition of island effects (Kluender and Kutas, 1993; Sprouse et al., 2011, 2012), an approach that has proven particularly effective in recent island research. This design independently estimates the costs of dependency-length and structural-complexity while isolating the residual island penalty beyond these components, providing the resolution needed to capture gradient differences across island types and speaker groups. We further examine whether Age of Arrival (AoA) modulates these effects, given evidence that later AoA attenuates island sensitivity. To test for potential L1 transfer, we focus on Korean learners of English, whose L1 exhibits wh-island but not adjunct-island effects (Kim and Goodall, 2016); if transfer were responsible, the predicted asymmetry would be the reverse of that typically reported in L2 studies.
In sum, the study addresses three questions: (1) Does the classic L2 asymmetry—reduced sensitivity to wh/whether-islands relative to adjunct-islands—persist once component costs are independently controlled?; (2) Does the classic asymmetry reflect a genuine absence of island effects or a gradient difference in island strength?; (3) How does Age of Arrival (AoA) modulate these effects?
2 Experiment
2.1 Method
A 7-point scale acceptability judgment task was conducted on a computer in a university laboratory in the U.S. Participants were instructed to rate their immediate impression of each sentence—how good or bad it sounded—without analyzing its structure.
2.2 Materials
Four island types were tested: whether-island (e.g., *Who did you wonder [whether Lisa bothered ___]?), wh-island (who-clause) (e.g., *Who did you wonder [who bothered ___]?), and two adjunct-islands: when-clause (e.g., *Who did you scream [when Lisa bothered ___]?) and because-clause (e.g., *Who did you scream [because Lisa bothered ___]?).
Each island type followed a 2 × 2 design, crossing STRUCTURE (non-island [that-clause] vs. island clause) and DEPENDENCY-LENGTH (matrix extraction [short] vs. embedded extraction [long]). The island penalty is defined as the super-additive drop in acceptability when long extraction occurs out of an island, beyond the sum of the independent penalties for dependency length and structure (Sprouse et al., 2012); statistically, it is captured by the STRUCTURE × DEPENDENCY-LENGTH interaction.
The four conditions are illustrated below with a wh-island example:
1. (Non-Island | Matrix-clause) Who ___ thought [that Lisa bothered you]?
2. (Non-Island | Embedded-clause) Who did you think [that Lisa bothered ___]?
3. (Island | Matrix-clause) Who ___ wondered [who bothered Lisa]?
4. (Island | Embedded-clause) *Who did you wonder [who bothered ___]?
Condition 1 (Non-Island/Matrix) served as the baseline, containing neither a long-distance dependency nor an island structure, and was expected to be the most acceptable. Condition 2 (Non-Island/Embedded) introduced a long-distance dependency without an island, isolating the dependency-length effect relative to Condition 1. Condition 3 (Island/Matrix) introduced an island structure but only matrix extraction, isolating the structure effect relative to Condition 1. Condition 4 (Island/Embedded) combined both factors, representing the critical island-violation condition predicted to yield the lowest acceptability ratings.
Each participant judged 105 sentences: 50 target sentences and 55 fillers. The 50 target sentences consisted of 10 that-clause controls (5 tokens each of Conditions 1 and 2) and 40 island sentences (10 for each of the four island types; 5 tokens each of Conditions 3 and 4). The materials were distributed across two Latin-square lists, with Conditions 1–2 (the non-island baselines) shared across all island types. Each list was presented in two reversed orders, yielding four versions in total. The 55 fillers comprised 30 acceptable (e.g., Who will you marry?) and 25 unacceptable sentences (e.g., Who did she went home after the party?). The experimental materials are provided as Supplementary Material.
2.3 Participants
A total of 114 participants took part: 54 highly proficient Korean L2 speakers of English (mean age: 21, range: 18–37) and 60 native English speakers (mean age: 21, range: 18–36). The L2 participants were born in Korea and moved to the U.S. between the ages of 1 and 14 (mean length of residence = 14 years, range: 7–25).
All participants completed an English proficiency test designed for this study, which included a multiple-choice vocabulary section and two cloze passages, scored as percentage correct. A one-way ANOVA revealed no significant difference in mean proficiency scores between native speakers (M = 80.8%, range = 72.6–88.6) and L2 speakers (M = 78.7%, range = 70.4–85.7), suggesting that L2 participants were highly proficient in English, performing at an advanced level comparable to native controls. For the L2 group, a correlational analysis revealed no significant relationship between AoA and proficiency scores, suggesting that AoA did not predict proficiency outcomes within this sample. All participants provided written informed consent prior to participation, in accordance with ethical guidelines.
2.4 Analysis
Raw acceptability scores were z-scored prior to analysis to minimize individual scale bias and analyzed using linear mixed-effects models in R (lme4; Bates et al., 2015; lmerTest; Kuznetsova et al., 2017). Models included fixed effects of STRUCTURE, DEPENDENCY-LENGTH, and their interaction, plus a maximal random-effects structure wherever possible.
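The z-score transformation can be sketched as follows. This is a minimal illustration, assuming each participant's ratings are standardized against that participant's own mean and standard deviation over all sentences they rated; the text does not spell out the exact procedure. The function name `zscore_by_participant` and the toy ratings are hypothetical.

```python
from statistics import mean, stdev

def zscore_by_participant(ratings):
    """Standardize each participant's raw ratings against their own mean and SD.

    `ratings` maps a participant ID to that participant's list of raw 1-7 ratings.
    Assumption: the sample SD (n - 1 denominator) is used; the paper does not say.
    """
    z = {}
    for pid, scores in ratings.items():
        m, sd = mean(scores), stdev(scores)
        z[pid] = [(s - m) / sd for s in scores]
    return z

# Hypothetical toy data: two raters who use the 7-point scale differently.
raw = {
    "P01": [7, 6, 2, 1],  # spreads ratings over the full scale
    "P02": [5, 5, 4, 4],  # compresses ratings toward the middle
}
z_scores = zscore_by_participant(raw)
```

After the transformation, every participant's scores have a mean of 0 and an SD of 1, so a rater who avoids the scale's endpoints contributes contrasts on the same footing as one who uses the full range.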
The magnitude of the island effect was quantified using a Difference-in-Differences (DD) score (Maxwell and Delaney, 2004; Sprouse et al., 2012): DD = [(Non-Island/Embedded – Island/Embedded) – (Non-Island/Matrix – Island/Matrix)]. Positive DD values indicate super-additive island effects, where larger values correspond to stronger penalties—that is, greater unacceptability for island violations relative to other baseline conditions. This metric offers a transparent, directly interpretable index of island-effect magnitude, allowing simple and reliable group comparisons without added model complexity.
Component costs were calculated as follows: (i) Dependency-length cost: Condition 1 [Non-Island/Matrix] – Condition 2 [Non-Island/Embedded], indexing the cost of long-distance dependencies while holding structure constant. (ii) Structural-complexity cost: Condition 1 [Non-Island/Matrix] – Condition 3 [Island/Matrix], indexing the cost of complex island structures while holding dependency length constant. Larger values on either measure indicate greater difficulty associated with long-distance dependency or structural complexity, capturing graded differences among otherwise grammatical sentences.
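The DD score and the two component costs defined above can be computed directly from the four condition means. The sketch below is illustrative only: the function name `island_measures` and the input values are hypothetical and are not the study's data.

```python
def island_measures(c1, c2, c3, c4):
    """Compute DD and component costs from four condition means (z-scores).

    c1 = Non-Island/Matrix, c2 = Non-Island/Embedded,
    c3 = Island/Matrix,     c4 = Island/Embedded.
    """
    dd = (c2 - c4) - (c1 - c3)            # super-additive island penalty
    dependency_length_cost = c1 - c2      # long vs. short extraction, no island
    structural_complexity_cost = c1 - c3  # island vs. that-clause, short extraction
    return dd, dependency_length_cost, structural_complexity_cost

# Hypothetical condition means, chosen only for illustration.
dd, length_cost, structure_cost = island_measures(c1=1.0, c2=0.5, c3=0.8, c4=-0.5)

# If the two costs simply added up (c4 = c1 - length_cost - structure_cost),
# DD would be zero: there would be no residual island penalty.
additive_dd, _, _ = island_measures(c1=1.0, c2=0.5, c3=0.8, c4=0.3)
```

The second call shows why DD isolates the island penalty: when Condition 4 is exactly as bad as the two component costs predict, the score is zero, and only an additional drop pushes it above zero.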
Group differences in DD and component costs were assessed using independent-samples t-tests comparing L2 and native speakers. Linear regressions examined the effect of Age of Arrival (AoA) within the L2 group.
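The two analyses can be sketched in a few lines. The fractional degrees of freedom reported in the results, t(80.4), are consistent with a Welch (unequal-variance) correction, so the sketch assumes that variant; the text itself says only "independent-samples t-tests". All data values and function names below are hypothetical.

```python
from statistics import mean, variance

def welch_t(a, b):
    """Welch's unequal-variance t statistic and Welch-Satterthwaite df."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / (va + vb) ** 0.5
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

def simple_regression(x, y):
    """Ordinary least-squares slope, intercept, and r-squared for y ~ x."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    slope = sxy / sxx
    return slope, my - slope * mx, sxy ** 2 / (sxx * syy)

# Hypothetical per-participant DD scores and AoA values (not the study's data).
native_dd = [1.1, 0.9, 1.3, 0.7, 1.0]
l2_dd = [0.6, 0.2, 0.5, 0.1, 0.4]
l2_aoa = [2, 12, 5, 14, 8]

t, df = welch_t(native_dd, l2_dd)                        # group comparison
slope, intercept, r2 = simple_regression(l2_aoa, l2_dd)  # AoA effect within L2
```

A negative slope in the regression corresponds to the pattern of interest here: DD scores shrinking as AoA increases.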
2.5 Research questions and predictions
2.5.1 RQ1. Presence of the classic L2 asymmetry
Does the classic L2 asymmetry (reduced sensitivity to wh/whether relative to adjunct) replicate under a factorial paradigm that isolates component costs?
Prediction: If the asymmetry is genuine, L2 speakers will show no significant STRUCTURE × DEPENDENCY-LENGTH interaction and negative Difference-in-Differences (DD) scores for wh/whether-islands, but a significant interaction and positive DD scores for adjunct-islands. Native speakers are expected to show significant interactions and positive DD scores across all island types.
2.5.2 RQ2. Island-Effect Magnitude and AoA Modulation
Does the L2 asymmetry reflect a categorical absence of island sensitivity or a gradient weak–strong distinction shared with native speakers, and how does Age of Arrival (AoA) modulate the overall magnitude of island effects?
Prediction: If the L2 asymmetry reflects a gradient weak–strong distinction rather than a categorical absence of island sensitivity, both groups will show the same hierarchy of island strength, with smaller island-penalty magnitudes (DD scores) for wh/whether-islands and larger for adjunct-islands. Given previous findings that island sensitivity decreases with increasing AoA (Johnson and Newport, 1991), we further predict smaller DD scores in later learners.
2.5.3 RQ3. Component costs and AoA modulation
Do group differences in island sensitivity stem from differences in component costs—dependency-length and structural-complexity—and are these costs modulated by Age of Arrival (AoA)?
Prediction: L2 speakers are expected to show larger dependency-length and structural-complexity costs than native speakers, reflecting greater difficulty maintaining long-distance dependencies and navigating complex embedded structures. Given previous findings that performance on English grammaticality judgments declines with increasing AoA (Johnson and Newport, 1989), and that later AoA is associated with reduced online efficiency in sentence processing (Cunnings, 2017; Clahsen and Felser, 2018), these component costs are expected to increase with later AoA.
2.6 Results
2.6.1 Whether-island and wh-island effects
For whether-islands (Figure 1), both groups rated island structures lower than non-islands and long-distance extractions lower than short ones, with island-violating sentences least acceptable. Significant main effects of STRUCTURE (Native = −0.919, SE = 0.094, t = −9.69, p < 0.001; L2 = −0.504, SE = 0.095, t = −5.25, p < 0.001), DEPENDENCY-LENGTH (Native = 0.561, SE = 0.044, t = 12.55, p < 0.001; L2 = 0.893, SE = 0.050, t = 17.75, p < 0.001), and their interaction (Native = 0.816, SE = 0.063, t = 12.83, p < 0.001; L2 = 0.262, SE = 0.071, t = 3.69, p < 0.001) confirmed reliable whether-island effects. DD scores were higher for native speakers (DD = 0.82) than for L2 speakers (DD = 0.26; p < 0.001) and, within the L2 group, declined with later AoA (r² = 0.18, p < 0.001), indicating weaker sensitivity among later learners.
Figure 1. Average z-scored acceptability judgments and a scatterplot of AoA and DD scores for whether-islands.
For wh-islands (Figure 2), both groups rated island and long-distance dependency structures lower than non-island and short counterparts, with island-violating sentences lowest overall. Significant main effects of STRUCTURE (Native = −1.071, SE = 0.094, t = −11.34, p < 0.001; L2 = −0.751, SE = 0.082, t = −9.21, p < 0.001) and DEPENDENCY-LENGTH (Native = 0.564, SE = 0.046, t = 12.25, p < 0.001; L2 = 0.891, SE = 0.050, t = 17.97, p < 0.001) were observed in both groups, but the interaction reached significance only for natives (estimate = 0.402, SE = 0.065, t = 6.16, p < 0.001). Correspondingly, DD scores were higher for native speakers (DD = 0.40) than for L2 speakers (DD = 0.004; p = 0.004) and, within the L2 group, decreased with increasing AoA (r² = 0.11, p = 0.001), indicating that wh-island sensitivity declines with later AoA.
Figure 2. Average z-scored acceptability judgments and a scatterplot of AoA and DD scores for wh-islands.
2.6.2 Adjunct-island effects (when- and because-clause)
For when-clause adjunct-islands (Figure 3), both groups rated island structures lower than non-islands and long-distance dependencies lower than short ones, with island-violating conditions lowest overall. Significant effects of STRUCTURE (Native = −1.170, SE = 0.078, t = −14.94, p < 0.001; L2 = −0.731, SE = 0.079, t = −9.29, p < 0.001) and DEPENDENCY-LENGTH (Native = 0.562, SE = 0.042, t = 13.44, p < 0.001; L2 = 0.894, SE = 0.047, t = 19.06, p < 0.001) were accompanied by significant interactions (Native = 1.251, SE = 0.059, t = 21.08, p < 0.001; L2 = 0.784, SE = 0.066, t = 11.82, p < 0.001). Native speakers showed higher DD scores (DD = 1.25) than L2 speakers (DD = 0.79; p < 0.001), and scores declined with later AoA (r² = 0.23, p < 0.001).
Figure 3. Average z-scored acceptability judgments and a scatterplot of AoA and DD scores for when-adjunct-islands.
For because-clause adjunct-islands (Figure 4), both groups again rated island and long-distance dependency sentences lower than controls, with island-violating conditions least acceptable. Significant main effects of STRUCTURE (Native = 1.217, SE = 0.075, t = 16.28, p < 0.001; L2 = 0.892, SE = 0.077, t = 11.62, p < 0.001) and DEPENDENCY-LENGTH (Native = 1.747, SE = 0.043, t = 40.74, p < 0.001; L2 = 1.653, SE = 0.046, t = 35.67, p < 0.001) were accompanied by significant interactions (Native = −1.183, SE = 0.061, t = −19.51, p < 0.001; L2 = −0.759, SE = 0.066, t = −11.58, p < 0.001). Native speakers again showed higher DD scores (DD = 1.18) than L2 speakers (DD = 0.76; p < 0.001), and scores declined with later AoA (r² = 0.14, p < 0.001).
Figure 4. Average z-scored acceptability judgments and a scatterplot of AoA and DD scores for because-adjunct-islands.
Together, these results show that both native and L2 speakers exhibit clear adjunct-island effects, though the effects are consistently smaller in L2 speakers and decrease systematically with later AoA.
2.6.3 Dependency-length effect and structural-complexity effect
Dependency-length effect scores were significantly larger for L2 speakers (M = 0.90) than for natives (M = 0.55), t(80.4) = −4.38, p < 0.001. A linear regression further showed that this cost increased with later AoA (r² = 0.13, p < 0.001), indicating that later-arriving learners experienced greater processing difficulty sustaining long-distance dependencies. The positive correlation between dependency-length scores and AoA is illustrated in Supplementary Figure S1.
Structural-complexity effect scores did not differ significantly between groups or vary with AoA. Both groups displayed the same hierarchy of structural-complexity costs, with the largest effects for wh-islands (Native M = 0.67; L2 M = 0.74), followed by whether-islands (Native M = 0.10; L2 M = 0.24), because-adjuncts (Native M = 0.03; L2 M = 0.14), and when-adjuncts (Native M = −0.08; L2 M = −0.06). This parallel ranking suggests broadly similar syntactic representations and processing demands for complex island structures across groups, regardless of AoA or island type.
3 Discussion
3.1 RQ1. Presence of the classic L2 asymmetry
The first research question asked whether the previously reported L2 asymmetry between wh/whether-islands and adjunct-islands would replicate under a factorial design that isolates component costs. If genuine, L2 speakers should show no significant STRUCTURE × DEPENDENCY-LENGTH interaction and negative Difference-in-Differences (DD) scores for wh/whether-islands but significant interaction and positive DD scores for adjunct-islands. Native speakers, by contrast, were expected to show robust interactions and positive DD scores across all island types.
This prediction was only partially confirmed. Both groups showed clear adjunct-island and whether-island effects, with significant interactions and positive DD scores. The two groups diverged only on wh-islands: native speakers showed a significant interaction and positive DD scores, confirming robust island effects, whereas L2 speakers showed no significant interaction, and their DD scores hovered just above zero (DD = 0.004), indicating extremely weak, barely detectable effects.
These results align broadly with earlier findings (e.g., Bley-Vroman et al., 1988; Johnson and Newport, 1991; Martohardjono, 1993; Li, 1998; Schachter, 1989, 1990), but offer a more nuanced picture. L2 speakers are not categorically insensitive to wh-type islands altogether but differentiate within the wh-domain itself, showing stronger effects for whether- than for wh-islands. This within-domain gradience, obscured in earlier categorical designs, foreshadows the gradient pattern explored in RQ2.
3.2 RQ2. Island-Effect Magnitude and AoA Modulation
The second research question asked whether the L2 asymmetry reflects a gradient weak–strong distinction rather than a categorical absence of wh/whether-island sensitivity, and whether AoA modulates island effect magnitude. If the asymmetry is gradient rather than categorical, both groups were expected to display the same gradient hierarchy of island strength—smaller DD scores for wh/whether-islands and larger for adjunct-islands. For the L2 group, the overall magnitude of DD scores was predicted to decrease with later AoA.
This prediction was fully supported. Both groups showed the same gradient hierarchy of island strength: smallest DD scores for wh-islands, followed by whether-islands, and largest for adjunct-islands. This mirrors the weak–strong distinction in the literature (Cinque, 1990; Rizzi, 1990), indicating that L2 speakers, like natives, are sensitive to fine-grained gradience among island types. However, L2 speakers' DD scores were consistently smaller than those of native speakers across all island types and decreased systematically with later AoA. Nonetheless, this reduction was fairly uniform, at roughly 0.4–0.6 z-score units across all types (0.40 for wh-, 0.56 for whether-, 0.46 for when-, and 0.42 for because-islands), showing that AoA influenced only the magnitude of island effects, not their relative hierarchy. In other words, later AoA attenuates the overall strength of island effects without diminishing sensitivity to subtle distinctions among island types. RQ3 examines what drives this quantitative reduction, specifically whether it reflects differences in the component costs associated with dependency-length and structural-complexity.
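The size of the group gap and the shared hierarchy can be rechecked from the DD scores reported in Section 2.6. The sketch below uses only those reported values; the variable names are illustrative.

```python
# DD scores as reported in Section 2.6 (z-score units).
dd = {
    "wh":      {"native": 0.40, "l2": 0.004},
    "whether": {"native": 0.82, "l2": 0.26},
    "because": {"native": 1.18, "l2": 0.76},
    "when":    {"native": 1.25, "l2": 0.79},
}

# Native-minus-L2 gap for each island type, rounded to two decimals.
gaps = {island: round(v["native"] - v["l2"], 2) for island, v in dd.items()}

# Island types ordered from weakest to strongest island effect, per group.
native_order = sorted(dd, key=lambda k: dd[k]["native"])
l2_order = sorted(dd, key=lambda k: dd[k]["l2"])
same_hierarchy = native_order == l2_order
```

On these reported values the gaps fall between about 0.40 (wh-islands) and 0.56 (whether-islands), and both groups yield the same ordering: wh, whether, because, when.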
3.3 RQ3. Component costs and AoA modulation
The third research question asked whether group differences in island sensitivity stem from differences in component costs (dependency-length and structural-complexity) and whether these costs vary with AoA. L2 speakers were predicted to show larger dependency-length and/or structural-complexity costs than natives, reflecting greater difficulty with long-distance dependencies or complex island structures, both of which were expected to increase with later AoA.
This prediction was partially supported. L2 speakers showed larger dependency-length costs than natives, and these increased systematically with later AoA, indicating that later learners experience greater difficulty managing long-distance filler–gap dependencies. In contrast, there were no group differences in structural-complexity costs and no AoA effects. Both groups displayed the same ranking of structural-complexity scores—highest for wh-islands, followed by whether-islands, and lowest for adjunct-islands. This suggests that L2 speakers' representations of different island structures and associated parsing operations are broadly native-like regardless of AoA.
Interestingly, this ranking was the inverse of the hierarchy of island-effect magnitudes observed in RQ2: wh-islands, though most complex, produced the smallest island effects, whereas adjunct-islands, the least complex, yielded the largest. This pattern suggests that difficulty with long-distance dependencies, rather than structural complexity per se, covaries with the observable size of island effects. Because the factorial design independently estimates and controls for the additive influences of structure and dependency-length, this inverse pattern should not be interpreted as causal; that is, larger dependency-length costs do not necessarily lead to smaller island penalties. Instead, AoA appears to modulate both processes independently but in opposite directions: as AoA increases, learners experience greater difficulty with long-distance dependencies (larger dependency-length costs) while simultaneously showing weaker residual island effects (smaller DD scores).
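The inverse relation between the two rankings can be checked against the group means reported in Section 2.6. The following is a descriptive sketch only: with four island types it is not a statistical test, and the helper `ranking` is a hypothetical name.

```python
# Group means reported in Section 2.6 (z-score units).
structural_cost = {
    "native": {"wh": 0.67, "whether": 0.10, "because": 0.03, "when": -0.08},
    "l2":     {"wh": 0.74, "whether": 0.24, "because": 0.14, "when": -0.06},
}
dd_score = {
    "native": {"wh": 0.40, "whether": 0.82, "because": 1.18, "when": 1.25},
    "l2":     {"wh": 0.004, "whether": 0.26, "because": 0.76, "when": 0.79},
}

def ranking(scores, descending):
    """Return island types ordered by their score."""
    return sorted(scores, key=scores.get, reverse=descending)

# For each group: does the most-to-least complex ordering match the
# weakest-to-strongest island-effect ordering?
inverse = {
    group: ranking(structural_cost[group], descending=True)
           == ranking(dd_score[group], descending=False)
    for group in structural_cost
}
```

For both groups the two orderings coincide (wh, whether, because, when), which is the inverse pattern described above.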
Increased difficulty maintaining long-distance dependencies may therefore reduce the contrast between grammatical long-distance dependencies across that-clauses and ungrammatical island violations, making the latter less salient. Consequently, island-effect magnitudes appear smaller for L2 speakers—particularly those with later AoA—accounting for their generally weaker island effects relative to native speakers. This impact is most pronounced for wh-islands, which exhibit the greatest structural complexity and the weakest island effects in both groups. L2 speakers' overall smaller island effects, together with their greater difficulty in sustaining long-distance dependencies, likely make such violations particularly hard to detect, creating the appearance of an L2-specific asymmetry despite otherwise native-like sensitivity. This interpretation also helps explain why earlier studies often reported an apparent wh–adjunct asymmetry: later learners, who experience greater difficulty with long-distance dependencies, may show smaller and less detectable island effects—especially in structurally complex configurations—even though their underlying sensitivity to island constraints remains intact.
An open question for future research is why AoA independently increases the difficulty of maintaining long-distance dependencies while simultaneously reducing measurable island penalties. As one reviewer suggested, future studies could test whether additional cues that facilitate filler–gap integration—such as D-linked wh-phrases (Goodall, 2015)—enhance observable island sensitivity in L2 speakers. Cross-linguistic factors may also play a role. Korean, the L1 of our participants, exhibits wh-island but not adjunct-island effects (Kim and Goodall, 2016), predicting the reverse asymmetry; thus, direct surface transfer cannot account for the present pattern. Nonetheless, Korean's wh-in-situ configuration may influence how L2 speakers process long-distance dependencies, interacting subtly with AoA to modulate the magnitude of island effects.
In sum, L2 speakers show systematic, broadly native-like sensitivity to island constraints. Group differences are quantitative rather than qualitative: L2 learners display weaker overall island effects, likely reflecting greater processing difficulty with long-distance dependencies (particularly in structurally complex configurations such as wh-islands), yet they maintain the same relative hierarchy of island strength (wh < whether < adjunct). More broadly, these findings reinforce the view that island effects arise from fundamental properties of the human language system and are thus inherently available to all speakers, native and non-native alike.
Data availability statement
The datasets supporting the conclusions of this article will be made available by the author upon reasonable request.
Ethics statement
The studies involving humans were approved by the University of California, San Diego. The studies were conducted in accordance with the local legislation and institutional requirements. The participants provided their written informed consent to participate in this study.
Author contributions
BK: Writing – review & editing, Writing – original draft.
Funding
The author(s) declare that no financial support was received for the research and/or publication of this article.
Acknowledgments
I am grateful to the handling editor and the reviewers for constructive feedback that greatly improved this article.
Conflict of interest
The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Generative AI statement
The author(s) declare that no Gen AI was used in the creation of this manuscript.
Publisher's note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Supplementary material
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/flang.2025.1691687/full#supplementary-material
Footnotes
1. Note on terminology. The weak–strong distinction among island types has been defined in two main ways. One defines weak islands as those in which argument extraction is degraded but still relatively more acceptable than adjunct extraction (“selective” islands), and strong islands as those in which both types of extraction are equally unacceptable (“unselective” islands; Cinque, 1990). Another defines the contrast by violation severity, where strong islands yield greater unacceptability and weak islands milder degradation (Chomsky, 1986, Barriers). Both converge in classifying wh/whether-islands as weak and adjunct-islands as strong. The present study follows the latter definition, in line with most previous L2 research.
References
Aldosari, S., Covey, L., and Gabriele, A. (2024). Examining the source of island effects in native speakers and second language learners of English. Second Lang. Res. 40, 51–77. doi: 10.1177/02676583221099243
Aldwayan, S., Fiorentino, R., and Gabriele, A. (2010). “Evidence of syntactic constraints in the processing of wh-movement: A study of Najdi Arabic learners of English,” in Research in second language processing and parsing, eds. B. VanPatten and J. Jegerski (Amsterdam: John Benjamins), 65–86. doi: 10.1075/lald.53.03ald
Bates, D., Mächler, M., Bolker, B., and Walker, S. (2015). Fitting linear mixed-effects models using lme4. J. Stat. Softw. 67, 1–48. doi: 10.18637/jss.v067.i01
Belikova, A., and White, L. (2009). Evidence for the fundamental difference hypothesis or not? Island constraints revisited. Stud. Second Lang. Acquisit. 31, 199–223. doi: 10.1017/S0272263109090287
Bley-Vroman, R., Felix, S. W., and Ioup, G. (1988). The accessibility of universal grammar in adult language acquisition. Second Lang. Res. 4, 1–32. doi: 10.1177/026765838800400101
Chomsky, N. (1973). “Conditions on transformations,” in A festschrift for Morris Halle, eds. S. R. Anderson and P. Kiparsky (New York: Holt, Rinehart, and Winston), 232–286.
Chomsky, N. (2005). Three factors in language design. Ling. Inquiry 36, 1–22. doi: 10.1162/0024389052993655
Chomsky, N. (2008). “On phases,” in Foundational Issues in Linguistic Theory, Essays in Honor of Jean-Roger Vergnaud, eds. R. Freidin, C. P. Otero, and M. L. Zubizarreta (Cambridge, MA: MIT Press), 291–321.
Clahsen, H., and Felser, C. (2006). Grammatical processing in language learners. Appl. Psycholinguist. 27, 3–42. doi: 10.1017/S0142716406060024
Clahsen, H., and Felser, C. (2018). Some notes on the shallow structure hypothesis. Stud. Second Lang. Acquisit. 40, 693–706. doi: 10.1017/S0272263117000250
Cunnings, I. (2017). Parsing and working memory in bilingual sentence processing. Lang. Cogn. 20, 659–678. doi: 10.1017/S1366728916000675
Cunnings, I., Batterham, C., Felser, C., and Clahsen, H. (2010). “Constraints on L2 learners' processing of wh-dependencies: Evidence from eye movements,” in Research in second language processing and parsing, eds. B. VanPatten and J. Jegerski (Amsterdam: John Benjamins), 87–110. doi: 10.1075/lald.53.04cun
Goodall, G. (2015). The D-linking effect on extraction from islands and non-islands. Front. Psychol. 5:1493. doi: 10.3389/fpsyg.2014.01493
Hawkins, R., and Chan, Y.-H. C. (1997). The partial availability of Universal Grammar in second language acquisition: the failed functional features hypothesis. Second Lang. Res. 13, 187–226. doi: 10.1191/026765897671476153
Hofmeister, P., and Sag, I. A. (2010). Cognitive constraints and island effects. Language 86, 366–415. doi: 10.1353/lan.0.0223
Johnson, J., and Newport, E. (1989). Critical period effects in second language learning: the influence of maturational state on the acquisition of English as a second language. Cogn. Psychol. 21, 60–99. doi: 10.1016/0010-0285(89)90003-0
Johnson, J., and Newport, E. (1991). Critical period effects on universal properties of language: the status of subjacency in the acquisition of a second language. Cognition 39, 215–258. doi: 10.1016/0010-0277(91)90054-8
Kaan, E., and Grüter, T. (2021). “Prediction in second language processing and learning: Advances and directions,” in Prediction in second language processing and learning, eds. E. Kaan and T. Grüter (Amsterdam, Philadelphia: John Benjamins Publishing Company), 2–24. doi: 10.1075/bpa.12.01kaa
Kim, B., and Goodall, G. (2016). Islands and non-islands in native and heritage Korean. Front. Psychol. 7:134. doi: 10.3389/fpsyg.2016.00134
Kim, B., and Goodall, G. (2021). “Age-related effects on constraints on wh-movement,” in Theory and Experiment in Syntax (Routledge), 239–250. doi: 10.4324/9781003160144-20
Kim, B., and Goodall, G. (2022). The island/non-island distinction in long-distance extraction: Evidence from L2 acceptability. Glossa 7, 1–42. doi: 10.16995/glossa.5857
Kim, E., Baek, S., and Tremblay, A. (2015). The role of island constraints in second language sentence processing. Lang. Acquis. 22, 384–416. doi: 10.1080/10489223.2015.1028630
Kluender, R. (1998). “On the distinction between strong and weak islands: a processing perspective,” in Syntax and Semantics, 241–280. doi: 10.1163/9789004373167_010
Kluender, R. (2004). “Are subject islands subject to a processing account?,” in Proceedings of WCCFL, 475–499.
Kluender, R., and Kutas, M. (1993). Subjacency as a processing phenomenon. Lang. Cogn. Process. 8, 573–633. doi: 10.1080/01690969308407588
Kush, D., and Dahl, A. (2022). L2 transfer of L1 island insensitivity: the case of Norwegian. Second Lang. Res. 38, 315–346. doi: 10.1177/0267658320956704
Kuznetsova, A., Brockhoff, P. B., and Christensen, R. H. B. (2017). lmerTest package: tests in linear mixed effects models. J. Stat. Softw. 82, 1–26. doi: 10.18637/jss.v082.i13
Li, X. (1998). “Adult L2 accessibility to UG: an issue revisited,” in The generative study of second language acquisition, eds. S. Flynn, G. Martohardjono and W. O'Neil (Mahwah, NJ: Erlbaum), 232–286.
Martohardjono, G. (1993). Wh-movement in the acquisition of a second language: A cross-linguistic study of three languages with and without movement. Unpublished doctoral dissertation, Cornell University, Ithaca, NY, USA.
Maxwell, S. E., and Delaney, H. D. (2004). Designing Experiments and Analyzing Data: A Model Comparison Perspective. New York: Psychology Press. doi: 10.4324/9781410609243
Nunes, J., and Uriagereka, J. (2000). Cyclicity and extraction domains. Syntax 3, 20–43. doi: 10.1111/1467-9612.00023
Omaki, A., and Schulz, B. (2011). Filler-gap dependencies and island constraints in second language sentence processing. Stud. Second Lang. Acquis. 33, 563–588. doi: 10.1017/S0272263111000313
Perpiñán, S. (2020). Wh-movement, islands, and resumption in L1 and L2 Spanish: is (un)grammaticality the relevant property? Front. Psychol. 11:395. doi: 10.3389/fpsyg.2020.00395
Pritchett, B. L. (1992). “Parsing with grammar: islands, heads, and garden paths,” in Island Constraints (Springer Netherlands), 321–349. doi: 10.1007/978-94-017-1980-3_12
Ross, J. R. (1967). Constraints on Variables in Syntax. Doctoral dissertation, Massachusetts Institute of Technology.
Rothman, J., and Iverson, M. (2013). Islands and objects in L2 Spanish. Stud. Second Lang. Acquisit. 35, 589–618. doi: 10.1017/S0272263113000387
Schachter, J. (1989). “Testing a proposed universal,” in Linguistic perspectives on second language acquisition, eds. S. M. Gass and J. Schachter (New York: Cambridge University Press), 73–88. doi: 10.1017/CBO9781139524544.007
Schachter, J. (1990). On the issue of completeness in second language acquisition. Second Lang. Res. 6, 93–124. doi: 10.1177/026765839000600201
Schwartz, B., and Sprouse, R. (2000). “When syntactic theories evolve: consequences for L2 acquisition research,” in Second Language Acquisition and Linguistic Theory, ed. J. Archibald (Oxford: Blackwell), 156–186.
Sprouse, J., Fukuda, S., Ono, H., and Kluender, R. (2011). Reverse island effects and the backward search for a licensor in multiple wh-questions. Syntax 14, 179–203. doi: 10.1111/j.1467-9612.2011.00153.x
Sprouse, J., Wagers, M., and Phillips, C. (2012). A test of the relation between working memory capacity and island effects. Language 88, 82–123. doi: 10.1353/lan.2012.0004
Keywords: second language acquisition (SLA), island constraints, acceptability judgment, long-distance dependency, age of arrival (AoA), wh-islands, adjunct-islands
Citation: Kim B (2025) Reassessing L2 sensitivity to island constraints: asymmetries between wh-islands and adjunct-islands. Front. Lang. Sci. 4:1691687. doi: 10.3389/flang.2025.1691687
Received: 27 August 2025; Revised: 11 November 2025; Accepted: 12 November 2025; Published: 09 December 2025.
Edited by:
Pedro Guijarro-Fuentes, University of the Balearic Islands, Spain
Reviewed by:
Heather Marsden, University of York, United Kingdom
Jiayi Lu, Northwestern University, United States
Copyright © 2025 Kim. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Boyoung Kim, boyoung@sungshin.ac.kr