The Effects of L1 English Constraints on the Acquisition of the L2 Spanish Alveopalatal Nasal

This study examines whether L1 English/L2 Spanish learners at different proficiency levels acquire a novel L2 phoneme, the Spanish palatal nasal /ɲ/. While alveolar /n/ is part of the Spanish and English inventories, /ɲ/, which consists of a tautosyllabic palatal nasal+glide element, is not. This crosslinguistic disparity presents potential difficulty for L1 English speakers due to L1 segmental and phonotactic constraints; the closest English approximation is the heterosyllabic sequence /nj/ (e.g., “canyon” /kænjən/ ['kʰæn.jən], cf. Spanish cañón “canyon” /kaɲon/ [ka.'ɲon]). With these crosslinguistic differences in mind, we ask: (1a) Do L1 English learners of L2 Spanish produce acoustically distinct Spanish /n/ and /ɲ/ and (1b) Does the distinction of /n/ and /ɲ/ vary by proficiency? In the case that learners distinguish /n/ and /ɲ/, the second question investigates the acoustic quality of /ɲ/ to determine (2a) if learners' L2 representation patterns with that of an L1 Spanish representation or if learners rely on an L1 representation (here, English /nj/) and (2b) if the acoustic quality of L2 Spanish /ɲ/ varies as a function of proficiency. Beginner (n = 9) and advanced (n = 8) L1 English/L2 Spanish speakers and a comparison group of 10 L1 Spanish/L2 English speakers completed delayed repetition tasks in which disyllabic nonce words were produced in a carrier phrase. English critical items contained an intervocalic heterosyllabic /nj/ sequence (e.g., ['pʰan.jə]); Spanish critical items consisted of items with either intervocalic onset /ɲ/ (e.g., ['xa.ɲa]) or /n/ ['xa.na]. We measured duration and formant contours of the following vocalic portion as acoustic indices of the /n/ ~ /ɲ/ and /ɲ/ ~ /nj/ distinctions.
Results show that, while L2 Spanish learners produce an acoustically distinct /n/ ~ /ɲ/ contrast even at a low level of proficiency, the beginners produce an intermediate /ɲ/ that falls acoustically between their English /nj/ and the L1 Spanish /ɲ/ while the advanced learners' Spanish /ɲ/ and English /nj/ appear to be in the process of equivalence classification. We discuss these outcomes as they relate to the robustness of L1 phonological constraints in late L2 acquisition coupled with the role of perceptual cues, functional load, and questions of intelligibility.


INTRODUCTION
A lasting question that has occupied a central role in the study of second language (L2) phonology across several decades asks which factors modulate the acquisition of L2 contrastive sounds that are not part of the first language (L1) grammar. A look at the collective body of research reveals that, while there is robust evidence that novel L2 sounds are acquirable (see e.g., Broselow and Kang, 2013, for a review), it is clear that not all sounds are equal when it comes to their acquirability.
A sound's degree of difficulty can depend on a number of variables, including the existence or absence of a phonologically similar L1 sound, the functional load of the L2 sound, markedness, articulatory complexity, and language-specific constraints (featural and suprasegmental alike), among other factors.
In the present study, we examine L1 American English speakers' acquisition of the alveopalatal nasal /ñ/ in L2 Spanish. This sound is a challenge for L1 American English speakers for a number of reasons. First, this is a scenario in which the L2 sound does not exist in the L1. Second, it is the least frequent phoneme in the Spanish inventory (Melgar de González, 1976) and has low functional load in Spanish. Third, L1 segmental and phonotactic constraints complicate the L2 learning task: American English does not permit complex palatal segments and the closest approximation in the English inventory is the sequence /nj/ (e.g., "canyon" /kænjən/ ['kʰæn.jən], which is derived from Spanish cañón "canyon" /kañon/ [ka.'ñon]), which is restricted to heterosyllabic position. With these crosslinguistic differences in mind, to converge on the L2 Spanish target, the learner's grammar must come to allow a single alveopalatal nasal segment. The question that follows is whether these constraints can be overcome in the L2. While there are no L2 Spanish /ñ/ acoustic or perception data to inform our predictions for the current study,¹ we can look to a body of work that has examined L1 English speakers' acquisition of L2 Russian palatalized consonants. Specifically, L2 Russian learners have been reported to persistently rely on L1 /Cj/ sequences in both perception and production (e.g., Diehm, 1998), which leads to the prediction that L2 Spanish learners will pattern similarly to L2 Russian speakers and fail to reliably produce an alveopalatal nasal segment. In this study, we report acoustic data from a delayed repetition task completed in English and Spanish by L1 English learners of L2 Spanish at beginner and advanced levels of proficiency and in Spanish by a baseline comparison group of L1 Spanish/L2 English speakers.
While L2 Spanish learners produce an acoustically distinct /n/ ∼ /ñ/ contrast even at a low level of proficiency, the beginners produce an intermediate /ñ/ that falls acoustically between their English /nj/ and the L1 Spanish /ñ/ while the advanced learners' Spanish /ñ/ and English /nj/ appear to be in the process of equivalence classification. We discuss these outcomes as they relate to the robustness of L1 phonological constraints in late L2 acquisition coupled with the role of functional load and questions of intelligibility. In the remainder of this section, we outline the phonetic and phonological properties of the nasal segments and sequences in English and Spanish and the L2 learning task, followed by a brief overview of the L2 research that informs our research questions and predictions and the questions and predictions themselves.
¹ To our knowledge, there is only a single study of L2 Spanish /ñ/ (Díaz-Campos, 2004), which relies on impressionistic data. Although the author reports target-like production of the segment, it is not clear whether the target criteria differentiated between a single segment and a two-segment sequence. That is, it is possible that the learners were producing heterosyllabic /nj/ rather than /ñ/.

Nasal Consonants in English and Spanish
Spanish and English each have three nasal phonemes that contrast by place of articulation; however, while the Spanish nasal inventory (/m/ /n/ /ñ/) includes an alveopalatal nasal that contrasts with the other nasals in word-medial onset position,² the English inventory (/m/ /n/ /ŋ/, the latter of which is limited to coda position) does not.
1. /m/ cama /'kama/ 'bed'
   /n/ cana /'kana/ 'gray hair'
   /ñ/ caña /'kaña/ 'cane'
According to Martínez Celdrán and Fernández Planas (2007), the alveopalatal /ñ/ is comprised of an alveolar nasal segment and a "partial" glide element. This glide element is posited to be phonologically associated with the nasal segment (e.g., Colina, 2009; Bongiovanni, 2019). An alveopalatal onset is illicit in American English due to two phonological constraints. First, as noted, American English does not allow palatal consonants with complex (simultaneous or sequential) points of articulation; instead, consonantal palatalization is realized non-contrastively as a sequence of distinct consonantal and glide segments (e.g., "music" [mju:zik], cf. [mʲu:zik]) (e.g., Antonova, 1988, cited in Diehm, 1998). As a result, /nj/ will be the closest American English approximation to Spanish /ñ/. Second, American English³ /nj/ cannot occupy onset position due to a ban on onset clusters that consist of a coronal segment and /j/ (see Kulikov, 2011). Rather, /nj/ is limited to a heterosyllabic context in which /n/ occupies a syllable coda and /j/ is phonologically associated with the following syllable onset (consider, for example, "canyon" /kænjən/ ['kʰæn.jən], which is derived from Spanish cañón /kañon/ [ka.'ñon]). Together, these L1 constraints yield a learning task in which the grammar must come to allow a single nasal segment with alveolar and palatal places of articulation in syllable onset position.
To determine whether L2 Spanish learners produce a single segment (/ñ/) or a sequence (/nj/), we follow Bongiovanni (2019) by acoustically examining the vocalic portion that follows the nasal segment.⁴ Specifically, we measure duration and first and second formant (F1 and F2) contours as correlates of the phonological association of a glide element. In her comparison of the production of /nj/⁵ and /ñ/ in Buenos Aires Spanish, Bongiovanni examined reported differences in gestural timing, specifically, sequential and quasi-simultaneous alveolar and palatal contact in /nj/ and /ñ/, respectively (see e.g., Recasens and Romero, 1997). While some speakers evidenced neutralization of /nj/ and /ñ/, the speakers who preserved the contrast exhibited formant contour trajectories that differed in the first (F1) and second (F2) formants, with /nj/ showing a rise in F2 and lowering of F1 and a later F1 minimum and F2 maximum compared to /ñ/. Duration of the vocalic portion in /nj/ was longer than in /ñ/, given the glide's independent status. To our knowledge, there are no crosslinguistic comparisons of Spanish /ñ/ and English /nj/.⁶ Therefore, we rely on the tautosyllabic Spanish data to form the logical prediction that the distinction between heterosyllabic English /nj/ and Spanish /ñ/ will be qualitatively similar to Spanish /nj/ vs. /ñ/ and potentially more pronounced given the association of English /j/ with the onset of the following syllable.
Figures 1-3 illustrate the differences in formant trajectory and duration of the following vocalic portion by an advanced L2 Spanish participant of English /nj/ (Figure 1) compared with an L1 Spanish /ñ/ (Figure 2) and L1 Spanish /n/ (Figure 3), the latter of which we include as a baseline for comparison with /nj/ and /ñ/.
FIGURE 1 | Waveform and spectrogram of an advanced L2 Spanish participant's production of the English nonce item /dεnja/ "denya."
FIGURE 2 | Waveform and spectrogram of an L1 Spanish participant's production of the Spanish nonce item /deña/ "deña."
⁶ As far as we are aware, the only comparison of English /Cj/ with a language that has a complex palatal segment comes from Diehm (1998); her examination of F2 trajectories found that L1 English /C[labial]j/ formant transitions at consonantal release were longer and had a shallower slope than L2 Russian /Cʲ/.

L2 Acquisition of Complex Palatal(ized) Consonants
As mentioned, although this is the first study to our knowledge to examine L1 English speakers' acquisition of the L2 Spanish palatal nasal, we can look to a small body of research that has examined L1 English speakers' acquisition of L2 Russian palatalized consonants to inform predictions for the current study. Similarly to Spanish /ñ/, and unlike the English /nj/ sequence, Russian palatalized consonants are single complex segments with dual places of articulation that contrast phonemically with nonpalatalized counterparts. Therefore, the L2 learning task for L1 English speakers is similar. In onset position,⁷ it seems as though L2 learners are able to perceive Russian /Cʲ/ ∼ /C/ contrasts (Larson-Hall, 2004; Kulikov, 2011). However, it is not clear from these studies whether learners accurately perceive the distinction as /Cʲ/ ∼ /C/ or whether the operation of L1 constraints instead persists in driving perception of the contrast as /Cj/ ∼ /C/. Diehm (1998) and Lukyanchenko and Gor (2011) examined this question as it relates to perception of the /Cʲ/ ∼ /Cʲj/ contrast (akin to the Spanish /ñ/ ∼ English /nj/ distinction), with Diehm also reporting production data of within-language and between-language contrasts.
Regarding perception of the /Cʲ/ ∼ /Cʲj/ contrast, Lukyanchenko and Gor's data from an ABX task reflected above-chance accuracy in low-proficiency (∼73%)⁸ and high-proficiency (80%) L2 Russian learners. In an identification task, Diehm found that high-experience learners perceived /Cʲ/ and /Cʲj/ more accurately than low-experience learners when followed by a low vowel (23.5 and 10.2% more accurately, respectively) and that the increase in /Cʲ/ accuracy corresponded with a decrease in inaccurate identification of /Cʲ/ as /Cj/. While the learners in these two studies are neither at ceiling nor in line with L1 Russian accuracy, the data suggest that learners can develop perceptual acuity that at least partially circumvents the L1 phonological constraints that yield /Cj/ perception. Production data from Diehm, the only study to our knowledge to report a within-subjects comparison of L1 English and L2 Russian production data, point to partial acquisition in the oral modality. She found that even advanced L2 Russian learners' productions of /CʲV/ and /CʲjV/ syllables did not differ in F2 trajectory or duration from the point of consonantal release to the offset of the palatal element. This result lends support to the observation that L1 English learners decompose L2 /Cʲ/ into an L1-like /Cj/ sequence. Interestingly, a crosslinguistic comparison of a subgroup (n = 4) of advanced learners' L1 English and L2 Russian productions revealed that, while one learner produced L1 /Cj/ and L2 /Cʲ/ as (L1-like) /Cj/, the other three produced a distinct sequence in Russian that fell between their L1 English /Cj/ and the L1 Russian /Cʲ/ comparison. That is, although the L2 Russian /CʲV/ F2 trajectories did not approximate those of the L1 Russian comparison group, they were shorter and had a more negative slope (i.e., a larger degree of gestural overlap) than their English /CjV/.
Diehm posited that this intermediate representation was indicative of partial L2 acquisition.
Taking the L2 Russian data as a point of departure, it seems that (at least partially) overcoming the relevant L1 English constraints is possible. However, we recognize that the learning scenario is different in L2 Russian vs. L2 Spanish. Specifically, the functional load (i.e., the importance in marking contrasts in a language) of the Russian /CʲV/ ∼ /CV/ contrast is higher than the functional load of the Spanish /ñ/ ∼ /n/ contrast. Russian, which has 42 consonantal phonemes, has 15 pairs of consonants that are phonemically distinguished by palatalization; palatalized consonant phonemes range in frequency ranking from 14 to 42 (Smirnova and Chistikov, 2011). Recall that Spanish /ñ/, on the other hand, is the least frequent phoneme in the General Latin American Spanish inventory (Melgar de González, 1976), which contains 17 consonantal phonemes. Moreover, the phoneme is the only palatal consonant with dual articulation in Spanish and only contrasts in its palatalization with /n/. The phoneme thus has low functional load because it is infrequent but occurs in minimal pairs, with a low predictability of distribution. Functional load has been posited as a predictor of L2 phonological acquisition outcomes, whereby the probability of the acquisition of a contrast correlates with the functional load of that contrast (e.g., Best and Tyler, 2007). Relatedly, Archibald (2007, 2009) posits that, for learners to acquire a novel L2 contrast, there need to be sufficiently robust cues in the input to drive a revision to the representation that would allow for accurate perception of a single segment. Thus, it is wholly possible that L2 Spanish learners will not evidence the same success as L2 Russian learners. While the present study was not designed to explicitly test the effect of functional load and cue robustness, we will return to their potential role in the discussion.

Research Questions and Predictions
There are two research questions that drive the current study. The first regards whether learners acquire the relevant contrast in the L2, independent of how their /ñ/ productions compare with the L1 Spanish /ñ/.
(1a) Do L1 English learners of L2 Spanish produce acoustically distinct Spanish /n/ and /ñ/, as measured by duration and formant trajectories of the following vocalic portion?
Following the acoustic description of the nasal segments in section Nasal consonants in English and Spanish, distinct segments are predicted to take the form of a longer vocalic portion following /ñ/ than following /n/; /ñ/ is expected to present a higher F2 and a lower F1 than /n/, with an overall flatter shape for /n/ vs. /ñ/. In the case that learners distinguish /n/ and /ñ/, the second question concerns the acoustic quality of /ñ/.
(2a) If learners distinguish /n/ and /ñ/, (i) do they rely on an L1 representation to produce /ñ/ or (ii) have they overcome L1 constraints to establish a novel L2 representation? In this latter case, does the acoustic quality of the L2 representation pattern with that of an L1 Spanish representation? (2b) Does the acoustic quality of the L2 Spanish /ñ/ vary as a function of proficiency?
There are two logical outcomes. The first is that we will encounter evidence that learners have mapped Spanish /ñ/ in the input onto their representation of English /nj/. In the Speech Learning Model (SLM; Flege, 1995, 2002; Flege and Bohn, 2020), this process of "equivalence classification" is predicted to eventually yield a representation (in SLM terms, a "phonetic category") that subsumes English /nj/ and Spanish /ñ/ and has shifted to accommodate properties of both sounds. If one or both groups' /nj/ and /ñ/ pattern together, we can compare their /nj/ and /ñ/ productions with an L1 English baseline and an L1 Spanish baseline to determine where the learners might be in the equivalence classification process. The second possible outcome is that the learners have overcome the relevant L1 constraints and acquired a novel representation of Spanish /ñ/ that is distinct from their L1 English /nj/. In this case, the difference is predicted to take the form of (a) shorter duration of Spanish /ñ/ than English /nj/ and/or (b) a formant contour wherein /nj/ has a lower F1 valley and higher F2 peak than /ñ/. Based on Diehm's (1998) L2 Russian production data discussed in section L2 acquisition of complex palatal(ized) consonants, we can make tentative predictions with the caveat that the L2 Spanish developmental trajectory may diverge from that of the L2 Russian trajectory due to the status of the palatalized segments in Spanish vs. Russian. We predict that learners will distinguish /n/ from /ñ/ in production even at beginner proficiency (RQ 1). They will approximate /ñ/ in earlier stages of acquisition via L1-like /nj/; in later stages the duration and formant trajectory of /nj/ in L2 Spanish will shift toward the L2 target but will not fall within the acoustic parameters of the L1 comparison group's productions (RQ 2), resulting in an intermediate representation.

METHODS AND MATERIALS
Participants
Twenty-seven Spanish/English bilinguals participated in this study. Participants were all undergraduate or graduate students at a Midwest University at the time of testing, ranging in age from 19 to 42 (M = 25.80, SD = 5.24). The Spanish/English bilinguals were divided into three groups based on order of acquisition and level of proficiency: (1) L2 Beginner (n = 9), (2) L2 Advanced (n = 8), and (3) L1 Spanish (n = 10). The L2 Beginner and L2 Advanced groups are comprised of L1 English speakers who learned Spanish as an L2. The L1 Spanish baseline comparison group mirrors the L2 groups and consists of L1 Spanish speakers who learned English as an L2. The use of a Spanish baseline from a mirror-image bilingual group avoids the problematic comparison of bilinguals to monolinguals and acknowledges that a bilingual's systems do not act in isolation (see e.g., Grosjean, 2010). Further, as noted by an anonymous reviewer, the use of this baseline group is appropriate in the context of L2 learners in the United States, as it is often the case that learners' interactions are largely with bilingual Spanish speakers including, but not limited to, their instructors.
As measures of L2 language proficiency, L2 Spanish participants completed a 50-item multiple-choice test consisting of portions of the Diploma of Spanish as a Foreign Language (DELE) and Modern Language Association (MLA) tests that was first used in Slabakova and Montrul (2003) and has been widely used in L2 Spanish research; L1 Spanish participants completed a 50-item English proficiency cloze test adapted from the Oxford Placement Test. The L1 Spanish participants' English proficiency mirrors that of the L2 Advanced group's Spanish proficiency. Participants also completed the Bilingual Language Profile, BLP (Birdsong et al., 2012), as a proxy for language dominance. The BLP is a biolinguistic questionnaire that asks questions about bilinguals' language use, language acquisition, etc. and calculates a score for language dominance on a scale of −218 (Spanish dominant) to 218 (English dominant), with "0" indicating "balance" between the two languages. As a part of the BLP, participants also rate their Spanish and English proficiency with respect to reading, writing, speaking and understanding. Table 1 illustrates the participant demographics by group.

Materials
The experiment consisted of delayed repetition tasks (e.g., Trofimovich and Baker, 2006) in English and Spanish. There were 40 trials (10 critical, 10 control, 20 distractor) in each task. A trial consisted of a target disyllabic nonce word with penultimate stress embedded within a carrier phrase, i.e., Digo X para ti in Spanish and its equivalent "I'm saying X to you" in English. A 1,000 ms pause followed the carrier phrase, after which participants were prompted to repeat the original sentence with the question ¿Qué me dices? in Spanish and its equivalent "What are you saying to me?" in English. All items were phonotactically licit in the target language: Critical and control items in Spanish consisted of (C)CV1.ñV2 and (C)CV1.nV2 structures, respectively; critical and control items in English consisted of (C)CV1n.jV2 and (C)CV1.nV2 structures, respectively. Across conditions, V1 was a mid or low vowel (/ε/ or /ñ/ in English; /e/ or /o/ in Spanish) and V2 was /a/ in Spanish and /ñ/ in English. Distractors followed the same general (C)CV.CV structure as the control and critical stimuli. English and Spanish stimuli were recorded by phonetically trained female native speakers of Midwest American English and Northern Peninsular Spanish, respectively. Table 2 illustrates the item composition for the English and Spanish tasks. Trials were presented using E-Prime (Psychology Software Tools, Inc.); audio stimuli were presented over Sennheiser HD-280 PRO headphones through a MOTU Ultralite mk3 interface. Recordings were made in a sound-attenuated booth using a head-mounted Shure SM 10A dynamic microphone and a Marantz PMD 661 solid-state recorder at a 44.1 kHz sampling rate.

Procedure
The experiment was conducted in a single session divided into Spanish mode and English mode segments, with the language mode order counterbalanced across participants. After providing informed consent, participants started the first segment with a 10-min interview in order to establish the first language mode, followed by the delayed repetition task in that language. The English mode segment ended with completion of the BLP and the Spanish mode segment ended with the Spanish written proficiency assessment. The L1 Spanish comparison group only participated in the Spanish mode segment and completed the interview, repetition task, and English written proficiency assessment, in that order.

Duration Analysis
Acoustic Analysis
Following the research questions presented in section Research questions and predictions, this study examines the duration and formant trajectories of the following vocalic portion as acoustic indices to differentiate between nasal segments. To that end, we used Praat (version 6.1.16; Boersma and Weenink, 2020) to segment and analyze the sound files. The theoretical ceiling was 710 tokens: 30 per L2 learner (10 Spanish critical, 10 English critical, 10 Spanish control) plus 20 per L1 control (10 Spanish critical, 10 Spanish control). Eighteen tokens were removed from data analysis due to non-target productions (participants repeating, skipping or producing different segments), creaky voice, or background noise, for a final total of 692 tokens.
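As a quick check on the token counts reported above, the arithmetic can be sketched as follows. The variable names are illustrative only; the group sizes and item counts are those given in the text.

```python
# Group sizes and per-participant item counts as reported in the text.
L2_LEARNERS = 9 + 8            # beginner + advanced L2 groups
L1_SPEAKERS = 10               # L1 Spanish comparison group
TOKENS_PER_L2 = 10 + 10 + 10   # Spanish critical, English critical, Spanish control
TOKENS_PER_L1 = 10 + 10        # Spanish critical, Spanish control
EXCLUDED = 18                  # non-target productions, creaky voice, noise

ceiling = L2_LEARNERS * TOKENS_PER_L2 + L1_SPEAKERS * TOKENS_PER_L1
analyzed = ceiling - EXCLUDED
print(ceiling, analyzed)  # 710 692
```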
During segmentation, we used the following cues to determine the onset and offset of the vocalic portion: (1) the visual presence of an abrupt change in formant structure and frequencies (onset) and (2) a breaking up of the formant structure and a loss of energy and periodicity in the waveform (offset). Following Bongiovanni (2019), boundaries between formant transitions or between the glide and the vowel /a/ were not marked. Figures 4, 5 illustrate two L2 productions of Spanish /ñ/ and their segmentation: Figure 4 aligns with the L1 /ñ/ in Figure 2, in which /ñ/ is represented by a steeper transition (i.e., slope) from the offset of the nasal into the following vowel /a/ (when compared with that of /nj/). Figure 5, on the other hand, aligns more closely with the L1 English /nj/ in Figure 1, in which the formant transition between the nasal segment and following vowel is marked by a rise in F2 frequency and a decrease in F1. After segmentation, we analyzed the sound files by using Praat scripts to extract the measurements (Hirst, 2012  (2012), for semi-automatic formant measurements). Formant measurements were taken at 20 points within the vocalic portion (i.e., every 5%).
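The time-normalized sampling described above can be illustrated with a minimal sketch. It is not the Praat script used in the study: the function name `sample_times` is hypothetical, and the choice to start at the onset and step in 5% increments (0%, 5%, ..., 95%) is an assumption, since the text does not state whether the endpoints are included.

```python
def sample_times(onset, offset, n_points=20):
    """Return n_points equally spaced times (s) across a vocalic portion.

    Assumption: sampling starts at the onset and advances in steps of
    1/n_points of the duration (5% steps for the default of 20 points).
    """
    if offset <= onset:
        raise ValueError("offset must come after onset")
    step = (offset - onset) / n_points
    return [onset + k * step for k in range(n_points)]

# A hypothetical 200 ms vocalic portion starting at 100 ms.
times = sample_times(0.100, 0.300)
```

Each returned time would then be the point at which F1 and F2 are queried, which makes trajectories of different absolute durations comparable on a common 0-100% axis.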

Statistical Analysis
For duration of the vocalic portion, in order to normalize for potential between-participants differences in speech rate, we transformed raw duration to z-scores for each participant, with separate transformations for the L2 participants in English and Spanish. While English z-scores were computed over both /nj/ items and /n/ items, only /nj/ items were included in the analysis since the English /n/ data are not relevant to our research questions. In consideration of the sample size, rather than fitting the data to linear mixed-effects models, we follow Plonsky (2015, p. 30) and instead rely on a combination of 95% confidence intervals (CIs) and effect sizes (here, Hedges' g, which corrects for bias from small sample size) to evaluate between-subjects and within-subjects differences. When using CIs for between-participants comparisons, a difference in means is significant when one group's mean does not fall within the comparison group's CI (Plonsky, 2015, p. 40). For within-groups comparisons, two means are considered significantly different if the CI of the difference between the two means does not cross zero (Cumming and Finch, 2005). Small, medium, and large effect size thresholds are based on Plonsky and Oswald (2014), whereby between-participants thresholds are 0.40, 0.60, and 1.00, and within-groups thresholds are 0.60, 1.00, and 1.40, respectively.
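The normalization and effect-size steps described above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' procedure: the function names are hypothetical, the z-transform uses the sample standard deviation, and `hedges_g` implements the common independent-groups form (Cohen's d with a pooled standard deviation multiplied by the approximate small-sample correction), which may differ from the within-groups computation used in the study. The threshold values are those cited in the text.

```python
import math
from statistics import mean, stdev

def zscores(durations):
    """Z-transform one participant's raw durations (sample SD is an assumption)."""
    m, s = mean(durations), stdev(durations)
    return [(d - m) / s for d in durations]

def hedges_g(group_a, group_b):
    """Independent-groups Hedges' g: pooled-SD Cohen's d times a small-sample correction."""
    na, nb = len(group_a), len(group_b)
    pooled_var = ((na - 1) * stdev(group_a) ** 2 + (nb - 1) * stdev(group_b) ** 2) / (na + nb - 2)
    d = (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)
    correction = 1 - 3 / (4 * (na + nb) - 9)
    return d * correction

def effect_label(g, within_groups=False):
    """Label |g| using the between-participants or within-groups thresholds cited in the text."""
    small, medium, large = (0.60, 1.00, 1.40) if within_groups else (0.40, 0.60, 1.00)
    size = abs(g)
    if size >= large:
        return "large"
    if size >= medium:
        return "medium"
    if size >= small:
        return "small"
    return "negligible"
```

Because the same value of g is read against stricter thresholds in within-groups comparisons, an effect that counts as small between participants can be negligible within a group.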
For the formant trajectories of the vocalic portion, we followed the analysis carried out in Bongiovanni (2019). We transformed the formant values to Bark units and a Smoothing Spline ANOVA (SSANOVA) was fit to the data (time points and corresponding Bark units at each time point) in R, version 4.0.2 (R Core Team, 2020) with the gss package. As part of this analysis, a smoothing spline fits a smooth curve to the data and the SSANOVA determines whether the curves in question are statistically different from one another. Statistical significance is indicated by non-overlapping confidence intervals around the splines. Following previous research (e.g., Simonet et al., 2008; Nance, 2014; Kirkham, 2017; Bongiovanni, 2019), we report only the graphical representations of the SSANOVA.
Table note: g, Hedges' g. a, CI of the difference between the two means. b, Hedges' g > 0.6. c, CI of the difference does not cross zero.
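The Hz-to-Bark transformation mentioned above can be illustrated with a short sketch. The text does not say which conversion formula was used; the version below assumes Traunmüller's (1990) approximation without its low- and high-frequency corrections, and the function name `hz_to_bark` is hypothetical.

```python
def hz_to_bark(freq_hz):
    """Convert a frequency in Hz to Bark.

    Assumption: Traunmueller's (1990) approximation, z = 26.81 f / (1960 + f) - 0.53,
    without the end corrections for very low or very high values.
    """
    return 26.81 * freq_hz / (1960.0 + freq_hz) - 0.53

# Hypothetical formant values (Hz) at one time point of a vocalic portion.
f1_bark = hz_to_bark(500.0)
f2_bark = hz_to_bark(2000.0)
```

Working in Bark compresses higher frequencies, so a difference of a given size in Bark is intended to be perceptually comparable across F1 and F2.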
FIGURE 6 | Z-score transformed duration of Spanish /n/ and /ñ/ produced by the beginner and advanced L2 groups and L1 Spanish group.

Research Question 1
Duration
To determine whether learners produce distinct /n/ and /ñ/ as measured by duration and whether durational differences are moderated by proficiency, the analysis included within-subjects comparisons of the learners' English and Spanish productions as well as between-groups comparisons of the beginner vs. advanced learners. As shown in Table 3 and Figure 6, the durational difference for /n/ and /ñ/ (whereby the vocalic portion following /ñ/ would be predicted to be longer than that following /n/) was not significant for the beginner group and the effect size was negligible, with a large CI indicative of substantial variation within the group. However, the difference was significant for the advanced group with a medium effect size. All between-group comparisons were significant: The advanced group falls between the beginner group and L1 Spanish group, who make an even larger durational distinction between /n/ and /ñ/.

Formant Trajectories
For the differences in formant structure, recall that with SSANOVA, statistical significance is indicated by non-overlapping confidence intervals plotted around the data-generated formant curves. With that in mind, Figures 7, 8 present the results of the SSANOVA for the beginner (Figure 7) and advanced (Figure 8) L2 groups' productions of English /nj/, Spanish /ñ/, and Spanish /n/. For the beginner L2 group, the formant trajectories are marked by non-overlapping confidence intervals. There is zero overlap comparing F1 in /n/ and /ñ/ and no overlap between 0 and 85% of the F2 curve, with maximum differences of 0.94 and 2.41 Bark units, respectively. Similar results present for the advanced L2 group: There is zero overlap in the confidence intervals for both F1 and F2 when comparing /n/ and /ñ/, with a maximum difference of 1.24 Bark units and 2.52 Bark units for F1 and F2, respectively.
To summarize the results for RQ 1: While the beginner group showed no significant difference between /n/ and /ñ/ in terms of duration, there was a significant difference in the formant trajectories. The advanced group showed a significant difference between /n/ and /ñ/ for both duration and formant trajectories.

Research Question 2
Duration
Because the beginner group does not produce durationally distinct /n/ and /ñ/ and the advanced group does, we limit our duration analysis as it relates to RQ 2 to the advanced data. We examined whether the advanced group's duration of /ñ/ reflects (i) perceptual mapping of /ñ/ to English /nj/ or (ii) development of a novel L2 representation. We first compared the advanced learners' /ñ/ to their English /nj/ and to the L1 English baseline (i.e., the beginner group's English /nj/). Figure 9 provides a visual indication of the proximity of the advanced learners' /ñ/ to their /nj/ as well as the proximity of the advanced /nj/ to the L1 English baseline (beginner) /nj/. The visual trends are supported by the data in Table 4; the advanced group did not make a significant durational distinction and the effect size did not reach the minimum threshold for a small effect. Moreover, a between-group comparison of the beginner and advanced /nj/ (Table 5) shows no difference.
Interestingly, Figure 6 also shows the advanced learners' /ñ/ trending with the L1 /ñ/, and while the data in Table 6 show that the advanced mean falls outside the L1 CI, the L1 CI falls on the edge of the advanced CI and the effect size does not approach the minimum threshold for a small effect. We also see in the same table that the L1 /ñ/ is not different than the advanced /nj/.

Formant Trajectories
To inform the nature of the L2 groups' production of Spanish /ñ/, the formant structures were subject to a within-groups comparison (L2 Spanish /ñ/ vs. L1 English /nj/) and a betweengroups comparison (L2 Spanish /ñ/ vs. L1 Spanish /ñ/). Recall that, in addition to non-overlapping confidence intervals, differences between English /nj/ and Spanish /ñ/ are expected to present in the form of a higher F2 peak and lower F1 valley for English /nj/ vs. Spanish /ñ/. Keeping that in mind, Figures 7,  8 illustrate the within-groups comparisons for Beginner and Advanced L2 groups, respectively, and Figure 10 illustrates the between-groups comparison.
For the Beginner L2 group, the within-groups comparison (Beginner /ñ/ vs. Beginner /nj/) revealed no overlap in confidence intervals for F1 (maximum difference = 0.47 Bark), with the exception of the point where the two curves cross at 35-45%. The same pattern presents for F2, with no overlap in confidence intervals between 0-35% and 70-100% (maximum difference = 0.51 Bark) and an overlap between 35 and 70% where the formant curves cross. The formant trajectories reflect the expected differences between /nj/ and /ñ/, i.e., a lower F1 valley and a higher F2 peak for /nj/ vs. /ñ/. The between-groups comparison (Beginner /ñ/ vs. L1 Spanish /ñ/) revealed no overlap in F1 confidence intervals between 0-55% and 85-100% (maximum difference = 0.32 Bark), with an overlap between 55 and 85% where the curves cross. Further, there was no overlap in F2 confidence intervals between 0 and 65%, with a maximum difference of 0.51 Bark.
For the advanced L2 group, the within-groups comparison (Advanced /ñ/ vs. Advanced /nj/) revealed that the F1 confidence intervals overlap between 0 and 50% and then run adjacent to one another from 50 to 100%. The F2 confidence intervals overlap at the beginning of the formant trajectory (0-20%) and then again when the curves cross around 65%. There is no overlap between 20 and 55% nor between 70 and 100% (maximum difference = 0.21 Bark). Visually, the formant trajectories illustrate a higher F2 peak for English /nj/ vs. Spanish /ñ/ but a lower F1 valley for Spanish /ñ/ vs. English /nj/. However, the confidence intervals overlap at both of these points, rendering this distinction non-significant. The between-groups comparison (Advanced /ñ/ vs. L1 Spanish /ñ/) revealed no overlap in the F1 confidence intervals, with a maximum difference of 0.64 Bark. The F2 confidence intervals do not overlap at the beginning and the end of the trajectory (between 0 and 20% and between 75 and 100%), with a maximum difference of 0.22 Bark. Visually, the Advanced /ñ/ demonstrates a higher F2 peak (but not a lower F1 valley) when compared with the L1 control /ñ/.

FIGURE 9 | Z-score transformed duration of Spanish /n/, Spanish /ñ/, and English /nj/ produced by the beginner and advanced L2 groups.
To summarize the results for RQ 2: As with RQ1, there were no differences in duration within learner groups or between learner groups and the L1 Spanish group. In terms of the beginners' formant trajectories, there was a significant three-way distinction between the beginners' Spanish /ñ/, their English /nj/, and the L1 Spanish /ñ/. In contrast, the considerable overlap between the advanced /nj/ and /ñ/ formant contours suggests a lack of difference. However, the comparably limited overlap between the advanced /ñ/ and L1 /ñ/ contours indicates a significant difference in formant trajectories (see section structural equation modeling for details).
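The interval comparisons reported in this section can be illustrated with a minimal sketch. It assumes that each formant contour is available as a mean with lower and upper confidence bounds at shared normalized time points; the function names and the toy contours are hypothetical, are not the study's data, and do not reproduce the spline fitting itself.

```python
from typing import List, Sequence, Tuple

Interval = Tuple[float, float]  # (lower bound, upper bound)


def non_overlap_regions(
    times: Sequence[float],
    ci_a: Sequence[Interval],
    ci_b: Sequence[Interval],
) -> List[Tuple[float, float]]:
    """Return (start, end) spans of normalized time where the two CIs do not overlap."""
    regions = []
    start = None
    prev = None
    for t, (lo_a, hi_a), (lo_b, hi_b) in zip(times, ci_a, ci_b):
        separated = hi_a < lo_b or hi_b < lo_a
        if separated and start is None:
            start = t  # a separated span opens here
        elif not separated and start is not None:
            regions.append((start, prev))  # span closed at the previous point
            start = None
        prev = t
    if start is not None:
        regions.append((start, prev))
    return regions


def max_mean_difference(mean_a: Sequence[float], mean_b: Sequence[float]) -> float:
    """Largest absolute distance between the two mean contours."""
    return max(abs(a - b) for a, b in zip(mean_a, mean_b))


# Hypothetical toy contours (Bark) sampled every 5% of the vocalic portion.
times = list(range(0, 101, 5))
mean_a = [4.0 + 0.004 * t for t in times]  # rising contour
mean_b = [4.2 - 0.004 * t for t in times]  # falling contour; crosses mean_a at 25%
half_width = 0.03                          # assumed constant CI half-width
ci_a = [(m - half_width, m + half_width) for m in mean_a]
ci_b = [(m - half_width, m + half_width) for m in mean_b]

regions = non_overlap_regions(times, ci_a, ci_b)
largest_gap = max_mean_difference(mean_a, mean_b)
```

With these toy values the intervals are separated from 0 to 15% and from 35 to 100%, and they overlap around the point where the curves cross, which is the same pattern of description used for the contours above.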

DISCUSSION
This study examined the speech production of beginner and advanced groups of L1 English/L2 Spanish learners and a comparison group of L1 Spanish/L2 English speakers to determine whether their production evidences a phonemic distinction between Spanish /n/ and /ñ/, and, if so, whether learners establish the contrast via L1 English /nj/ or creation of a novel L2 representation. Between-segment patterns were established via two acoustic indices: Z-score transformed durations of the vocalic portion that follows the nasal segment and formant trajectories of the same vocalic portion. Durational differences were predicted to take the form of longer duration for /nj/ than /ñ/ and for /ñ/ than /n/. The formant trajectory for /nj/ was expected to consist of an F1 with a lower valley and an F2 with a higher peak compared with /ñ/; /n/ was predicted to evidence an earlier and higher F1 peak and an overall lower F2 contour.

Before we turn to the discussion of the results, there are two notes regarding our analysis. First, because this is the first study to measure these sounds in L1 English/L2 Spanish bilinguals and there are no data to inform relative cue strength for these contrasts, we avoid the arbitrary assignment of relative weights to the indices of duration, vowel height (F1), and vowel frontedness (F2). Instead, we treat them as three separate strategies that speakers may use to distinguish between these nasal segments. Second, we eschew an arbitrary quantification of how little overlap in the confidence intervals of the spline curves constitutes a meaningful difference between the nasal segments and focus our qualitative interpretation on the first half of the formant trajectories (0-50%). In doing so, we home in on the nature of the transitions in /nj/ vs. /ñ/ rather than differences in the vocalic portion, which is expected to differ due to crosslinguistic differences in the following vowel ([ə] in English vs. [a] in Spanish).

Research Question 1
Our first research question asked whether L1 English learners' L2 Spanish production reflects a distinction between /n/ and /ñ/ in Spanish and if that distinction is subject to differences in proficiency. Beginning with the duration data, our beginner group did not evidence a significant difference in duration between /n/ and /ñ/ but the advanced group did, producing longer vocalic portions following /ñ/ than /n/ with a medium effect size. This difference between the learner groups suggests that the durational difference increases as a function of L2 proficiency. In comparison to the advanced group, however, the L1 Spanish group distinguishes via duration to a greater degree. Thus, while the advanced L2 Spanish and L1 Spanish both distinguish the segments via duration, the degree to which the L2 group does so does not approximate the L1 Spanish comparison.
If we were to use duration as the only acoustic index of the /n/∼/ñ/ distinction, we would conclude that beginner learners have not acquired the /ñ/ phoneme in Spanish since there is no durational difference between Spanish /n/ and /ñ/. However, the formant trajectory data indicate that the beginners and advanced learners alike utilize height (F1) and frontedness (F2) of the vocalic portion to distinguish /n/ and /ñ/. Figures 7, 8 illustrate majority non-overlap between the F1 and F2 formant contours for both beginners and advanced learners. In other words, in response to RQ1, the data indicate that yes, both groups of learners produce acoustically distinct Spanish /n/ and /ñ/ segments. Further, the distinction varies by proficiency, with the beginners relying largely on F1 and F2 structure and the advanced learners utilizing both F1 and F2 structure and duration.
Given that the Spanish /n/∼/ñ/ contrast varies by proficiency, the question that follows is: Why do beginner learners use vowel height and frontedness to make a distinction, but not duration? There are two points to consider. First, it could be the case that duration is not the primary cue that learners attend to in the input to distinguish the /n/∼/ñ/ contrast, but rather that it is a later-acquired cue that learners have available to them at advanced proficiencies (see e.g., Kong and Lee, 2018, for discussion of the effects of proficiency on L2 cue-weighting strategies). Second, we remind the reader of the large standard deviation values for the duration results (Table 3), which indicate substantial variation within each proficiency group. For a more nuanced understanding of the relationship between proficiency and the use of duration, we plotted each learner's durational difference by their Spanish proficiency score (Figure 11).
While the group data indicated that the use of duration to differentiate /n/ and /ñ/ was restricted to advanced proficiency, the individual data plotted by proficiency score indicate a range of durational difference across scores without a discernible pattern. That is, we do not see a clear relationship between an increase in proficiency score and an increase in duration difference to maximize the distance between /n/ and /ñ/. This visualization is bolstered by a very weak positive correlation [r (17) = 0.08, p = 0.760]. Thus, the individual data are suggestive of individual differences in cue weighting, which have been documented in L2 acquisition (e.g., Chandrasekaran et al., 2010; Clayards, 2018), and specifically in the use of duration vs. formants in vowel discrimination (Kim et al., 2018). To confirm this hypothesis, we will need data from the perception of stimuli that isolate the acoustic indices and their possible combinations. Perception data from L1 and L2 Spanish speakers will inform the relative cue strength used by early vs. late learners of Spanish, and longitudinal examination will inform whether L2 cue-weighting strategies change as a function of proficiency, as reported in Kong and Lee (2018). In addition, L1 Spanish perception of the L2 Spanish production data will be necessary to confirm that the quantitative differences in duration and formant contours are meaningful (in this case, perceivable).
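As a rough illustration of how a correlation coefficient translates into a p-value, the sketch below converts a Pearson r into a t statistic and integrates Student's t density numerically. It assumes, without confirmation from the text, that the value in parentheses is the number of learners (9 + 8 = 17), so that the degrees of freedom are n - 2 = 15. The function names are hypothetical, and this is not necessarily the procedure used in the study.

```python
import math


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(r: float, n: int, steps: int = 2000) -> float:
    """Two-tailed p-value for a Pearson r computed from n paired observations.

    Assumes df = n - 2. `steps` must be even for Simpson's rule.
    """
    if not -1 < r < 1 or n < 3 or steps % 2:
        raise ValueError("need -1 < r < 1, n >= 3, and an even number of steps")
    df = n - 2
    t = abs(r) * math.sqrt(df / (1 - r * r))
    # Simpson's rule for the area under the density between 0 and t.
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 2 * (0.5 - area)


# Assumption: n = 17 learners (9 beginner + 8 advanced).
p_weak_positive = two_tailed_p(0.08, 17)
```

Under that assumption, r = 0.08 yields a p-value of approximately 0.76, in line with the reported value. Such a coefficient is therefore far from conventional significance thresholds with a sample of this size.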

Research Question 2
Our second research question concerned the quality of the learners' Spanish /ñ/. That is, (a) do they rely on their L1 English /nj/ to approximate the novel Spanish contrast, or (b) have they overcome L1 constraints and established a single segment? We begin with the duration results, which are limited here to the advanced group since the beginner group did not use duration to contrast Spanish /n/ vs. /ñ/. Since the learners' Spanish /ñ/ was not different from their English /nj/, we posit that they do not use duration to differentiate them. Solely based on this outcome, we might conclude that the learners rely on English /nj/ to approximate the L2 Spanish /ñ/ target. However, neither of these was different from the L1 Spanish /ñ/. What might explain a scenario in which a learner's L1 and L2 sounds do not differ from each other and also do not differ from the L2 target? One possibility is that the learners' L2 Spanish /ñ/ has affected their L1 English /nj/. L2 influence on the L1 aligns with a scenario of equivalence classification in which /ñ/ is initially mapped onto /nj/ and, over time, the representation shifts in the direction of the L2 sound. Nevertheless, the advanced learners' English /nj/ did not differ from the English baseline (i.e., the beginner English /nj/ data). These inconclusive findings cast doubt on the reliability of duration as an acoustic correlate in this case, at least at the group level. A look at the individual-level duration difference between the advanced learners' English /nj/ and Spanish /ñ/ (Figure 12, proficiency scores 40-50) supports the group data, with all but one advanced participant's differences clustered around zero. While there was a weak negative correlation between proficiency score and duration difference [r (17) = −0.30, p = 0.237], the weakness is likely due to the variation in the beginners' duration differences (proficiency scores < 30).
Turning to the formant data, we first compare the L2 learners' English /nj/ and Spanish /ñ/, followed by the L1 Spanish and L2 Spanish /ñ/. The beginners use vocalic quality to distinguish between English /nj/ and Spanish /ñ/: They differentiate via vowel height (F1) and frontedness (F2) as illustrated by nonoverlapping formant contours in the first half of the vocalic portion that follows the nasal segment (see Figures 7, 8). The advanced learners, however, do not differentiate via F1, and the F2 contours overlap at the critical onset. Comparison of Spanish /ñ/ across the three groups (Beginner, Advanced, L1 Spanish) shows a clear difference between the Beginner /ñ/ and L1 Spanish /ñ/ via F1 and F2, with no overlap in the critical regions. The Advanced /ñ/ and L1 Spanish /ñ/ comparison, however, is less straightforward. For F1, there is no overlap, although the shape of the formant contour is similar; for F2, there is no overlap at the onset.
Based on these comparisons of duration and formant contours, our tentative response to RQ 2a is that (i) the advanced group relies on their L1 /nj/ representation when producing Spanish /ñ/ while the beginner group does not, and (ii) neither group approximates the L2 Spanish target as measured by an L1 Spanish baseline.⁹ In the case of the beginner group, they appear to have established an intermediate representation, although, as we note in our discussion of RQ 1, we will need perception data to determine whether the attested quantitative differences are perceivable. Considering that all of the maximum differences fell below the JND threshold of 1 Bark unit, such perception data are particularly warranted. Regarding RQ 2b, the acoustic realization of Spanish /ñ/ varies as a function of proficiency: The beginner and advanced realizations differ from each other according to height and frontedness. Both groups differ from the L1 Spanish baseline along both parameters, although the advanced group approximates the L1 more closely than the beginner group.
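The JND observation can be checked directly against the maximum differences reported in the Formant Trajectories section. The sketch below collects those values and compares them with the 1 Bark threshold. The dictionary keys are hypothetical labels, and the Hz-to-Bark conversion shown is Traunmüller's (1990) formula, which is an assumption included for orientation only, since the conversion used in the study is not stated in this section.

```python
JND_BARK = 1.0  # threshold cited in the text


def hz_to_bark(freq_hz: float) -> float:
    """Traunmüller (1990) approximation; assumed here, not confirmed as the study's formula."""
    return 26.81 * freq_hz / (1960.0 + freq_hz) - 0.53


# Maximum differences (Bark) as reported in the results; key names are illustrative.
max_differences = {
    "beginner_within_F1": 0.47,
    "beginner_within_F2": 0.51,
    "beginner_vs_L1_F1": 0.32,
    "beginner_vs_L1_F2": 0.51,
    "advanced_within_F2": 0.21,  # reported after the F2 description
    "advanced_vs_L1_F1": 0.64,
    "advanced_vs_L1_F2": 0.22,
}

# True for every comparison whose largest difference stays under the threshold.
below_jnd = {name: diff < JND_BARK for name, diff in max_differences.items()}
largest = max(max_differences.values())
all_below = all(below_jnd.values())
```

Every reported maximum, including the largest (0.64 Bark for the advanced vs. L1 Spanish F1 comparison), falls under the threshold, which is why perceivability cannot be assumed from the production data alone.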
The finding that beginner and advanced L2 learners differ from one another as well as from the L1 Spanish baseline is not unexpected; intermediate representations have been commonly documented in L2 production research (see Zampini, 2008, for a review; see also, e.g., Broselow and Kang, 2013). In fact, recall that this is what Diehm (1998) found when comparing advanced L2 Russian learners' productions of palatalized consonants (section L2 acquisition of complex palatal(ized) consonants). The unexpected result, however, is that the advanced learners' productions (and not the beginners') show a persistent L1 effect. A common L2 developmental trajectory consists of initial pervasive L1 influence on the L2. Over time, these effects are thought to lessen as the L2 grammar develops, eventually yielding an L2 representation that (often partially) converges on the L2 target. This attested pattern has been formalized in models such as Major's (2001) Ontogeny Phylogeny Model (OPM), which explains the relationship between transfer, universals, and similarity. Of particular relevance to the present case is the OPM's Similarity Corollary, which posits that L1 transfer effects are persistent in later stages of development when the L1 and L2 phenomena are similar. These effects are thought to limit the role of universals, access to which is necessary to overcome L1 constraints, slowing down the L2 acquisition process.

⁹ Recalling that our L1 baseline consists of bilingual Spanish/English speakers, it is possible that a comparison of our baseline group to Spanish monolinguals could reveal differences in /ñ/ production. However, we are limited in the current study to Spanish data from the baseline, and future research will need to include their English data to examine the potential effect of L2 English /nj/ on L1 Spanish /ñ/.
While the advanced learners produce the relevant L1 and L2 sounds similarly and thus align with this pattern, consideration of the beginner and advanced data in tandem suggests a case of U-shaped learning that can be likened to phonological regression attested in child phonological development (see e.g., Tessier, 2019). That is, it is possible that learners initially establish a novel (albeit intermediate) representation. Later, they recognize that an established L1 representation can be redeployed in the L2 (and potentially without compromised intelligibility, see discussion in section Results), which triggers the mechanism of equivalence classification. As noted in section L2 acquisition of complex palatal(ized) consonants, the shared representation is predicted to eventually shift to accommodate properties of both sounds. As we did with duration above, we can gauge the potential shift of /nj/ by comparing the Advanced /nj/ with the L1 English baseline (here, Beginner /nj/) (Figure 13).
What we find is that the F1 contour is indeed different, but in the opposite direction of what would be predicted (i.e., a higher, rather than lower, F1 valley), and there is no difference in F2. Thus, there is no evidence of a shift toward Spanish.

[...] require attunement to target-like category boundaries"; rather, construction of the relevant feature matrices "requires only the detection of acoustic correlates of phonological features in the raw percepts" (p. 16).¹¹ Triangulation of our production data with perception data that reflect both categorization and discrimination will provide further insight into the learners' developmental trajectories.
Returning to the question of intelligibility, if an L2 speaker's message is not at risk of being lost, what are the practical implications of these findings? Pronunciation pedagogy objectives have shifted away from adherence to native-speaker norms and toward intelligibility (see e.g., Levis, 2018, for discussion). With this shift in mind, if intelligibility is not compromised, we posit that the limited instructional time that teachers have to dedicate to pronunciation does not need to be spent on this contrast. In fact, if /ñ/ is addressed in Spanish pedagogical materials, the English heterosyllabic /nj/ is typically used as a teaching tool, with statements such as "In speech, this letter [<ñ>] sounds like the middle sound in 'canyon' and, in fact, the Spanish word for 'canyon' is cañon" (Diversity Style Guide, 2020). This type of information could actually reinforce an intermediate /nj/ representation via conversion of explicit knowledge to implicit knowledge (see e.g., Ellis, 2015, for discussion of this relationship). As a complement to the longitudinal observation of perception and production we have proposed, debriefing data on learners' experience with explicit instruction will help elucidate the effects of formal instruction on fossilization.

¹¹ See Baker (2004) for an overview of feature geometry analyses of Spanish palatals.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the University of Illinois Office for Protection of Research Subjects. The patients/participants provided their written informed consent to participate in this study.