This article was submitted to Language Sciences, a section of the journal Frontiers in Psychology.
This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
The speech of late second language (L2) learners is generally marked by an accent. The dominant theoretical perspective attributes accents to deficient L2 perception arising from a transfer of L1 phonology, which is thought to influence L2 perception and production. In this study we evaluate the explanatory role of L2 perception in L2 production and explore alternative explanations arising from the L1 phonological system, such as the role of L1 production. Specifically, we examine the role of an individual’s L1 productions in the production of L2 vowel contrasts. Fourteen Spanish adolescents studying French at school were assessed on their perception and production of the mid-close/mid-open contrasts, /ø-œ/ and /e-ε/, which are, respectively, acoustically distinct from Spanish sounds, or similar to them. The participants’ native productions were explored to assess (1) the variability in the production of native vowels (i.e., the compactness of vowel categories in F1/F2 acoustic space), and (2) the position of the vowels in the acoustic space. The results revealed that although poorly perceived contrasts were generally produced poorly, there was no correlation between individual performance in perception and production, and no effect of L2 perception on L2 production in mixed-effects regression analyses. This result is consistent with a growing body of psycholinguistic and neuroimaging research that suggests partial dissociations between L2 perception and production. In contrast, individual differences in the compactness and position of native vowels predicted L2 production accuracy. These results point to the existence of surface transfer of individual L1 phonetic realizations to L2 space and demonstrate that pre-existing features of the native space in production partly determine how new sounds can be accommodated in that space.
Learning a foreign language in adulthood (and often much earlier) is generally associated with difficulties in producing sounds of this language (
The dominant psycholinguistic perspective (
Both surface and abstract transfer accounts (links A and B) agree that L2 sounds are processed as a function of their perceived similarity to the transferred L1 categories. Depending upon this similarity, L2 sounds are either perceptually assimilated to native sounds, that is, integrated into an existing L1 category, or not [Perceptual Assimilation Model (PAM for naïve listeners and PAM-L2 for L2 learners),
Assimilation of L2 sounds to L1 categories is assumed to take place not only in the perception of individual L2 sounds but also in the perception of L2 contrasts. According to the framework of
The L2 category representations (established in perception) are claimed to be used in L2 production (link C in
There are other studies, however, showing (1) no correlation (
In the current study we examine the links A, C, and D (presented in
Firstly, we assessed the role of the perceptual similarity of L2 contrasts to native phonological categories during L2 perception (abstract transfer account, link A). We used a five-forced-choice identification (5FCI) task. This task, unlike an ABX or two-forced-choice identification task, does not limit responses to the two members of an L2 contrast and thus makes it possible to assess the phonological perception of L2 sounds more broadly (i.e., obtain a confusion matrix). Isolated vowels were used, due to known consonant context effects on the perception of French front rounded vowels by late L2 speakers (
Secondly, we explored the relationship between perception and production (link C) of French contrasts; more specifically, we tested the hypothesis that L2 production performance is guided by L2 perception accuracy. In order to do so we (1) compared group and individual performance in L2 perception and production (i.e., with correlations), and (2) assessed the role of L2 perception in predicting L2 production in by-subject mixed-effects regression analyses. It should be noted that only a few studies have used multiple analyses in testing the L2 perception-production relationship (
In order to explore this relationship we also assessed the production of /e/-/ε/ and /ø/-/œ/ contrasts. Two L2 production tasks, repetition and vowel naming, were used in order to tap into the acoustico-phonetic and phonological representations underlying L2 production, respectively. Repetition involves the activation of the auditory-motor loop: the phonological (sensory) auditory pattern is kept in working memory and transferred to the articulators. Importantly, people tend to unconsciously imitate the acoustic patterns that are phonologically irrelevant when asked to repeat (
Thirdly and crucially, we looked for evidence in favor of surface transfer in L2 production by exploring the role of individual-specific L1 phonetic categories (i.e., productions of individuals in their native languages) in the production of two L2-French vowel contrasts (link D). If L2 production is indeed relatively independent from L2 perception, we need to find another mechanism to explain why L2 productions are marked by a recognizable L1 accent. Native productions are highly variable between and within speakers. Between-speaker variability refers to the position of native categories in F1/F2 space (e.g., some L1-Spanish speakers produce the Spanish /e/ vowel with a more closed mouth, whereas others produce it with a more open mouth) and within-speaker variability to the variability in production of the same sound by the same speaker (i.e., acoustic compactness of native vowel categories). Is it possible that variability in accents across L2 speakers is partly due to variability in production of native sounds (individual use of L1 speech)?
Individual differences in the position and compactness of a native vowel category in L1 production have been shown to predict perception performance on L2 sounds similar to this category. Spanish speakers whose individual /e/ category was closer to the French /e/ category (reflected by the acoustic distance between the two sounds) identified it better than those whose L1 category was farther from it (
Cross-language compactness of productions in L1 acoustic space has been shown to have an impact on the perception of foreign sounds by naïve listeners (
In order to test for the existence of surface transfer in L2 production (link D) we recorded the Spanish productions of the participants and analyzed them in terms of two properties of individual phonetic realizations: the position of the /e/ vowel category in F1/F2 space and the compactness of vowel categories. The position of the /e/ vowel category in F1/F2 space was used to calculate the distance along the F1 and F2 dimensions between this native (individual) vowel category /e/ and each of the native target French vowel spaces (i.e., /e/ and /ε/ in the assimilated contrast) derived from recordings of native French speakers. We used the Mahalanobis distance [distance score (DS)], a metric that estimates the distance between a point and a distribution and therefore takes into account the natural variability of the target vowel space. This measure was also used to assess L2 production accuracy. The Mahalanobis distance has been previously used in techniques of speaker identification, where an unknown speech sample is assigned to a speaker on the basis of the minimum distance between a test speech sample and the reference samples (
The second property of native individual productions, the compactness of vowel categories, was measured in two ways: (1) a vowel-specific compactness score (CSV) corresponding to the variability of the /e/ vowel in the L1 acoustic space, and (2) a global compactness score (CSG) corresponding to the sum of the five CSV for the five Spanish vowels [see
The CSV was used to predict the production performance in the assimilated French contrast. Difficulties in the production of L2 assimilated contrasts are generally attributed to the use of L1 categories to which they assimilate (e.g., the English /i/ and /I/ vowels are produced as one Spanish /i/ vowel,
The CSG was used to predict the production accuracy of the uncategorized L2 vowels. The CSG is taken as a global measure of within-category variability in the acoustic space. We expected that speakers whose within-category variability is smaller (a compact acoustic space) would have larger between-category spaces, i.e., larger available slots. The CSG therefore indirectly captures the size of the available acoustic slots in native space. Vowels were checked by eye for each speaker for evenness of their distribution in the acoustic space. Like the SLM, we expect that: (1) the creation of a new category is easier when the L2 sound falls within an empty region of the L1 phonological space, i.e., this is what defines a new sound in the SLM; and (2) the L1 and L2 phonetic categories exist in one phonological space, and are related to one another at a position-sensitive allophonic level (
The present study firstly reports the results of an analysis of the perception and production of L2 French contrasts. Secondly, it reports the results of the acoustic analyses of vowel productions in Spanish. Finally, it examines the role of L2 perception and L1 production in L2 production.
Fourteen monolingual native female speakers of Spanish (mean age 16 years) from a Spanish middle school in Plasencia (Spain) took part in the study. They had, on average, 4 years of French classes at school and, according to their teacher, their proficiency in French corresponded to the intermediate B1 level of the Common European Framework of Reference for Languages. They all came from the same region in Spain and had never lived in a French-speaking country. Participants gave informed consent and were free to withdraw from the experiment at any time.
To create the stimuli used to assess the production and perception performance of the Spanish participants on French vowels, seven native female French speakers were recorded. They were all from the same French region (Paris, Ile-de-France) to minimize the effect of regional dialect. The productions of multiple speakers were used to increase the variability across vowels for two reasons; first, it encourages phonological rather than acoustic perception of vowels (
Six of these seven native French speakers were recorded while reading three lists of 10 French sentences of the type, “Je prononce /ε/ comme dans lait” ([ʒə pronõs ε kom dã lε], “I pronounce /ε/ as in milk”). Each list contained 10 sentences, one for each of the 10 oral French vowels examined. Each sentence contained one isolated vowel and also a word containing this same vowel (e.g., the vowel /ε/ in isolation and in the word /lε/, lait, “milk”). The sentences in the three lists were identical except that the final word of the sentence began with a different consonant in each list: /l/ (lait-/lε/ “milk”), /s/ (sait-/sε/ “know”), or /p/ (paix-/pε/ “peace”). Recordings were made in a quiet room, using a Marantz PMD670 portable recorder and a Shure Beta 58A microphone, sampled at 22.05 kHz directly to 16-bit mono .wav files.
The vowels from the words were extracted from the sentences, normalized for intensity, and matched in length (mean 210 ms). These vowels were then rated by five French speech therapists; the vowels that were accepted as good prototypes of French by at least four judges were kept. In total, 120 vowel productions were retained for the identification test
Forty vowel productions were used for the repetition task; this included four isolated vowels (i.e., /e/-/ε/-/ø/-/œ/) * 5 different speakers * 2 exemplars per speaker. These 40 vowels were also used to define the acoustic spaces for the French vowels /e/-/ε/-/ø/-/œ/ (i.e., 10 tokens * 4 vowels) when computing the DS, a measure of L2 production accuracy that we elaborated to assess L2 speakers’ production performance (see Evaluation of L2 Production Accuracy for details).
In addition to sentence reading, the sixth native speaker read a list of 10 words that named the pictures in the familiarization phase of the identification task. Monosyllabic, non-cognate, concrete nouns which were likely to be known by the participants were used, each containing one of the 10 French vowels of interest.
The seventh French speaker was recorded (using the same recording parameters as above) while reading a list of 20 words different from those used in the identification task (i.e., five words for each of the four tested vowels /e/-/ε/-/ø/-/œ/) whose corresponding pictures were used in the naming task. These productions were used in the familiarization phase of the naming task. For example, the five proposed words for the vowel /ε/ were: chaise [ʃ
Words and corresponding pictures used in naming task for the vowel /ε/.
Participants first filled in a language-background questionnaire concerning, among other things, their L1, fluency in French and their proficiency in other languages. After that, they performed three tasks with the French stimuli; their order was randomized across subjects. All participants performed the Spanish reading task last. All tasks were performed individually on a Dell laptop using E-Prime 2 software (Psychology Software Tools, Pittsburgh, PA, USA). Participants’ productions were recorded with a professional Marantz PMD670 digital recorder, using a Sennheiser PC151 headphone with microphone, and were sampled at 22.05 kHz directly to 16-bit mono .wav files.
A 5FCI task was used to assess the perception of L2 vowels. The 10 tested French oral vowels were separated into two groups of five vowels as a function of their possible perceptual confusability with each other for Spanish speakers. Group one contained the mid-close/mid-open front unrounded (assimilated) contrast /e/-/ε/, along with the other three vowels /i/, /u/, /y/. Group two contained the mid-close/mid-open front rounded (non-assimilated) contrast /ø/-/œ/, along with the other three vowels /o/, /ɔ/ and /a/ (see
Words corresponding to the vowels and pictures used in the identification task.
During the familiarization phase, participants had to learn the association between the pictures presented on the screen and the vowels in their labels.
At the beginning of the familiarization phase, the five pictures of a group appeared on the screen and remained there for the duration of the familiarization phase for that group. On each trial, one of the five pictures was indicated by a rectangular frame and the corresponding vowel and word were heard (see
On each trial of the testing phase, the participants heard one of the five isolated vowels (recorded from the other five speakers, i.e., from 1 to 5
To test the participants’ L1 production, a reading task was used. The participants read a passage from a chapter of a Spanish (SP) translation of “The Godson” by Leo Tolstoy. They were instructed to read as naturally as possible at a moderate tempo. This task, rather than natural speech, was used in order to control the phonetic environment of the produced vowels across the participants. The acoustic values of the five vowels (i.e., /i/, /e/, /a/, /o/, /u/) produced in Spanish were used to compute two measures of L1 compactness: a CSV and a CSG. They were also used to calculate the Mahalanobis DS measure between the native vowel category /e/ and the French /e/ and /ε/ categories.
where a and b are 1/2 the length of the ellipse’s major and minor axes. Since the distribution of the productions in F1/F2 space was assumed to be elliptical, the angles of the major and minor axes of an ellipse centered on the mean of the productions were estimated (in order to determine the orientation of the axes). The formula used to calculate the CS was:
where
A CS was computed for each participant and for each vowel. The CS for the SP /e/ vowel is called CSV. Before computing the CSG, each speaker’s mean F1 and F2 values for the five Spanish vowels were checked by eye for the evenness of their distribution in the acoustic space. This check showed that the vowels were evenly distributed, occupying the five corners of the Spanish vocalic space; there were no superimposed vowels. This analysis allows us to rule out the possibility of a very compact space that nevertheless leaves no room for new sounds. A CSG was computed for each participant by summing the CSs of the five SP vowels. The CSV and CSG were used as predictors in mixed-effects model analyses of the production of the French (FR) assimilated (i.e., /e/-/ε/) and uncategorized (i.e., /ø/-/œ/) contrasts, respectively.
On each trial of the familiarization phase, a visual and an auditory stimulus were presented simultaneously. The former was a picture that appeared on the screen and the latter was the spoken word (via headphones) that corresponded to that picture. The productions of the seventh speaker were used for this task. Each vowel was represented by five different words (see Stimuli for details). The words and pictures used here were different from those used in the identification task. Each picture-word combination was presented twice, resulting in 40 trials in total: five pictures * 4 vowels * 2 times.
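The compactness computation can be illustrated with a minimal sketch. The exact CS formula is not reproduced in this text, so the code below rests on a loudly stated assumption: the CS is taken to be the area π·a·b of an ellipse whose semi-axes a and b equal one standard deviation of the tokens along the principal axes of the F1/F2 distribution. Under that assumption smaller values mean a more compact category. The function names and the toy tokens are hypothetical and are not taken from the study.

```python
import math


def covariance_2d(tokens):
    """Sample variances and covariance of a list of (F1, F2) pairs."""
    n = len(tokens)
    mean_f1 = sum(f1 for f1, _ in tokens) / n
    mean_f2 = sum(f2 for _, f2 in tokens) / n
    var_f1 = sum((f1 - mean_f1) ** 2 for f1, _ in tokens) / (n - 1)
    var_f2 = sum((f2 - mean_f2) ** 2 for _, f2 in tokens) / (n - 1)
    cov = sum((f1 - mean_f1) * (f2 - mean_f2) for f1, f2 in tokens) / (n - 1)
    return var_f1, cov, var_f2


def compactness_score(tokens):
    """ASSUMED CS: area pi*a*b of the one-SD ellipse around the tokens."""
    var_f1, cov, var_f2 = covariance_2d(tokens)
    # Eigenvalues of the 2x2 covariance matrix give the variances along
    # the major and minor axes of the ellipse, whatever its orientation.
    half_trace = (var_f1 + var_f2) / 2
    spread = math.sqrt(((var_f1 - var_f2) / 2) ** 2 + cov ** 2)
    a = math.sqrt(half_trace + spread)            # semi-major axis
    b = math.sqrt(max(half_trace - spread, 0.0))  # semi-minor axis
    return math.pi * a * b


def global_compactness(tokens_by_vowel):
    """CSG: sum of the per-vowel compactness scores of one speaker."""
    return sum(compactness_score(t) for t in tokens_by_vowel.values())


# Hypothetical toy tokens (not real formant values).
tight = [(0, 0), (2, 0), (0, 2), (2, 2)]
loose = [(0, 0), (4, 0), (0, 4), (4, 4)]
csg = global_compactness({"e": tight, "a": loose})
```

Summing the per-vowel areas mirrors the description of the CSG as the sum of the five CSs; any other detail of the original computation (for example, how the axes were scaled) may differ from this sketch.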
The testing phase immediately followed the familiarization phase. On each trial, a cross appeared on the screen for 500 ms, and was then followed by a picture that was displayed for 4000 ms. Participants were instructed to produce the vowel that the word (i.e., picture) contained (e.g., for the picture of an “arrow” – “flèche” [
To assess the accuracy of vowel productions in the naming and repetition tasks, a Mahalanobis DS was computed, representing the distance from each vowel produced by the participants to the corresponding FR target vowel acoustic space. The target space was derived from the productions of the five native French speakers (see Stimuli). The calculations were implemented in Matlab (2011a; The MathWorks Inc., Natick, MA, USA). We used this metric rather than the simpler Euclidean distance of the produced token to the mean target FR vowel in order to take into account the natural variability in speech production (i.e., in this case that of the target vowels). In addition, in order to assess the distinctness in production of the pairs of height-contrastive vowels within each contrast, an acoustic analysis (i.e., of F1) of the French vowels produced by the Spanish participants, as compared to native French speakers, was performed.
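The distance score can be illustrated with a minimal sketch in Python (the original calculations were implemented in Matlab and are not shown in the text). It computes the Mahalanobis distance from one produced (F1, F2) token to a cloud of target tokens, using the sample covariance of the targets. The function name and the toy values are hypothetical.

```python
import math


def mahalanobis_2d(token, target_tokens):
    """Mahalanobis distance from one (F1, F2) token to a target vowel space."""
    n = len(target_tokens)
    mean_f1 = sum(f1 for f1, _ in target_tokens) / n
    mean_f2 = sum(f2 for _, f2 in target_tokens) / n
    var_f1 = sum((f1 - mean_f1) ** 2 for f1, _ in target_tokens) / (n - 1)
    var_f2 = sum((f2 - mean_f2) ** 2 for _, f2 in target_tokens) / (n - 1)
    cov = sum((f1 - mean_f1) * (f2 - mean_f2)
              for f1, f2 in target_tokens) / (n - 1)

    det = var_f1 * var_f2 - cov ** 2
    if det <= 0:
        raise ValueError("target tokens do not span the F1/F2 plane")

    d1 = token[0] - mean_f1
    d2 = token[1] - mean_f2
    # d^T S^-1 d, with the 2x2 inverse covariance written out explicitly.
    squared = (var_f2 * d1 * d1 - 2 * cov * d1 * d2 + var_f1 * d2 * d2) / det
    return math.sqrt(squared)


# Hypothetical target space (not real formant values).
target = [(0, 0), (2, 0), (0, 2), (2, 2)]
at_centre = mahalanobis_2d((1, 1), target)
off_centre = mahalanobis_2d((3, 1), target)
```

Because the offset is scaled by the target covariance, a token that deviates along a dimension on which native speakers themselves vary widely is penalized less than one that deviates by the same number of hertz along a tightly controlled dimension. This is the property that motivates the choice of this metric over the Euclidean distance.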
To assess the effect of contrast type (assimilated [/e/-/ε/] versus uncategorized [/ø/-/œ/]) on identification performance, accuracy scores (1 for correct and 0 for incorrect) were fitted to a mixed-effects logistic model that is traditionally used to analyze binomially distributed data (
Confusion matrices with percent identification responses for the vowels /e/ and /ε/ (A) and /ø/ and /œ/ (B).

A. % Identification as
Presented vowel | e | ε | u | i | y
---|---|---|---|---|---
e | 59 | 35 | 5 | 1 | 0
ε | 55 | 43 | 2 | 0 | 0

B. % Identification as
Presented vowel | ø | œ | o | ɔ | a
---|---|---|---|---|---
ø | 40 | 33 | 11 | 14 | 2
œ | 36.4 | 34.2 | 6 | 19.4 | 4
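As a worked check on the confusion matrices, the short sketch below derives the mean percent-correct score for each contrast from the diagonal cells. It assumes that both vowels of a contrast were presented equally often, so that the contrast-level score is the plain mean of the two diagonal percentages. The variable and function names are illustrative only.

```python
# Percent identification responses copied from the confusion matrices.
confusion_a = {
    "e": {"e": 59, "ε": 35, "u": 5, "i": 1, "y": 0},
    "ε": {"e": 55, "ε": 43, "u": 2, "i": 0, "y": 0},
}
confusion_b = {
    "ø": {"ø": 40, "œ": 33, "o": 11, "ɔ": 14, "a": 2},
    "œ": {"ø": 36.4, "œ": 34.2, "o": 6, "ɔ": 19.4, "a": 4},
}


def contrast_accuracy(confusion):
    """Mean percent correct over the presented vowels (diagonal cells)."""
    return sum(row[vowel] for vowel, row in confusion.items()) / len(confusion)


def outside_contrast(confusion):
    """Mean percent of responses falling outside the two contrast members."""
    members = set(confusion)
    totals = [sum(pct for resp, pct in row.items() if resp not in members)
              for row in confusion.values()]
    return sum(totals) / len(totals)


acc_assimilated = contrast_accuracy(confusion_a)
acc_uncategorized = contrast_accuracy(confusion_b)
out_assimilated = outside_contrast(confusion_a)
out_uncategorized = outside_contrast(confusion_b)
```

The second helper quantifies how often listeners chose a vowel outside the tested contrast, which is the information a five-alternative response set provides and a two-choice task cannot.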
The material recorded in the Spanish reading task was analyzed by the first author using the Praat software (
The quality of the tokens produced during the L2 production tests was checked by the first author for intensity and presence of non-linguistic sounds (e.g., coughs, sneezes, sighs). As a result, 13 tokens produced in the repetition task and 15 in the naming task were removed. In addition, 90 trials without an answer (i.e., “no answer” trials) in the naming task were also discarded. The F1 and F2 values of the remaining tokens were calculated as in the previous analyses.
The DS was computed to assess the production accuracy in the L2 naming and repetition tasks between the produced tokens and the French target spaces. 847 DSs were obtained for both tasks (naming and repetition). Outliers and extreme values were detected using Quantile–Quantile (Q–Q) plots and were removed. The remaining DSs varied from 0.1 to 12.72, and had a standard deviation of 2.31. The following statistical parameters are reported: the coefficient estimate β, the standard error (SE), and the
To analyze the production of the assimilated and uncategorized FR vowels, the DSs were fitted to a general linear mixed-effects model (
Results showed that there was a significant effect of compactness (β = 2.481e-06, SE = 1.173e-06,
In order to assess the effect of the distance between the Spanish /e/ and the French /e/ and /ε/ vowels (DS-FR) on the production of the French similar vowels, separate general linear mixed-effects analyses were applied to the data set on assimilated vowels only. Similar to the previous analyses, the fixed-effects structure included the effect of compactness, task, and identification accuracy; in addition, we included the fixed effect of DS-FR and its interaction with the CSV. The random structure included by-subject and by-vowel slopes adjusted for the fixed factors. There was a significant effect of the DS-FR (β = 4.089, SE = 1.287,
An acoustic analysis of the produced French vowels was performed in order to examine how well L1-Spanish speakers produced the pairs of height-contrastive vowels within each contrast, compared to native French speakers. The F1 is the acoustic parameter that is the most indicative of height differences. F1-differences between the /e/–/ε/ vowels, and between the /ø/–/œ/ vowels were computed for each subject for the French vowels produced by Spanish and French speakers separately. Bartlett’s test did not show a violation of homogeneity of variances [χ2 (1) = 0.0834,
In order to examine the relationship between L2 perception and production, Spearman rank correlation analyses (which are used when the relationship between the two variables is monotonic but not necessarily linear) were performed on individuals’ mean identification and DS scores for each contrast separately. None of the correlations was significant (
The compactness of L2 productions (CSL2) in the repetition task was computed for the /e/, /ε/, /ø/, and /œ/ vowels for each speaker using the same formula as the one used to calculate the compactness of Spanish productions. In the naming task, too few productions for each speaker were available to estimate compactness. Bartlett’s test showed a violation of homogeneity of variances [χ2(1) = 35.6,
Results of Pearson-correlation analyses between the DS and CS
Vowel | Pearson coefficient |
---|---|
/e/ | |
/ε/ | |
/ø/ | |
/œ/ |
Our results will be discussed with reference to
Our results revealed that the assimilated vowels were perceived better than the uncategorized ones (51% and 37.1% correct responses, respectively). In terms of PAM this suggests that the /e/-/ε/ vowels assimilated in an SC manner to the Spanish /e/ category and that the /ø/-/œ/ vowels are not discriminated from each other (or from neighboring native categories, see below for details). Participants’ better perception of assimilated than of uncategorized vowels can partially be explained by our use of a larger response set in a 5FCI task rather than the typical two-choice identification or discrimination tasks. Our task makes it possible to evaluate perception not only within a contrast, but also within the larger L2 phonological space. For the assimilated contrast, Spanish speakers only rarely responded outside their contrast (see
The perception performance on the assimilated contrast suggests assimilation in an SC manner (rather than CG) to the Spanish /e/ category. On more than half of the trials, both the /e/ and /ε/ vowels were identified as the /e/ vowel (59% and 55% of /e/ identifications respectively, see
The low performance on the uncategorized French /ø/-/œ/ contrast (37.1% correct responses) shows that participants have major difficulties in identifying these L2 vowels.
In sum, the results of our study suggest that when the perception of L2 French mid-close/mid-open vowel contrasts is assessed using a large L2 phonological response set (i.e., one that is not limited to the tested contrasts), Spanish speakers experience more difficulties with new contrasts than with those that are similar to sounds in the L1 phonological system. Moreover, our results also point to the importance, especially in the case of new sounds, of using larger response sets that make it possible to assess L2 perception outside the phonetic contrast, rather than using two-choice tasks as is typically done.
The production of two French vowel contrasts was assessed in repetition and naming tasks. We conducted three different analyses to examine participants’ performance. First, we assessed and compared vowel production accuracy on each task by computing the distance (DS) separating the F1/F2 values of each produced L2 vowel from the target-category acoustic space. Second, we analyzed the distinctness of the contrasting L2 vowels by calculating the F1-difference separating them. Third, we measured the compactness (i.e., inverse of variability) of the acoustic spaces in the production of the four tested /e/-/ε/-/ø/-/œ/ vowels on the repetition task.
We compared the production of the assimilated and the uncategorized contrasts on two tasks, repetition and vowel naming. We expected that Spanish participants’ productions would be more accurate in the repetition task than in the naming task for both types of contrasts. The results revealed no effect of task on the production accuracy of either French contrast. These results suggest that the acoustico-phonetic and phonological representations that underlie the production of the vowels in the repetition and naming tasks, respectively, are of similar quality/accuracy.
As for perception, the production of the assimilated contrast was more accurate than that of the uncategorized one. This pattern can be attributed to differences in the number of existing French minimal pairs containing the vowels /e/-/ε/ and /ø/-/œ/ respectively. In French, minimal pairs based on /e/-/ε/ are abundant and contain high-frequency words (e.g., mes-mais, tes-tais, ses-sais, gré-grès for “my”-“but,” “your”-“keep silent,” “his”-“I know,” “will”-“sandstone”;
Difficulties in production of French front rounded vowels can also be due to difficulties in mastering the frontness dimension (F2) for rounded vowels.
The results of the analyses on the distinctness of the L2 vowels within a contrast, that is, on the acoustic distance separating them, revealed an effect of task for both contrasts. The distinctness, as reflected by the height distance (F1) between the contrasting vowels, was larger on the repetition than on the naming task. The results on the distinctness of productions suggest that the acoustico-phonetic representations underlying the production of both assimilated and uncategorized contrasts are more accurate in a repetition task than are the internal phonological representations tapped by the vowel naming task. Nevertheless, even on the repetition task the vowels are not produced as distinctly as they are by native speakers. Vowels in both contrasts were repeated with comparable (no statistical effect of contrast) but limited distinctness. This result suggests that the mid-close/mid-open height distinction, which does not exist in Spanish, is very difficult for Spanish speakers of French, independently of the perceptual similarity of the L2 contrasts to L1 sounds. Levy and Law assessed the production of front rounded vowels by highly experienced AE speakers of French (e.g., a mean of 8 years of formal education and 3.5 years of immersion in a French-speaking country). They concluded that the uncategorized vowel /œ/, which assimilated to several native categories, is particularly difficult even for experienced L2 speakers. Native French speakers gave relatively low ratings to AE productions of this vowel, independently of the consonant context used (
The observed differences in the effects of task on the measures of accuracy (repetition = naming) and of distinctness (repetition > naming) could be due to two factors. First, the statistical analyses used to assess the task effects on the production accuracy included both within-subject and between-subject variability (i.e., mixed-effects model), whereas they included only between-subject variability for the distinctness measure. There was therefore more variability in the former analysis, which could have prevented the effect from emerging. Second, the accuracy measure (i.e., DS) compares the F1 and F2 values of each vowel (i.e., /e/ and /ε/, and /ø/ and /œ/) to those of the corresponding target vowel produced by native French speakers; the distinctness, on the other hand, only reflects the F1-differences in the production of contrasting vowels for each subject. The latter measure therefore allows us to capture small differences in the height dimension that might be masked by using a joint measure including F1 and F2 in the DS measure.
Only the productions in the repetition task were analyzed for their compactness since there were too few productions for each speaker in the naming task to estimate their compactness. Although the statistical analyses did not reveal differences in compactness between the two contrasts (most likely due to large cross-speaker variability), it can be seen from
Correlation analyses revealed that compactness was strongly correlated with the L2 production accuracy measure (DS) for three L2 vowels (i.e., /e/, /ø/, and /œ/): speakers whose productions were compact in L2 space were also more accurate. Our recent L2 production training study has shown that successful learning of L2 sounds is accompanied by reduced variability (increased compactness) of the trained sounds (Kartushina et al., submitted). The absence of such a correlation for the French /ε/ vowel that is highly similar to the Spanish /e/ vowel (
In examining the relationship between L2 perception and production (link C), we have combined three approaches. First, we looked at group performance by comparing average accuracy on the perception and production of L2 contrasts. Second, we explored individual performance by (a) running correlational analyses between L2 perception and production performance across speakers, and (b) running by-subject mixed-effects regression analyses to test the effect of L2 perception accuracy on L2 production, taking into account individual variability. Unfortunately, only a few studies have compared L2 perception and production performance directly, with even fewer having used the two former approaches (e.g.,
The average results for both L2 perception and production tasks revealed an effect of the type of contrast: vowels in the assimilated contrast were perceived and produced better than those in the uncategorized contrast. This result could be taken to suggest that L2 perception and production are related. However, the results of the correlation and of mixed-effects generalized regression analyses revealed no relationship between L2 perception and production; speakers’ identification accuracy did not predict their production accuracy.
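The individual-level correlation step can be sketched as follows. The code computes a Spearman rank correlation between per-speaker identification accuracy and per-speaker mean DS, using average ranks for ties. It is an illustration only: the analysis software used for the correlations is not stated in this text, the function names are hypothetical, and the scores in the usage lines are invented placeholders rather than data from the study.

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Invented placeholder scores for four hypothetical speakers.
identification = [0.40, 0.55, 0.35, 0.60]   # proportion correct
mean_ds = [3.1, 4.0, 2.2, 3.6]              # mean distance score
rho = spearman_rho(identification, mean_ds)
```

If perception guided production, higher identification accuracy should go with lower DS, that is, a negative rho. A group-level advantage for one contrast in both modalities does not by itself imply such an association across individual speakers.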
This divergence as a function of the type of analysis conducted is neither novel nor unexpected. For example, in a study on the effect of L2 perception training on L2 production, a group effect was observed (i.e., improved perception of the trained vowels led to improved production), but there was no relationship between the improvement in these capacities across individual speakers (
Other L2 studies have shown that perception and production may be aligned only in proficient L2 speakers (
One of the main aims of our study was to evaluate the relation between intra-speaker variability in the production of native sounds and the production accuracy of L2 contrasts (link D in
Two CSs were considered. The first, CSV, refers to the compactness of the acoustic space for the Spanish /e/ vowel and was used to predict the production performance on the French /e/ and /ε/ vowels that assimilate to this Spanish vowel. The second, CSG, refers to the global compactness score (i.e., global indicator of within-category variability) for the production of all Spanish vowels (/i/, /e/, /a/, /o/, and /u/) and was used to predict L2 production performance on the French /ø/ and /œ/ vowels that have no clearly similar category in L1 space and therefore are uncategorized. The results revealed significant effects of L1 compactness on L2 production accuracy for both assimilated and uncategorized contrasts: (1) Spanish speakers with more compact distributions for the Spanish /e/ vowel (CSV) were better at producing the similar French /e/ and /ε/ vowels; (2) Spanish speakers with more compact overall distributions for the five Spanish vowels (CSG) were better at producing the uncategorized French /ø/ and /œ/ vowels.
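To make the two scores concrete, the following Python sketch computes a per-vowel and a global compactness score. It rests on an assumption that is not stated in this section: compactness is taken to be the area of an axis-aligned ellipse whose radii are one standard deviation of F1 and of F2, so that smaller values mean more compact categories. The function names and the data layout are hypothetical; only the summation over the five Spanish vowels follows the description of the global score.

```python
import math
from statistics import stdev


def compactness_score(f1_values, f2_values):
    """Area of an axis-aligned 1-SD ellipse in F1/F2 space.

    Assumed operationalization: smaller area = more compact category.
    """
    return math.pi * stdev(f1_values) * stdev(f2_values)


def global_compactness(vowel_tokens):
    """Sum of per-vowel compactness scores.

    vowel_tokens maps a vowel label to a pair (F1 values, F2 values).
    """
    return sum(compactness_score(f1, f2) for f1, f2 in vowel_tokens.values())
```

Under these assumptions, a CSV-like score would be `compactness_score` applied to a speaker's Spanish /e/ tokens, and a CSG-like score would be `global_compactness` applied to that speaker's tokens of /i/, /e/, /a/, /o/, and /u/.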
The results for the assimilated vowels corroborate and extend to production our previous findings on the effect of L1 compactness on L2 perception (
The results on the French uncategorized /ø/ and /œ/ vowels revealed that speakers whose vowel productions in L1 space were more compact produced them more accurately than those whose L1 space was more dispersed. In other words, speakers whose Spanish front and back vowels were less variable and mainly restricted acoustically to the front and back positions
The distance score measure (DS-FR), which estimated the distance between the native Spanish /e/ vowel and the non-native French /e/ and /ε/ vowels, showed that the closer the L1 /e/ category is to the target French vowel, the higher the production accuracy for this target vowel. These results suggest that L2 speakers re-use exemplars of the native /e/ category to produce similar L2 categories. However, it is important to note that the L2 productions are not prototypical L1 exemplars. We computed additional DSs between the Spanish (i.e., /e/) and French (i.e., /e/ and /ε/) productions that confirm this: on average, the L2 /e/ and /ε/ productions were 5 and 4.5 DS units away from the native Spanish /e/ category, respectively. However, since the Spanish vowels produced in the reading task are more prone to coarticulation effects than the French vowels, which were produced in isolation, these comparisons should be further confirmed with acoustic analyses in which both L1 and L2 vowels are recorded in similar phonetic environments. It should be noted, however, that we measured the F1/F2 of the Spanish vowels produced in the reading task at the mid portion of the vowel in order to reduce such coarticulation effects.
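A distance measure of this kind can be sketched as follows. The sketch assumes, purely for illustration, that a DS is a Mahalanobis-type distance between a point in F1/F2 space (for example, the mean of a speaker's Spanish /e/) and a target distribution (for example, native French tokens of /e/). This section does not state the exact definition, and the function names are hypothetical.

```python
import math


def mean(values):
    return sum(values) / len(values)


def covariance_2d(f1, f2):
    """Sample variances and covariance (var_f1, cov, var_f2) of paired formants."""
    n = len(f1)
    m1, m2 = mean(f1), mean(f2)
    var1 = sum((a - m1) ** 2 for a in f1) / (n - 1)
    var2 = sum((b - m2) ** 2 for b in f2) / (n - 1)
    cov = sum((a - m1) * (b - m2) for a, b in zip(f1, f2)) / (n - 1)
    return var1, cov, var2


def distance_score(point, target_f1, target_f2):
    """Mahalanobis distance from point = (F1, F2) to the target distribution."""
    var1, cov, var2 = covariance_2d(target_f1, target_f2)
    det = var1 * var2 - cov ** 2
    if det <= 0:
        raise ValueError("target covariance matrix is singular")
    d1 = point[0] - mean(target_f1)
    d2 = point[1] - mean(target_f2)
    # Quadratic form d^T S^-1 d, written out for a 2x2 covariance matrix S.
    q = (var2 * d1 * d1 - 2 * cov * d1 * d2 + var1 * d2 * d2) / det
    return math.sqrt(q)
```

Because such a distance is scaled by the spread of the target category, a value expressed in "DS units" is comparable across vowels with different variability; whether the measure used here has that property depends on its actual definition.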
One of the SLM’s claims is that bilinguals strive to maintain a contrast between L1 categories and similar L2 categories in a common interphonological space. Although our participants were not fluent Spanish-French bilinguals but were quite advanced L2-French learners, they nevertheless exhibited a tendency to maintain the contrast between the Spanish /e/ and the similar French /e/ and /ε/ vowels on both tasks. The ellipses representing the 1-standard-deviation distributions of the vowels (i.e., vowel acoustic space) tend not to overlap between the native /e/ and the similar /e/ and /ε/ L2 categories. Moreover, even in the naming task (when no auditory example is given), participants tend to use realizations that are different from those of the L1 /e/ vowel in producing the similar L2 sounds: their L2 /e/ and /ε/ productions are more close and more open, respectively. This result, taken together with that of the DS-FR, suggests that Spanish speakers re-use non-prototypical L1 tokens to produce similar L2 vowels, and that they try to maintain the differences between the L1 sounds and similar L2 sounds.
The significant interaction between the compactness of the L1 /e/ category and its distance from the target French vowels suggests that the effect of compactness on production accuracy is stronger when the acoustic distance between the Spanish and French sounds is smaller. This may be because, when such distances are smaller, greater compactness is needed to retain some “blank space” around the native category, and thereby to ensure accurate production of a newly formed L2 category/sound.
In this section we propose some improvements for studying L2 production both in terms of the tasks and measures used to evaluate its accuracy. We used two different production tasks, naming and repetition, in order to tap into phonological and acoustico-phonetic representations, respectively. A similar methodological approach could be taken for the perception tasks as well. The 5FCI task that we used to assess L2 perception is well suited for evaluating internal L2 phonological representations. However, it cannot provide much information about the detailed properties of the phonetic representations underlying the perception of L2 vowels. A categorization task (using continua) that more finely reveals the categorical perception (i.e., category boundaries) of L2 sounds could be included to assess phonetic processing. By using tasks that reflect the perception and production of L2 sounds at different representational levels (e.g., acoustic, phonological, lexical), we could draw more solid conclusions about the relationship between the two capacities at comparable representational levels.
In computing the global compactness of L1 acoustic space, we added together the specific compactness of the five Spanish vowels, and inferred that those speakers who had more compact vowels (i.e., little within-category variability) were likely to have larger “empty” acoustic spaces. Another approach would be to assess the size of the “empty” spaces more exactly by subtracting the CS of the five Spanish vowels from the “total” space (i.e., the area of a triangle delimited by the extreme vowels /i/, /a/, and /u/). Alternatively, we could assess the ratio between the between-category and the within-category variability, as suggested by the editor of this issue. Also, other properties of individuals’ L1 productions could be predictive of L2 production and perception performance. These include the distribution of vowel categories in acoustic space (e.g., the extension of the categories in the openness and backness dimensions, i.e., from the lowest to the highest F1 and F2, respectively), and their distinctness in production (e.g., whether they overlap with each other or not).
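The two alternative measures mentioned above can be sketched in Python as follows. This is a hedged illustration rather than an implemented analysis: the function names are hypothetical, the triangle is assumed to be spanned by the mean (F1, F2) positions of /i/, /a/, and /u/, the compactness scores are assumed to be areas in the same units as the triangle, and the variability ratio is one of several possible definitions (mean squared distance of category centroids from their grand centroid, divided by mean squared distance of tokens from their own centroid).

```python
def triangle_area(p_i, p_a, p_u):
    """Shoelace area of the triangle spanned by three (F1, F2) points."""
    (x1, y1), (x2, y2), (x3, y3) = p_i, p_a, p_u
    return abs(x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2)) / 2.0


def empty_space(p_i, p_a, p_u, compactness_scores):
    """Total triangle area minus the summed per-vowel compactness scores."""
    return triangle_area(p_i, p_a, p_u) - sum(compactness_scores)


def centroid(points):
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def squared_distance(p, q):
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def between_within_ratio(vowel_tokens):
    """Between-category over within-category variability in F1/F2 space.

    vowel_tokens maps a vowel label to a list of (F1, F2) tokens.
    """
    centroids = {v: centroid(tokens) for v, tokens in vowel_tokens.items()}
    grand = centroid(list(centroids.values()))
    between = sum(squared_distance(c, grand) for c in centroids.values()) / len(centroids)
    n_tokens = sum(len(tokens) for tokens in vowel_tokens.values())
    within = sum(
        squared_distance(p, centroids[v])
        for v, tokens in vowel_tokens.items()
        for p in tokens
    ) / n_tokens
    if within == 0:
        raise ValueError("within-category variability is zero")
    return between / within
```

A larger ratio would indicate categories that are far apart relative to their internal spread, which is the property the "empty space" reasoning tries to capture.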
Last, we used compactness measures to evaluate the variability of productions in L1 space. These measures partially depend upon the articulatory skill of the speakers: the more skilled and precise the speakers, the more compact their productions. If our results are indeed attributable to articulatory skill, then we would expect speakers who are “articulatorily precise” in L1 to also be precise in L2. However, additional correlational analyses did not reveal any relationship between the compactness in the L1 and L2 productions. This result should nevertheless be interpreted with caution, since compactness was assessed on productions using different tasks in L1 and L2. Alternatively, individual differences in compactness may be due to factors other than articulatory skill.
Our study has addressed three questions. First, what is the role of the similarity between L2 sounds and L1 categories in L2 identification? Second, what is the relationship between the perception and production of non-native (L2) sounds? And third, what is the role of individual, native (L1) productions in determining L2 pronunciation accuracy?
Native Spanish speakers learning French in high school were assessed on their perception and production of two French mid-close/mid-open contrasts: (1) /e/-/ε/, which perceptually assimilates to the single Spanish /e/ category; and (2) /ø/-/œ/, which is dissimilar to any existing Spanish category (i.e., an “uncategorized” contrast). Native Spanish productions were also recorded, and two phonetic measures were computed for each individual: the variability in the production of these native vowels (represented by the compactness of vowel categories in F1/F2 acoustic space), and the position of the vowels in the acoustic space. Productions of French vowels by native French speakers were also recorded, and compared to productions of these French vowels by the Spanish speakers.
We found, first, that the assimilated contrast /e/-/ε/ was identified better than the uncategorized one, /ø/-/œ/. This result is at odds with the SLM, which predicts that new sounds are acquired better than similar sounds. It supports PAM-2, which predicts poor performance on uncategorized contrasts if L2 phones are perceived as being very close to each other and to the same L1 sounds.
With regard to the second question, the group results revealed that the poorly identified contrast was generally produced poorly. However, the results of the analyses that took into account individual variability in the production and perception of L2 sounds revealed no relationship between these two modalities. This goes against the traditionally held view that L2 production depends upon L2 perception, and is in line with the growing body of research showing at least partial dissociations between L2 perception and production, and between their underlying neural mechanisms.
Finally, we have shown that the phonetic properties of an individual’s L1 productions (i.e., acoustic compactness of L1 categories, and their position in the individual’s L1 space) predicted L2 production accuracy. These results are consistent with the SLM’s claim of a shared inter-phonological space between languages, since they show that L1 and L2 sounds are related to each other at the acoustico-phonetic level, within individuals. Our results are in line with the surface transfer hypothesis and suggest a transfer of individual phonetic categories, in addition to more abstract, phonological ones, to L2 production. Other studies have also pointed to the role of L1 phonetic properties in L2 perception. For example,
Our results provide evidence in favor of the role of individual phonetic properties of native productions in predicting L2 production accuracy, but also point to the need for further investigation of individual-specific factors affecting L2 production. Together with our previously published findings showing that L1 production also influences L2 perception, these results highlight the malleability of phonetic processing, and demonstrate that pre-existing features of the native space in production partly determine how new elements can be accommodated in that space.
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
We are grateful to the students of the Instituto de Educación Secundaria Parque de Monfragüe in Plasencia, Spain for their participation and to the French teachers and to the director of the school for making this study possible. We are very grateful to Dr. Narly Golestani for her valuable comments and suggestions on an earlier draft of this manuscript. We are also grateful to Dr. Alexis Hervais-Adelman for his assistance in acoustic analyses. Finally, we would like to thank the reviewers and the editor, Noël Nguyen, for their helpful comments on earlier versions of this manuscript.
It is important to mention that the relationships presented in this figure are neither exhaustive nor sufficient for understanding L2 acquisition phenomena. Other links and bidirectional arrows were removed solely for the sake of clarity.
Note that the difference between language-specific versus individual-specific L1 categories in
Note that 20 vowels (two exemplars per vowel) produced by one (i.e., the sixth) speaker were used only for the familiarization phase in the identification task.
The identification performance on the four vowels of interest only is reported in this study.
Different speakers were used in the testing and familiarization phases to ensure that perception performance during the testing phase was not biased by learning that might have occurred in the familiarization phase.
Note that there were no speakers whose productions were superimposed and/or collapsed in the central region.