On the temporal and functional origin of L2 disadvantages in speech production: a critical review
- 1 Departamento de Psicología Básica, Universitat de Barcelona, Barcelona, Spain
- 2 Department of Information and Communication Technologies, Universitat Pompeu Fabra, Barcelona, Spain
- 3 Laboratoire de Psychologie Cognitive, Université de Provence, Marseille, France
- 4 Institució Catalana de Recerca i Estudis Avançats, Barcelona, Spain
Despite a large amount of psycholinguistic research devoted to the issue of processing differences between a first and a second language, there is no consensus regarding the locus where these emerge or the mechanism behind them. The aim of this article is to briefly examine both the behavioral and neuroscientific evidence in order to critically assess three hypotheses that have been put forward in the literature to explain such differences: the weaker links, executive control, and post-lexical accounts. We conclude that (a) while all stages of processing are likely to be slowed down when speaking in an L2 compared to an L1, the differences seem to originate at a lexical stage; and (b) frequency of use seems to be the variable mainly responsible for these bilingual processing disadvantages.
Most of us have experienced difficulties at one level or another when speaking in a second language. Accordingly, it has been widely documented across several measures that bilinguals are less efficient when speaking in their L2 than in their L1. Compared to L1 speech, bilinguals speaking in their L2 are slower and less accurate in retrieving object names, take longer to articulate complete words and phrases, and often speak with a more or less perceptible foreign accent (e.g., Kohnert et al., 1998; Gollan and Silverberg, 2001; Roberts et al., 2002; Gollan et al., 2007, 2008; Ivanova and Costa, 2008). A great deal of research has been devoted to assessing the presence of these phenomena in a broad range of tasks using both behavioral and neuroscientific measures, leading to the development of more or less detailed theoretical accounts regarding the origin of the L2 speech production disadvantage. The aim of the present article is to examine these proposals in the light of the available evidence to see whether it is possible to establish which mechanism is mainly responsible for the L2 disadvantage and at what point in time it starts to have an impact. It should be made clear that the focus of the current review is to examine what factors are important in differentiating L2 speech from L1 speech independently of other variables. Therefore, we will start from the assumption that at least part of the bilingual speech production disadvantages across speakers (e.g., low and high proficient, early and late) and phenomena (e.g., slower naming speed, decreased naming accuracy, decreased verbal fluency, etc.) have a common source, which also seems to be an implicit assumption of the theoretical accounts that we will contrast in the present review. As a consequence of this approach, we will also evaluate the theoretical proposals on their ability to account for the totality of L2 disadvantages observed in speech production.
However, it should be noted that such an approach does not preclude differences across speakers or measures, nor does it imply that such differences cannot be important from other points of view.
Current Accounts of L2 Speech Production Disadvantages
To date, three different theoretical accounts have been put forward to explain L2 speech production difficulties. The first explanation relies on the general principles of frequency effects in speech production and assumes that the L2 disadvantage is a frequency effect in disguise (e.g., Gollan et al., 2008). In a seminal paper by Oldfield and Wingfield (1965), it was shown that speed of lexical retrieval is negatively correlated with word frequency, with comparatively longer naming times for low-frequency words. Gollan et al. (2008) argue that since most bilinguals use their non-dominant language less frequently than their dominant language, the links between representations of the semantic and the lexical system in L2 are weaker than in L1 (hence the use of “weaker links” to refer to this account), making lexical representations in L2 less accessible than those of L1. Note that, while this explanation is obvious for low proficient bilinguals, it can also account for disadvantages in highly proficient bilinguals who use both their languages on a daily basis (e.g., Spanish–Catalan bilinguals in Catalonia or English–Spanish bilinguals in the USA). That is, even though both languages are often used in these communities, there nevertheless remain some proportional differences in the amount that each language is spoken. The weaker links account assumes that even these subtle differences in frequency of use are able to create an imbalance between the dominant and non-dominant language, reflected in a processing disadvantage for the latter.
The second account is what we will refer to as the executive control account. According to such a view, the L2 disadvantage is the consequence of applying language control during speech production (e.g., Abutalebi and Green, 2007; Bialystok et al., 2008). The rationale behind this assumption is that since words from both languages of a bilingual become activated during language processing (e.g., Colomé, 2001; Thierry and Wu, 2007; Wu and Thierry, 2011), a powerful control mechanism is necessary in order to select the correct word for production while preventing interference from the non-target language (e.g., Green, 1998; Costa et al., 1999), and additional executive control resources are assumed to slow down language production. Importantly, since speaking in a weak language should require more control compared to a strong language (that is, the stronger language should always become more active), the disadvantages caused by these control mechanisms are expected to be greater in L2 compared to L1.
A final and rather specific account, which we will descriptively label the post-lexical account, attributes any differences in the speed of naming between L1 and L2 to stages posterior to lexical access, such as syllabification (e.g., Indefrey, 2006; Hanulovà et al., 2011). Indeed, phenomena such as “foreign accent” (e.g., Flege and Eefting, 1987; Flege et al., 2003) provide good reasons to assume difficulties in phonological and phonetic encoding, syllabification and even motor-planning and articulation. Hanulovà et al. (2011) highlight several reasons why this may be the case: (a) phonological encoding might be more effortful in a second language if the phonotactic constraints on syllable structure of L1 are carried over to L2 production (e.g., illegal phonotactic syllable structures in L1 might not be so in L2); (b) syllables that do not overlap across languages might be constructed on-line when speaking in L2, while in L1 syllables might be stored in a mental syllabary and thus easily available to the speaker; (c) compatible with the two other accounts, though restricted to post-lexical processes, the more effortful L2 processing may be explained in terms of frequency (e.g., syllable frequency, motor-program frequency, etc.) and/or the need to apply language control to avoid interference from the non-target language.
It is important to note that although these three accounts are rather different in terms of loci and/or sources of the L2 disadvantage, they are not mutually exclusive. It is evident that few bilinguals use their two languages equally often, that their two languages frequently have different post-lexical properties, and that they need to control in which language to speak. Thus, most probably all three explanations introduced above are involved (at least to some extent) in producing the L2 speech disadvantage. Nevertheless, an important endeavor is to determine whether any of the three has a more prominent role in doing so, and which processing stage or stages are affected by the bilingual disadvantage. Both the weaker links and the executive control accounts could be implemented at any stage or all stages of processing since the mechanisms they rely on are not necessarily bound to a particular process. In turn, while the post-lexical account offers a unique locus where L2 is slowed down, the mechanism responsible is not specified. A better understanding of the extent of the bilingual disadvantage within the system as well as its source will surely help to develop more accurate models of bilingual speech production.
In what follows we will selectively and critically review the available hemodynamic, behavioral, and electrophysiological evidence with the goal of better characterizing the bilingual disadvantages both in terms of predominantly responsible mechanisms as well as the stages where these have their impact.
What Can Hemodynamic Studies Tell us about the Origin of the Bilingual Disadvantage?
Although many hemodynamic studies have reported activation differences between L1 and L2 production, very few of them coincide in the cortical regions where the differential activation is observed. Moreover, whether or not any differences are observed at all seems to depend on the type of L2 speakers that are tested (i.e., high or low proficient, early or late bilinguals). Currently the picture emerging from the neuroimaging literature is that (a) L2 speech production entails the same brain areas as L1 speech; and (b) the left inferior frontal gyrus (LIFG) is the only region showing a reliably stronger activity in L2 compared to L1 speech across all studies, but only for bilingual speakers with either low proficiency, little exposure, or late acquisition of their L2 (e.g., Indefrey, 2006).
Several interpretations have been proposed for this stronger involvement of frontal areas during L2 speech. In support of the executive control account, Abutalebi and Green (2007) argued that, since the same neural structures are used for processing both L1 and L2, areas associated with executive control (which is thought to involve the LIFG) have to be recruited more extensively in L2 to prevent interference from L1. As speakers become more proficient, the employment of this executive control network would become more or less equal between a bilingual’s two languages, hence only low proficient speakers should display increased hemodynamic brain activity in areas such as the LIFG. On the other hand, in support of the post-lexical account, Indefrey and collaborators interpreted the enhanced LIFG activation for low proficient L2 speakers in terms of non-lexical compositional processes such as syllabification (e.g., Indefrey, 2006; Hanulovà et al., 2011). Indefrey and colleagues hypothesized that the LIFG might be particularly tailored for native language speech with its specific post-lexical rules, thus being less efficient for L2 and consequently the prime suspect in causing delays. To support their interpretation the authors refer to the meta-analysis conducted by Indefrey and Levelt (2004), in which it was found that the LIFG was the only reliably active area across all overt and covert speech production studies. Since syllabification but not articulation is necessary in both overt and covert production, it was argued that the reliable activation of the LIFG in all production tasks could be indicative of syllabification processes (see also Indefrey and Levelt, 2000).
Nevertheless, one must be cautious when assigning such uniform functionality to the LIFG as both of the accounts we just discussed do, given that the LIFG seems to be a multi-functional region. For one, the LIFG appears to be involved in other operations such as syntax, the binding of linguistic information, and the selection of competing words (e.g., Thompson-Schill et al., 1997; Friederici, 2002; Hagoort, 2005; Schnur et al., 2009). Moreover, and crucially here, the LIFG also displays different activation patterns as a function of word frequency, indicating that this brain area may also be associated (albeit partially) with the mental lexicon (e.g., Graves et al., 2007). Converging evidence for the latter was provided by Sahin et al. (2009). Using intracranial recordings these authors found effects of lexical frequency around 200 ms, grammatical effects around 300 ms, and phonological effects around 400 ms after stimulus onset, all in the LIFG. If this area is indeed involved in all these different processes, interpreting the enhanced activity during L2 speech as either executive control or post-lexical syllabification seems premature. For such a claim to be made, it is necessary to demonstrate that the increase in activity in the LIFG associated with L2 speech is selectively present for an independent variable targeting control or post-lexical stages (while not for other variables). However, the different activation patterns of the LIFG reported for bilingual speech production stem from overall comparisons between L1 and L2 naming. In our opinion this observation can be associated with any language-related operation. In other words, the available neuroimaging data, which have been used to support both the executive control and the post-lexical accounts of the L2 naming disadvantage, cannot be taken as a conclusive argument (see footnote 1).
We see no reason why, for instance, the increased activity in the prefrontal cortex for L2 naming could not be an index of reduced frequency of use, thus presumably with a first impact during lexical processing. Put differently, at present the hemodynamic data can be compatible with all three accounts of the L2 speech disadvantage.
In addition, interpreting the data stemming from fMRI studies as directly mapping onto the behavioral differences that have been observed in the literature is problematic: no consistent differences in brain activity are found for early and/or high proficient bilinguals, yet differences are found behaviorally. Assuming that any differences between L1 and L2 become smaller with the gain of L2 proficiency and exposure, the lack of hemodynamic differences might be due to the lack of temporal sensitivity of the technique: while the overall brain response during L1 and L2 speech may be quite similar for highly proficient bilinguals, subtle differences in time might not be detectable with the slow BOLD response. Thus, while this technique might be useful to highlight the most pronounced differences between L1 and L2 speech, it is not likely to provide us with a complete picture of the mechanisms responsible for and the loci affected by the bilingual disadvantage (but see footnote 1). We will now discuss certain behavioral and electrophysiological studies which seem to be in a better position to uncover the origin of bilingual processing differences between L2 and L1 speech production.
What Can Behavioral Studies Tell us about the Origin of the Bilingual Disadvantage?
One way of examining the locus of the L2 disadvantage is to take a closer look at its manifestations beyond naming speed. As already mentioned, the L2 disadvantage has been observed in a variety of measures in speech production. Apart from the increased reaction times in picture naming (e.g., Gollan et al., 2008; Ivanova and Costa, 2008; for a summary see Table 1 in Hanulovà et al., 2011 and for an example see Figure 1), decreased performance of L2 production compared to L1 has been demonstrated in several tasks. For example, in a timed verbal fluency task, in which bilinguals were asked to generate as many exemplars as possible of a given semantic category (e.g., fruits), bilingual speakers retrieved fewer category members in L2 than in L1 (Sandoval et al., 2010). If the differences between L1 and L2 occur at a post-lexical level but not within the lexicon itself, it is difficult to explain why word accessibility in general is affected. That is, a post-lexical perspective predicts that retrieving post-lexical information should be more effortful and slower in L2 speech than in L1 (e.g., the phonetic realization of the /z/ in zebra for a speaker whose L1 lacks a voiced “s”), but it does not predict that access to the words themselves should become impaired. Of course, the fact that the task was administered under time pressure might invalidate this argument: the small delay caused by post-lexical processing difficulties might result in participants producing fewer words when time is limited. Nonetheless, this explanation cannot account for another related phenomenon which has been found to be sensitive to processing difficulties in L2, namely the so-called tip-of-the-tongue (ToT) state (i.e., the feeling of knowing an object’s name, but being unable to retrieve it immediately). When bilinguals had to retrieve names of low-frequency objects in an untimed picture naming task, they experienced more ToTs in L2 than in L1 (e.g., Gollan and Silverberg, 2001).
In a similar vein as argued before for the verbal fluency data, it is not straightforward why post-lexical processing difficulties should result in a reduced accessibility of words. That is, if the L2 disadvantage only stems from a less efficient post-lexical processing, then production might be both quantitatively (slower) and qualitatively (less native-like) modulated but not absent. Finally, in the standardized Boston Naming Test, L2 speakers scored fewer correct responses than L1 speakers (e.g., Kohnert et al., 1998; Gollan et al., 2007), which again might reflect a reduced accessibility of words in L2 that is not easily accommodated by a post-lexical account of L2 disadvantages. It must be pointed out though that these findings by themselves are far from conclusive and alternative interpretations could be entertained. For instance, all three data patterns could be explained in terms of vocabulary size: Words we do not know in our second language cannot be retrieved at all. If this is the case, these data say little about the locus of L1–L2 processing differences. Nevertheless, and as we will now see, the fact that similar disadvantages are observed for early and highly proficient bilinguals speaking in their first and dominant language when compared to monolingual speakers, makes an interpretation associated with vocabulary size implausible.
Figure 1. Figure taken from Ivanova and Costa (2008). Overall mean picture naming latencies for the Spanish Monolinguals (Group 1), the Spanish–Catalan Bilinguals (Group 2), and the Catalan–Spanish Bilinguals (Group 3) tested in Ivanova and Costa (2008), averaged across high-frequency and low-frequency picture names. Error bars represent the SE.
Essentially, all the phenomena we have discussed so far with respect to hampered L2 performance are also found in L1 when comparing bilingual versus monolingual speakers (e.g., Gollan et al., 2008; Ivanova and Costa, 2008; Sadat et al., in press). Although the L2 disadvantages do not necessarily have to stem from the same source as those in L1, the correspondence between data patterns and the fact that these are modulated by the same variables (e.g., lexical frequency, cognate status; see below) opens up the possibility of a common origin. If so, this poses difficulties for an account placing differences between L1 and L2 speech solely at a post-lexical level. This is because bilinguals speaking in their dominant language do not have a foreign accent, nor are there reasons to suspect that they should experience difficulties in retrieving the language-specific post-lexical rules of their natively acquired language. While it could be argued that at high levels of proficiency (or for reversed language dominance) a bilingual’s native language is influenced by the L2, leading to a post-lexical L1 disadvantage compared to monolinguals, this is not the pattern revealed empirically. That is, the hemodynamic differences between L1 and L2 thought to be related to post-lexical processes such as syllabification and phonotactics are only reliably observed for low proficient or late bilinguals, speakers for whom the weak second language should have no or only a minimal impact on L1. In line with the idea that the L1 disadvantage for bilinguals originates prior to post-lexical stages, Pyers et al. (2009) observed that bimodal English–American Sign Language bilinguals showed more ToT states than English monolinguals.
Although this result does not preclude phonological processing differences between two verbal languages, it is nevertheless interesting to see that a similar disadvantage as that reported for unimodal bilinguals is found even though the non-target language cannot compete post-lexically with the target language. This finding suggests that bilingualism also hampers the processing of modality-independent representations (e.g., “shared lemmas” across modalities) and not just modality-specific representations such as phonemes and syllables. This leaves two prime candidates for allocating the origin of the bilingual disadvantage, namely the conceptual and the lexical level. Regarding the former, Gollan et al. (2005) observed that bilinguals named object pictures more slowly than monolinguals, but both groups classified the object pictures equally rapidly into categories. The authors argued that monolingual and bilingual speakers accessed the objects’ semantic information similarly, and that bilingual disadvantages in naming emerged from post-semantic processing. Taken together, the available evidence comparing L1 speech production between bilinguals and monolinguals indicates that the bilingual disadvantage initiates somewhere between the semantic and phonological level. Consequently, if these differences are of the same sort as those revealed between L1 and L2 in bilinguals, a similar lexical account should be entertained for the latter. Such a unitary account of the bilingual disadvantages merits further testing since it offers a parsimonious way of disambiguating where first and second language production differ.
At this point, it is important to clarify that differences between L1 and L2 could also be expected at later stages. If we assume that effects percolate from early to later processing stages, bilingualism will affect all levels of linguistic processing. This can be illustrated by yet another measure which has been found to be sensitive to differences between L2 and L1 speech in comparisons between bilingual and monolingual speakers, namely the duration of the actual utterance. In tasks requiring single word or noun phrase production, bilingual speakers exhibited longer articulatory durations than monolinguals. For example, Sadat et al. (in press) observed that bilinguals required more time than monolinguals for the articulation of a bare noun (“car”) or noun phrase (“the red car”) when naming pictures. This finding illustrates that post-lexical processes such as articulatory programming are also less efficient during bilingual speech production (at least when compared to monolingual speech), suggesting that bilingualism affects language processing across the board. The question then is whether this effect should be considered as indexing independently originated post-lexical processing differences, or whether it is a mere consequence of the less efficient processing at the lexical stage. Both options are indeed possible since, aside from articulatory programming and other post-lexical processes, effects in articulatory durations have been associated with lexical processes (e.g., Kello et al., 2000; Gahl, 2008; Bell et al., 2009; Hanulovà et al., 2011). Future investigations will have to clarify whether such differences are merely due to spill-over effects from processing difficulties at previous stages or whether they constitute a different and independently contributing cause of the L2 disadvantage.
Having argued that the lexicon is the level where bilingualism begins, but does not cease, to exert its influence, let us now turn to the potential mechanisms behind this disadvantage. Two potential accounts remain that are able to explain why lexical processing (as well as that of later stages) will be harder in a second language: the weaker links account and the executive control account. One set of studies that could help to differentiate between these two accounts are those manipulating the degree of cross-language interference induced by the task. If such interference and the consequent engagement of executive control resources are responsible for the bilingual disadvantage, inducing stronger competition should result in a greater disadvantage. Contrary to this prediction, several studies have found that the bilingual disadvantage is diminished for words that bilinguals can translate to their non-dominant language compared to words that they only know in their dominant language (e.g., Gollan and Acenas, 2004; Gollan et al., 2005). This finding is the opposite of what would be predicted by an interference-based account, since words that are not known in the non-target language cannot compete for selection and should thus be easier to retrieve in the target language. Another piece of evidence that is hard to accommodate within an interference-based model is the fact that in some studies in which bilinguals are allowed to use both of their languages, their disadvantage relative to monolinguals is attenuated (e.g., Gollan and Silverberg, 2001). Given that the possibility of using both languages should presumably lead to higher activation levels for the non-target language than in a monolingual setting, more interference should be expected.
Last but not least, in language switching tasks, where competition across languages is arguably at its maximum, the disadvantage in L2 with respect to L1 not only disappears but is even reversed in some studies (e.g., Costa and Santesteban, 2004; Christoffels et al., 2007; Gollan and Ferreira, 2009).
Another set of studies that have aimed at discriminating between the two mechanisms are those manipulating lexical frequency, since it has been argued that the weaker links and the executive control accounts make different predictions regarding how this variable should modulate the bilingual disadvantage. The weaker links account claims that because bilinguals have used words in each language less often than monolinguals, all words would have a slightly lower frequency value for bilinguals than for monolinguals. Due to the logarithmic relationship between lexical frequency and naming speed, this frequency lag might not have a big impact on words that are used very frequently (i.e., high-frequency words such as “car”), while words that are used very rarely (i.e., low-frequency words such as “pestle”) might become almost inaccessible. In this way, the weaker links hypothesis predicts that bilinguals should show larger frequency effects than monolinguals (i.e., a greater disadvantage for low-frequency than for high-frequency words) and that these effects should be larger in the non-dominant than in the dominant language. In contrast, it has been argued that an executive control account of the bilingual disadvantage should predict greater disadvantages for high-frequency words. The argument here is that words that are used often are assumed to reach higher levels of activation when they act as translation competitors and should thus induce more interference and require more executive control resources.
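The weaker links prediction described above can be made concrete with a small numerical sketch. Everything in it is hypothetical and chosen only for illustration: the function names, the intercept and slope of the latency function, the frequency counts, and in particular the assumption that the bilingual frequency lag is a fixed number of exposures that is the same for every word. That last assumption matters: if the lag were instead a fixed proportion of each word's frequency, a purely logarithmic function would yield the same cost for all words.

```python
import math


def naming_latency(frequency, intercept=900.0, slope=40.0):
    """Hypothetical naming latency (ms) that decreases with log frequency.

    The intercept and slope are illustrative values, not estimates
    taken from any of the studies discussed in the text.
    """
    return intercept - slope * math.log(frequency)


def bilingual_cost(frequency, lag):
    """Extra latency (ms) when effective frequency is lowered by a fixed lag.

    Assumption of this sketch: the lag is an absolute number of
    exposures, identical for high- and low-frequency words.
    """
    return naming_latency(frequency - lag) - naming_latency(frequency)


LAG = 5.0  # hypothetical exposure deficit for bilinguals

high_cost = bilingual_cost(1000.0, LAG)  # a "car"-like high-frequency word
low_cost = bilingual_cost(10.0, LAG)     # a "pestle"-like low-frequency word

print(f"cost for high-frequency word: {high_cost:.2f} ms")
print(f"cost for low-frequency word:  {low_cost:.2f} ms")
```

With these illustrative values the same lag costs the high-frequency word a fraction of a millisecond and the low-frequency word roughly 28 ms, which is the larger frequency effect for bilinguals that the account predicts. As the lag approaches a word's total frequency the cost grows without bound, mirroring the idea that very rare words may become almost inaccessible.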
Several studies have tested these predictions and provided us with an interesting but complex set of results. For example, when comparing bilinguals’ and monolinguals’ performance in picture naming it has been found that bilinguals’ longer naming latencies are even more pronounced for low-frequency words (e.g., Gollan et al., 2008; Ivanova and Costa, 2008; but see Sadat et al., in press and Duyck et al., 2008, in word recognition), thus confirming the predictions of the weaker links hypothesis. However, the predictions regarding the non-dominant language have not been borne out in an equally consistent manner: while Gollan et al. (2008, 2011) found the expected larger frequency effect for non-dominant language picture naming, Ivanova and Costa (2008) failed to replicate this result. On the other hand, one study has been taken to support an executive control account of the bilingual disadvantage, namely that of Sandoval et al. (2010). In a verbal fluency task, it was observed that bilinguals tended to produce more low-frequency words than monolinguals (Sandoval et al., 2010). That is, in contrast to the increased disadvantage for low-frequency words in picture naming, bilinguals spontaneously produced a higher proportion of low-frequency words than monolinguals, a finding that in principle seems to support the executive control account and cannot be explained by weaker links. This would imply that frequency of use would be the mechanism responsible for the greater frequency effect in bilinguals in picture naming, while executive control would be responsible for the higher proportion of low-frequency words produced in the verbal fluency task. Although it is possible that the task plays a decisive role in determining the mechanism behind the bilingual disadvantage, the results of Sandoval et al. (2010) require replication before jumping to such dual-mechanism conclusions.
Also, and aside from these contrasting findings, the manipulation of lexical frequency might not be as useful for disentangling the different accounts of the bilingual disadvantage as originally thought. This is because one could easily conceive of an executive control account predicting that low-frequency words should suffer more from lexical competition than high-frequency words. That is, if we assume that there is always competition in the lexical system, weak representations (such as low-frequency words or words in one’s second language) may be more vulnerable in general to the hampering effects of such competition compared to strong representations, hence requiring more executive control resources. Thus, if we assume that the potential extra amount of interference coming from high-frequency translation words is smaller than the net amount of interference coming from all competitor words on a given representation, then weak representations (such as low-frequency words and words in the second language) should still suffer the most and call for a greater amount of executive control resources. Nevertheless, even in such a scenario, executive control would not be exclusively responsible for the bilingual disadvantage since it would be bound to the frequency values of lexical representations. Moreover, the recruitment of such executive control would not be exclusive to bilinguals since it would not be triggered by interference from translation competitors, but rather by low lexical frequency.
In sum, most of the behavioral findings at our disposal show that the bilingual disadvantage is likely to originate at some point during lexical processing and that frequency of use seems to have an important role in this phenomenon. Nevertheless, concluding that the employment of the executive control network would not have any influence at all, especially in certain tasks, would be premature since the data do not conclusively argue against such an involvement. And more generally, correlating the net result stemming from behavioral data with a particular stage of processing in time is not straightforward. Therefore, in what follows we will examine studies that compare L1 versus L2 naming employing the temporally fine-grained technique of event-related brain potentials (ERPs). Doing so might aid our understanding of how these differences arise in time.
What Can the Use of ERPs Tell us about the Origin of the Bilingual Disadvantage?
In this part we will particularly focus on ERP studies which, aside from manipulating response language, also manipulated the linguistic variables of lexical frequency and cognate status. As seen above, frequency is an interesting variable to explore, since it has been shown to modulate the bilingual disadvantage. Therefore, comparing the time-course between a frequency effect and a language effect will be informative to determine (a) the onset of the bilingual disadvantage (locus), and (b) how similar, both in time and in terms of waveform morphology, the frequency and the language effect are (mechanism). Similarly, cognate status (the amount of phonological overlap between translation words) has been found to affect the bilingual disadvantage: Cognate words elicit fewer ToT states and are named faster and more accurately than non-cognate words (e.g., Costa et al., 2000; Gollan and Acenas, 2004; Kohnert, 2004). Thus, just as for lexical frequency, the comparison between the cognate effect and the language effect in time (using ERPs) should provide some useful insights regarding the locus and potentially the mechanisms producing processing differences between a bilingual’s first and second language.
Strijkers et al. (2010) report two overt picture naming experiments in which both lexical frequency and cognate status were manipulated. Early and highly proficient bilinguals named the same set of pictures either in their L1 (Spanish for a group of Spanish–Catalan bilinguals) or in their L2 (Spanish for a group of Catalan–Spanish bilinguals) while the EEG was recorded simultaneously with the overt response. The authors found an early effect that was practically identical for lexical frequency and cognate status: larger amplitudes were found in a positive-going waveform around 200 ms after picture onset for the more difficult conditions (i.e., low-frequency and non-cognate names; see also Costa et al., 2009 for a similar finding related to semantic interference). Crucially for the present purposes, a between-group comparison showed that the effect of response language (i.e., the point in time where L1 and L2 started to diverge) elicited the same electrical changes as those observed for the frequency and cognate effects, and in the same time window. That is, P2 amplitudes during L2 picture naming were increased compared to L1 picture naming. This finding has important implications for the localization of the bilingual disadvantage: While it is possible to argue that lexical frequency correlates with conceptual variables, this does not apply to cognate status. In addition, a time-course of 200 ms after stimulus onset seems too early to reflect post-lexical processing and is more likely to reflect initial stages of lexical access or, at the latest, lexico-phonological encoding (lexeme retrieval). Therefore, the authors concluded that both the lexical frequency effect and the cognate effect originate during access to the lexicon, a claim which is in line with evidence from behavioral, hemodynamic, and patient data (e.g., Navarrete et al., 2006; Almeida et al., 2007; Graves et al., 2007; Kittridge et al., 2008; Knobel et al., 2008).
Note that the first electrophysiological differences between L1 and L2 naming were measurable at the P2 component, just like the differences between high- and low-frequency words and between cognates and non-cognates. The fact that response language modulated the same ERP component as the variables of lexical frequency and cognate status supports the notion that differences between L1 and L2 speech production originate during lexical processes. Convergent evidence can be found in another ERP study in which cognate status was manipulated within participants and response language between participants in an overt picture naming task (Christoffels et al., 2007; personal communication, but also visible in their Figure 5; see footnote 2). Finally, Strijkers et al. (in preparation) manipulated cognate status and lexical frequency in a study in which the same participants named different pictures in both of their languages (Spanish L1 or L2 and Catalan L1 or L2). Again, the same P2 component was modulated by all three variables (i.e., response language, cognate status, and lexical frequency), alleviating concerns about potential variability due to the between-group comparisons in the previously mentioned studies. In other words, the data collected from overt naming ERP experiments indicate that L1 and L2 processing start to diverge during the initial phases of lexical access, in line with the inferences derived from the behavioral data.
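To make the logic of such comparisons concrete, the following is a minimal sketch of how a P2-window amplitude and a divergence onset between two conditions might be computed. It is not the analysis pipeline of the studies discussed above: the data are simulated, and the function names, the 160–240 ms window, the 500 Hz sampling rate, the amplitude threshold, and the run-length criterion are all assumptions chosen for illustration.

```python
import numpy as np


def mean_amplitude(epochs, times, t_start, t_end):
    """Mean voltage per trial within [t_start, t_end] (seconds).

    epochs: array (n_trials, n_samples) from a single electrode.
    times:  array (n_samples,), seconds relative to picture onset.
    """
    window = (times >= t_start) & (times <= t_end)
    return epochs[:, window].mean(axis=1)


def divergence_onset(erp_a, erp_b, times, threshold, min_samples):
    """First time point at which |erp_a - erp_b| exceeds `threshold`
    for at least `min_samples` consecutive samples; None if it never does."""
    above = np.abs(erp_a - erp_b) > threshold
    run = 0
    for i, flag in enumerate(above):
        run = run + 1 if flag else 0
        if run == min_samples:
            return times[i - min_samples + 1]
    return None


# Simulated data (hypothetical): a positive deflection peaking at 200 ms,
# larger for the "harder" condition (here labeled L2), plus Gaussian noise.
rng = np.random.default_rng(0)
times = np.arange(-50, 250) * 0.002  # assumed 500 Hz, -100 ms to 498 ms


def simulate(peak_uv, n_trials=40):
    bump = peak_uv * np.exp(-0.5 * ((times - 0.2) / 0.03) ** 2)
    return bump + rng.normal(0.0, 2.0, size=(n_trials, times.size))


l1_epochs = simulate(4.0)
l2_epochs = simulate(6.0)

p2_l1 = mean_amplitude(l1_epochs, times, 0.16, 0.24)
p2_l2 = mean_amplitude(l2_epochs, times, 0.16, 0.24)
print(f"P2-window mean amplitude, L1: {p2_l1.mean():.2f} uV, L2: {p2_l2.mean():.2f} uV")

onset = divergence_onset(l1_epochs.mean(axis=0), l2_epochs.mean(axis=0),
                         times, threshold=1.0, min_samples=10)
print("Estimated divergence onset (s):", onset)
```

Applying the same two measures to the frequency contrast, the cognate contrast, and the language contrast is what allows their onsets and amplitudes to be compared within a common time window. In real data the consecutive-sample criterion would normally be replaced by a proper statistical test.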
Regarding the underlying mechanisms, it is interesting to note the similarity in the ERPs between the language effect and the frequency and cognate effects, respectively. Low-frequency L1 words seem to behave in the same manner as high-frequency L2 words, and the same pattern emerges when comparing the electrophysiological signature of cognate status between languages (see Figure 2). At first sight, this pattern of results supports a frequency-based explanation of the relative difficulty of accessing lexical representations in one’s second language. Nevertheless, one should be cautious here. The fact that the ERP expression of the language effect overlaps perfectly with that of the frequency and cognate effects, namely a more positive brain response for the harder condition (low frequency, non-cognates, L2), is also consistent with these effects being driven by the amount of executive control applied to the lexicon. It all depends on what this P2 modulation indexes. Regarding the functional significance of the P2 component, in a recent monolingual study Strijkers et al. (2011) demonstrated that lexical modulations at the P2 seem to be elicited only when there is a conscious intention to speak. The authors argued that this particular P2 (which they labeled descriptively the production P2, pP2) is engendered by the interaction between goal-directed top-down processes, such as attention, and the level of activation of items within the lexicon. If we use this functional characterization of the pP2 to interpret the language effect in the previously discussed ERP results, both weaker links between concepts and words in L2 compared to L1 and a stronger recruitment of executive control (understood here as proactive attention) during L2 speech compared to L1 speech may contribute to the bilingual disadvantage as indexed by the pP2.
That is, processing differences between L1 and L2 during lexicalization would emerge because L2 representations have lower overall levels of activation within the lexicon than L1 representations. At the same time, the lower activation level of L2 words will call for more executive control (understood here as proactive attentional resources) to retrieve words in L2 compared to L1. Thus, bilinguals would enhance a priori the lexical representations related to the target language (see also Wu and Thierry, 2011), and this top-down enhancement would be greater for less accessible representations such as low-frequency words or words in the second language. In such a scenario, the main source of the bilingual disadvantage would be frequency of use, since the additional attentional resources during L2 speech are invoked precisely to compensate for the poorer accessibility in L2 and thus speed up rather than slow down behavioral performance. It should be noted that the type of executive control involved would be rather different from that portrayed by Abutalebi and Green (2007), according to whom the extra involvement of the executive control network is directly related to bilingualism (i.e., the purpose of executive control is to reactively resolve interference from translation words, representations specific to bilingual speakers). In contrast, here we propose that the crucial factor determining the degree of executive control engagement is the strength of a given representation, a variable which is general to all speakers and not specific to bilingualism.
That is, bilingualism would exert an indirect influence on the extra-linguistic processes: Through the division of speech between two languages and the resulting lower overall strength of lexical representations for a bilingual (especially in the non-dominant language), preparing the system for speech will require more, but not different, proactive attentional resources in an L2 compared to an L1, or for a bilingual compared to a monolingual speaker. Note that this hypothesis regarding the cognitive source of the pP2 requires further testing and that it does not preclude the engagement of additional resources of reactive executive control later on. The main point we wish to make here is that the simpler solutions should be thoroughly considered before embracing theories involving qualitative differences between monolingual and bilingual language processing.
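The reasoning in the preceding paragraphs can be illustrated with a toy model. The sketch below is not a model proposed in the literature reviewed here; the logarithmic activation function, all parameter values, and the function names are hypothetical and serve only to show how dividing use between two languages, combined with a proactive attentional boost that scales with the activation deficit, yields a quantitative rather than qualitative L2 disadvantage.

```python
import math


def resting_activation(uses):
    """Hypothetical resting activation: grows with the log of cumulative use."""
    return math.log(1 + uses)


def naming_latency(uses, attention_gain=0.0, base_ms=600.0,
                   ms_per_unit=40.0, ceiling_uses=1_000_000):
    """Toy naming latency in ms (all parameters are illustrative assumptions).

    The deficit is the distance between a word's resting activation and an
    arbitrary ceiling. Proactive attention removes a fixed proportion
    (0 <= attention_gain < 1) of that deficit, so weaker representations
    receive more enhancement in absolute terms, but never full compensation.
    """
    if not 0.0 <= attention_gain < 1.0:
        raise ValueError("attention_gain must be in [0, 1)")
    deficit = resting_activation(ceiling_uses) - resting_activation(uses)
    return base_ms + ms_per_unit * deficit * (1.0 - attention_gain)


# A word used 10,000 times by a monolingual; a bilingual divides the same
# amount of use between L1 (70%) and L2 (30%). The split is an assumption.
total_uses = 10_000
for label, uses in [("monolingual", total_uses),
                    ("bilingual L1", 0.7 * total_uses),
                    ("bilingual L2", 0.3 * total_uses)]:
    without = naming_latency(uses)
    with_attention = naming_latency(uses, attention_gain=0.3)
    print(f"{label:13s} no attention: {without:6.1f} ms   "
          f"with attention: {with_attention:6.1f} ms")
```

In this sketch the same mechanism applies to every speaker and every word. Attention shortens latencies most for the least-used words, so it attenuates the L2 disadvantage without reversing the ordering monolingual < L1 < L2.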
Figure 2. Figure taken from Strijkers et al. (2010). (A) Shows low-frequency and high-frequency ERPs compared with non-cognate and cognate ERPs at Cz in Experiment 1 (right) and Experiment 2 (left). The frequency ERPs are represented by solid gray and black lines. The cognate ERPs are represented by dotted gray and black lines. Negativity is plotted upward. (B) Shows a between-experiments comparison of the low- and high-frequency ERPs (left), non-cognate and cognate ERPs (right), and overall naming in L1 and naming in L2 ERPs (bottom). Negativity is plotted upward.
In sum, reviewing the electrophysiological evidence provides good grounds to believe that the bilingual disadvantage originates during lexicalization and that a reduced frequency of use is the direct cause of the hampering effects on linguistic processing associated with bilingualism. We have also seen that an indirect and additional role of executive control is possible, although some of the available ERP evidence opens up the possibility that this executive control consists in a speaker-general proactive enhancement of weak lexical representations.
Conclusion
The overall picture of the origin of the bilingual disadvantage that emerges when combining the different pieces of evidence briefly reviewed in the present article is the following: both the behavioral and the available ERP evidence indicate an early lexical origin of the processing differences between L1 and L2, although these differences seem to persist until the very moment of articulation. Furthermore, the simplest explanation for the bilingual disadvantage relates to reduced frequency of use, whereas the engagement of additional resources of executive control is likely to attenuate rather than increase lexical retrieval difficulties. Frequency being a variable that affects all speakers, this conclusion entails that speech production differences between L1 and L2 and between monolinguals and bilinguals are essentially a matter of quantity: In this case, “the more the better.”
Conflict of Interest Statement
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Acknowledgments
This work was supported by grants from the Spanish government (PSI2008-01191, Consolider Ingenio 2010 CSD2007-00012) and the Catalan government (Consolidado SGR 2009-1521), a predoctoral grant from the Catalan government (FI) to Elin Runnqvist and predoctoral grants from the Spanish government (FPU) to Kristof Strijkers and Jasmin Sadat.
Footnotes
- 1 Let it be clear that we do not wish to claim that the hemodynamic technique overall will not be able to differentiate between the accounts. We merely argue that the studies available in the literature so far are insufficient to do so. For instance (though the argument is not restricted to this example), Abutalebi and Green (2007; see Abutalebi et al., 2008) claimed that the increased BOLD response in the left prefrontal cortex (along with other regions) for the same words in a bilingual versus a monolingual naming context is an insightful illustration of this region’s role in language control. However, one can easily argue that in a bilingual naming context cross-language activation gets boosted, making lexical access and selection harder. In this scenario the LIFG increase might be related to an increase in processing within the lexicon and not per se a reflection of executive control. Demonstrating differences in brain responses between language tasks that rely on different amounts of control is insightful, but insufficient. To provide explicit evidence, it is necessary to show within the same task that the pattern of activation of a linguistic variable (e.g., lexical frequency) remains constant across the particular executive control conditions in those regions thought to be responsible for language control. To our knowledge, such an endeavor has not been undertaken yet.
- 2 Note that this result is not reported in Christoffels et al. (2007) and that Hanulovà et al. (2011) use the lack of reported early language effects as an argument for a post-lexical locus of the bilingual L2 naming delay.
References
Abutalebi, J., Annoni, J. M., Zimine, I., Pegna, A. J., Seghier, M. L., Hannelore, L. J., Lazeyras, F., Cappa, S. F., and Khateb, A. (2008). Language control and lexical competition in bilinguals: an event-related fMRI study. Cereb. Cortex 18, 1496–1505.
Costa, A., Strijkers, K., Martin, C., and Thierry, G. (2009). The time course of word retrieval revealed by event-related brain potentials during overt speech. Proc. Natl. Acad. Sci. U.S.A. 106, 21442–21446.
Gollan, T. H., and Acenas, L. A. (2004). What is a TOT? Cognate and translation effects on tip-of-the-tongue states in Spanish–English and Tagalog–English bilinguals. J. Exp. Psychol. Learn. Mem. Cogn. 30, 246–269.
Gollan, T. H., and Ferreira, V. S. (2009). Should I stay or should I switch? A cost-benefit analysis of voluntary language switching in young and aging bilinguals. J. Exp. Psychol. Learn. Mem. Cogn. 35, 640–665.
Gollan, T. H., Montoya, R. I., Cera, C., and Sandoval, T. C. (2008). More use almost always means a smaller frequency effect: aging, bilingualism, and the weaker links hypothesis. J. Mem. Lang. 58, 787–814.
Gollan, T. H., Slattery, T. J., Goldenberg, D., Van Assche, E., Duyck, W., and Rayner, K. (2011). Frequency drives lexical access in reading but not in speaking: the frequency-lag hypothesis. J. Exp. Psychol. Gen. 140, 186–209.
Graves, W. W., Grabowski, T. J., Mehta, S., and Gordon, J. K. (2007). A neural signature of phonological access: distinguishing the effects of word frequency from familiarity and length in overt picture naming. J. Cogn. Neurosci. 19, 617–631.
Hanulovà, J., Davidson, D. J., and Indefrey, P. (2011). Where does the delay in L2 picture naming come from? Psycholinguistic and neurocognitive evidence on second language word production. Lang. Cogn. Process. 26, 902–934.
Indefrey, P. (2006). “A meta-analysis of hemodynamic studies on first and second language processing: which suggested differences can we trust and what do they mean?” in The Cognitive Neuroscience of Second Language Acquisition, eds M. Gullberg and P. Indefrey (Malden, MA: Blackwell Publishing), 279–304.
Kello, C. T., Plaut, D. C., and MacWhinney, B. (2000). The task dependence of staged versus cascaded processing: an empirical and computational study of Stroop interference in speech production. J. Exp. Psychol. Gen. 129, 340–360.
Kittridge, A. K., Dell, G. S., Verkuilen, J., and Schwartz, M. F. (2008). Where is the effect of frequency in word production? Insights from aphasic picture-naming errors. Cogn. Neuropsychol. 1, 1–30.
Schnur, T. T., Schwartz, M. F., Kimberg, D. Y., Hirshorn, E., Coslett, H. B., and Thompson-Schill, S. L. (2009). Localizing interference during naming: convergent neuroimaging and neuropsychological evidence for the function of Broca’s area. Proc. Natl. Acad. Sci. U.S.A. 106, 322–327.
Thompson-Schill, S. L., D’Esposito, M., Aguirre, G. K., and Farah, M. J. (1997). Role of left inferior prefrontal cortex in retrieval of semantic knowledge: a reevaluation. Proc. Natl. Acad. Sci. U.S.A. 94, 14792–14797.
Keywords: second language speech production, bilingual disadvantage, first versus second language processing differences
Citation: Runnqvist E, Strijkers K, Sadat J and Costa A (2011) On the temporal and functional origin of L2 disadvantages in speech production: a critical review. Front. Psychology 2:379. doi: 10.3389/fpsyg.2011.00379
Received: 04 May 2011;
Accepted: 29 November 2011;
Published online: 16 December 2011.
Edited by: Guillaume Thierry, Bangor University, UK
Copyright: © 2011 Runnqvist, Strijkers, Sadat and Costa. This is an open-access article distributed under the terms of the Creative Commons Attribution Non Commercial License, which permits non-commercial use, distribution, and reproduction in other forums, provided the original authors and source are credited.
*Correspondence: Elin Runnqvist, Departamento de Psicología Básica, Universitat de Barcelona, Passeig de la Vall d’Hebron, 171, 08035 Barcelona, Spain. e-mail: email@example.com