Age and Proficiency in the Bilingual Brain Revisited: Activation Patterns Across Different L2-Learner Types

complication that potentially contributes further to this lack of clarity. Careful control of these variables is crucial for teasing apart their effects, yet almost all previous neuroimaging studies have studied one or the other in isolation. Thirty-five participants of varying proficiency and AoA were scanned using fMRI while performing an English (L2) past tense task; all were formal L2 learners. Early high proficiency bilinguals (EAHP) were contrasted with late high proficiency bilinguals (LAHP) in three conditions: (i) regular inflection; (ii) irregular inflection; and (iii) regularity × AoA. In line with previous findings, LAHP (vs. EAHP) bilinguals showed more extensive activation across multiple regions for both regular and irregular inflection. The left inferior frontal gyrus (IFG; BA47) was one region that showed significant activation in condition (iii). EAHPs engaged this region selectively for regular but not irregular inflection while LAHPs activated it during both types of inflection. Late high and late low proficiency (LALP) bilinguals were also contrasted in three conditions: (i) regular inflection; (ii) irregular inflection; and (iii) regularity × proficiency. In all regions showing significant differences, LAHPs showed greater activation relative to LALPs (regular and irregular conditions). In the regularity × proficiency condition the left IFG was also a significantly activated region. Previous studies suggest this region is positively associated with high proficiency but this has not always been replicated. LAHPs showed increased activation in BA45 but not BA44, suggesting L2 is a controlled rather than automatic process in this group despite being highly proficient. Our study suggests AoA and proficiency both influence bilingual brain activation independently, an important replication given only two other neuroimaging studies have experimentally manipulated both variables within the same study. We also provide evidence for how different AoA influences left IFG engagement during L2 processing, and for the hypothesis that BA45 is associated with high proficiency when degree of automaticity is lower.


INTRODUCTION
In the study of the bilingual brain it has been shown that activated regions underlying a bilingual's first (L1) and second (L2) languages do not always overlap (e.g., Kim et al., 1997) and attempts have been made to identify the factors that contribute to these differences. L2 age of learning/acquisition (AoA) and L2 proficiency are two factors that have received close attention over the last 20 years. However, despite this close scrutiny we are actually not that much closer to knowing exactly how each factor influences the bilingual brain (Watkins et al., 2017). A specific question that remains unanswered is which of the two is more crucial in determining the pattern of brain activation during L2 processing, and related to that is how each factor influences functional activation. Addressing such questions has a potential impact on larger theoretical questions such as those about neural plasticity and the effects of experience as well as degree of automaticity on the bilingual brain (Li et al., 2014; Nichols and Joanisse, 2016; Vissiennon et al., 2017).
One difficulty when examining either L2 AoA or proficiency is that the two are closely intertwined and therefore possible confounds of each other. Further complicating the picture is that AoA may be confounded with the way L2 was learnt, referred to as either "modality of learning" (Fabbro, 2001) or "mode of learning" (Marrero et al., 2002). Many early L2 bilinguals learn their L2 informally or via immersion while late bilinguals tend to learn in more formal, non-immersion environments. But this is obviously not always the case. Thus, many previous studies may have thought they were studying a homogenous group of early or late bilinguals but it is possible that there were hidden differences related to mode of learning within each group. Previously reported differences could therefore have been a reflection of mode of learning rather than AoA per se.
In order to avoid the confound between AoA and proficiency, previous neuroimaging studies have usually held one of these constant while the other was examined, i.e., they examined only one of the factors in isolation. Perani et al. (1998) were one of the first groups to do this, using positron emission tomography (PET). An earlier study of theirs (Perani et al., 1996) had shown that auditory processing of stories in the L1 (vs. L2) of late acquisition (age >7 years)/low (L2) proficiency Italian-English bilinguals caused more extensive activation in the temporal lobes and temporoparietal cortex. But these findings were difficult to interpret because both AoA and proficiency were different in each language: L1 had an early AoA plus high proficiency while L2 had a late AoA plus low proficiency. When Perani et al. (1998) addressed this issue by holding proficiency constant in their subsequent study, they found no differences in activation between the two groups. Because previously found activation differences disappeared once proficiency was equated, they concluded that L2 proficiency rather than AoA determines differences in L2 brain activation.
The debate has continued for years since, not helped, as Waldron and Hernandez (2013) point out, by the fact that few (neuroimaging) studies examining L2 AoA have adequately controlled for L2 proficiency. Some studies have found support for Perani et al.'s (1998) conclusion on proficiency being the determining factor (e.g., Frenck-Mestre et al., 2005; Hesling et al., 2012). Others have not found such support; AoA effects have been demonstrated even when proficiency was held constant (e.g., Saur et al., 2009; Consonni et al., 2013; Archila-Suerte et al., 2015; Hernandez et al., 2015). A smaller number have argued that both play a role but affect the brain differently, for example proposing that AoA influences regions involved in grammatical processing while proficiency affects those used in semantic processing (Wartenburger et al., 2003). A recent meta-analysis of neuroimaging studies spanning the period 1998-2014 (Liu and Cao, 2016) examined how L2 AoA affects L1 vs. L2 networks and reported that late (vs. early) bilinguals rely on additional regions when processing L2 compared to their L1. However, while this meta-analysis confirmed that AoA does affect the bilingual brain, it focused exclusively on studies that recruited high proficiency bilinguals. Papers with low or moderate proficiency bilinguals were excluded, making it difficult to draw any firm conclusions about L2 proficiency effects.
Interestingly, very few imaging studies have manipulated both L2 AoA and proficiency in the same study; as described above, they have tended to manipulate one factor while keeping the other constant. Two notable exceptions are Wartenburger et al. (2003) and Nichols and Joanisse (2016). Wartenburger et al. (2003) experimentally manipulated the two factors by comparing three groups of German-Italian bilinguals differing in their AoA (early/late) and proficiency (high/low): an early acquisition, high proficiency group (EAHP; n = 11), a late acquisition, high proficiency group (LAHP; n = 12) and a late acquisition, low proficiency group (LALP; n = 9). The early bilinguals had been exposed to their L2 since birth, while mean AoA for late bilinguals was 18.9 years (LAHP) and 20.4 years (LALP), respectively. Participants were scanned using functional magnetic resonance imaging (fMRI) while performing grammatical and semantic judgements on presented sentences. Wartenburger et al. (2003) found that L2 AoA was more important in determining brain regions involved in L2 grammatical processing, while L2 proficiency was more important in determining those involved in L2 semantic processing. This finding that grammatical processing might be influenced differently to semantic processing is perhaps not so surprising; as Wartenburger et al. (2003) point out, findings in other domains (monolingual electrophysiological, lesion and functional neuroimaging studies) have led to the idea that grammar is acquired incidentally and implicitly while lexical-semantic processing is carried out by the explicit memory and knowledge system (Paradis, 1994; Ullman, 2001, 2004, 2005, 2016; Lebrun, 2002; Hernandez and Li, 2007; Abutalebi, 2008). But what is significant about Wartenburger et al.'s (2003) study is that by directly manipulating each factor it was able to show what the respective contributions of AoA and proficiency might be.
Nichols and Joanisse's (2016) study is the other to have tested the independent effects of both L2 AoA and proficiency on L1 and L2 neural activation. Twenty-two Mandarin-English bilinguals of varying L2 AoA and proficiency were scanned while performing a lexical-semantic task (picture-word matching); both functional and structural data were collected. The bilinguals in this study were shown to have a weak relationship between AoA and proficiency. This allowed the researchers to take a different approach to Wartenburger et al. (2003), in that L2 AoA and proficiency were treated as continuous variables. Like Wartenburger et al. (2003), Nichols and Joanisse (2016) also found both L2 AoA and proficiency effects, although in this study AoA effects were found in the realm of semantic processing. AoA was found to uniquely predict activity differences in seven areas that included the right parahippocampal gyrus, bilateral superior temporal gyrus (STG) and bilateral inferior frontal gyrus, while proficiency was uniquely associated with activation in the left parahippocampal gyrus and right cingulate (Nichols and Joanisse, 2016).
That there are only two studies that have specifically looked at both factors is surprising given the potential confound that exists between the two and the current lack of consistent findings among neuroimaging studies on bilinguals (see also Wong et al., 2016 for a review). The importance of replication in neuroimaging has recently been highlighted (Evans, 2017), particularly given that the high cost of data collection often leads to smaller sample sizes. Both direct and conceptual replication studies have a crucial role to play in increasing the confidence in neuroimaging findings (Open Science Collaboration, 2015), with the latter referring to a study in which a previous result or hypothesis is tested with different methods (Schmidt, 2009; Evans, 2017).
The current study was conducted to examine (i) the independent contributions of L2 AoA and proficiency to L2 neural representation, and (ii) the nature of those contributions and what light they might shed on questions about L2 learning, such as plasticity and automaticity of processing. Given the first aim, our study could also be viewed as an attempt to conceptually replicate the findings of Wartenburger et al. (2003) and Nichols and Joanisse (2016), and contribute to neuroimaging data on L2 AoA and proficiency. Following Wartenburger et al. (2003), both L2 AoA and proficiency levels in our study were experimentally manipulated. Mode of learning was kept constant across all participants in recognition of its status as a potential confound. We used a past-tense task to ensure task uniformity across different language processes (computational for regular verbs vs. lexical access for irregular verbs; see Ullman, 2001) and to keep processing to a single-word level (see section Stimuli for further explanation).

Participants
Thirty-five bilingual Mandarin-English participants, all of whom learnt English at school (i.e., formal setting), were assigned to one of three groups: Early Acquisition High Proficiency (EAHP; n = 12), Late Acquisition High Proficiency (LAHP; n = 12) or Late Acquisition Low Proficiency (LALP; n = 11). While there are advantages to Nichols and Joanisse's (2016) approach of treating AoA and proficiency as continuous variables, we chose not to do this given we could not guarantee the decoupling of L2 AoA and proficiency in our sample in the way they were able to. The study received ethical approval from the local Institutional Review Board (Domain Specific Review Board (A), National Healthcare Group, National University Hospital, Singapore) and all individuals provided written informed consent prior to participation.
Following previous neuroimaging studies (Perani et al., 2003; Meisel, 2004; Waldron and Hernandez, 2013; Klein et al., 2014) the cut-off for early bilinguals was set at age 7, although we do acknowledge that there is disagreement over what this cut-off age ought to be and in some cases debate over the traditional notion of the cut-off age itself. L2 (English) proficiency was determined via the Self Report Classification Tool (Lim et al., 2008), which has been validated on 198 undergraduates studying at a Singapore university and was chosen because it best reflected our bilingual context; our participants were also all students at the National University of Singapore. Speaking and listening proficiency ratings in the questionnaire were used for the purpose of this study; in a small number of cases where the two were different from each other the average of the two ratings was taken. Li et al. (2019) and Tomoschuk et al. (2018) have discussed the downside of relying purely on self-reported proficiency. To minimize subjectivity in ratings and to help our participants identify which rating best suited them, all participants were provided with a detailed guide explaining with examples what they should be able to do in their L2 for each rating. Research assistants were also on hand to address any doubts participants had. Participants' self-ratings were consistent with their General Certificate of Education (GCE) Ordinary Level English examination grades, which we also recorded as part of each participant's profile. The GCE paper examines written and oral production and comprehension.
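The grouping logic described above can be summarized in a minimal sketch. The cut-off of age 7 and the averaging of speaking and listening ratings come from the text; the rating scale, the `high_cutoff` parameter, the strict "younger than 7" comparison and all names in the code are assumptions made purely for illustration, since the paper does not state them.

```python
from dataclasses import dataclass

EARLY_AOA_CUTOFF = 7  # years; cut-off for early bilinguals stated in the text


@dataclass
class Participant:
    aoa_years: float   # age at which L2 learning began
    speaking: float    # self-rated speaking proficiency (scale is hypothetical)
    listening: float   # self-rated listening proficiency (scale is hypothetical)


def proficiency_rating(p: Participant) -> float:
    """Average the two ratings; identical ratings are returned unchanged."""
    return (p.speaking + p.listening) / 2


def assign_group(p: Participant, high_cutoff: float):
    """Return EAHP, LAHP or LALP; `high_cutoff` is a hypothetical threshold."""
    early = p.aoa_years < EARLY_AOA_CUTOFF  # assumption: strictly below 7
    high = proficiency_rating(p) >= high_cutoff
    if early and high:
        return "EAHP"
    if not early and high:
        return "LAHP"
    if not early and not high:
        return "LALP"
    return None  # early/low proficiency is not a cell in this design
```

The early/low proficiency combination returns `None` because the design described here contains only three of the four possible cells.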
EAHP and LAHP bilinguals were matched on chronological age and proficiency (Table 1).
The two groups differed significantly in their mean L2 AoA; mean AoA for EAHP was 4.75 years vs. 9.38 years for LAHP (p = 0.002).
All participants were right-handed and had no history of neurological damage.

Experimental Paradigm
While lying in the MRI scanner (Siemens (Erlangen, Germany) 1.5T Symphony MRI scanner), participants were presented with the present tense of a verb (regular or irregular). Participants were told to think of the past tense form but not answer aloud until they saw a row of fixation crosses ("++++++"), which appeared during the following 2 s silent period. Participants produced their answers aloud during this silent period (i.e., overt rather than covert generation) in order to minimize artifacts associated with head movement. Stimuli were presented using Presentation software (version 0.60, Neurobehavioural Systems, USA). Lowercase words appeared for 3 s in white Arial font (size 60) against a black background. Participants received the same oral instructions at the beginning of the experiment not to speak until the fixation crosses were presented, in accordance with our previous study on monolingual speakers (Oh et al., 2011). Participants were also reminded at the start of each run to provide the past tense form of the presented words aloud during the silent period.
Responses were audio recorded by one of the experimenters using an Olympus digital audio recorder (model DM20), and then transcribed. Any errors were noted and excluded from the imaging analysis.
Participants were given a familiarization session outside the scanner to acquaint them with the experiment. The 20 verbs used in the familiarization session were not used in the actual scanning session.

Stimuli
The stimuli in this experiment were mainly based on those used by Oh et al. (2011). Stimuli were made up of 40 regular and 40 irregular verbs (total n = 80). Past tense inflection is one way to examine both lexical access (via irregular inflection, thought to tap into semantic processes) and "grammatical" or morphological processing (via regular inflection, thought to reflect computational processes; see Ullman, 2001, 2004, 2005, 2016). In other words, past tense inflection allowed us to examine two different processes in the way Wartenburger et al. (2003) did, but at the single-word level and within a single task. It has been pointed out that the interpretation of neuroimaging data becomes more difficult as complexity of the task increases (Fabbro, 2001). Thus, using this past tense task helped avoid the interpretation problem that comes with more complex sentence-level processing tasks or comparing activation across different tasks (see Fabbro, 2001). Verbs were matched (pair-wise) for log frequency (of the past tense form), number of phonemes and phonological complexity. The latter was defined in terms of Consonant-Vowel (CV) structure (see Bird, 2003), where a CCVCC structure (as in the word "trust") would be more complex than a CVC structure present in a word such as "sit." Frequency of words was determined according to the Singapore International Corpus of English (ICE) database (http://ice-corpora.net/ice/index.html), which is part of the International Corpus of English (Greenbaum, 1996). There were four runs, each consisting of 10 regular and 10 irregular verbs randomly presented. The verbs in each set of 40 were randomly assigned to the four runs. Please see Appendix A for the full list of verbs.
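As an illustration of the CV-structure notion and of the run structure described above, the following sketch derives a CV skeleton and splits two sets of 40 verbs into four runs of 10 regular and 10 irregular items. It is not the procedure actually used: the skeleton is computed on spelling rather than on phonemes, the verb labels are placeholders rather than the items in Appendix A, and the function names and seed are hypothetical.

```python
import random

VOWEL_LETTERS = set("aeiou")


def cv_skeleton(word):
    """Orthographic approximation of CV structure (real coding uses phonemes)."""
    return "".join("V" if ch in VOWEL_LETTERS else "C" for ch in word.lower())


def assign_to_runs(regular, irregular, n_runs=4, seed=0):
    """Randomly split each verb set evenly across runs, then shuffle each run."""
    rng = random.Random(seed)
    reg, irr = list(regular), list(irregular)
    rng.shuffle(reg)
    rng.shuffle(irr)
    per_reg, per_irr = len(reg) // n_runs, len(irr) // n_runs
    runs = []
    for i in range(n_runs):
        trials = [(v, "regular") for v in reg[i * per_reg:(i + 1) * per_reg]]
        trials += [(v, "irregular") for v in irr[i * per_irr:(i + 1) * per_irr]]
        rng.shuffle(trials)  # random presentation order within the run
        runs.append(trials)
    return runs


# Placeholder labels standing in for the 40 regular and 40 irregular verbs.
regular_verbs = [f"reg{i:02d}" for i in range(40)]
irregular_verbs = [f"irr{i:02d}" for i in range(40)]
runs = assign_to_runs(regular_verbs, irregular_verbs)
```

Under this approximation "trust" maps to CCVCC and "sit" to CVC, matching the examples in the text.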

Image Acquisition and Data Analysis
Images were acquired using blipped gradient-echo planar imaging (flip angle = 90°; 64 × 64 pixel matrix; FOV = 192 × 192 mm). Acquisition time was 3,000 ms, followed by 2,000 ms of silence. During acquisition, participants saw a verb and were instructed to think of the past tense of the verb in silence. Once the word was replaced by a series of crosses "++++++" (i.e., during the silent period), participants provided their answers aloud. Repetition time (TR) was therefore 5,000 ms.
Seventy-two images (32 oblique axial 3-mm slices with 0.3 mm gap, descending interleaved) were collected per run, depending on the randomly inserted (for jittering purposes) baseline period where no words appeared; only a "++++++" was seen on the screen during these baseline periods. The periods could last 5,000, 10,000, or 15,000 ms. In each run there was a two-in-six chance of a 5,000 ms period appearing after a trial, a three-in-six chance of a 10,000 ms period appearing and a one-in-six chance of a 15,000 ms period appearing. The average length of a run was 355 s (minimum 350 s, maximum 360 s). In addition, 50 s and at least 30 s of fixation were presented at the beginning and end of each run, respectively, to ensure signal homogeneity (the first four acquisitions were ignored) and sufficient baseline periods.
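The timing figures above lend themselves to a small worked calculation. The sketch below treats the stated two-, three- and one-in-six chances as a probability distribution over baseline durations and converts run lengths into numbers of acquisitions at the 5 s TR; the function and constant names are hypothetical, and the calculation is illustrative rather than a reconstruction of the presentation script.

```python
from fractions import Fraction

TR_S = 5  # one acquisition (3 s) plus the silent period (2 s)

# Baseline duration in seconds -> chances out of six, as stated in the text.
BASELINE_ODDS = {5: 2, 10: 3, 15: 1}


def expected_baseline_s(odds=BASELINE_ODDS):
    """Mean jittered baseline duration implied by the stated odds."""
    total = sum(odds.values())
    return sum(Fraction(duration * weight, total) for duration, weight in odds.items())


def n_volumes(run_length_s, tr_s=TR_S):
    """Number of whole acquisitions that fit in a run of the given length."""
    return run_length_s // tr_s


mean_baseline = expected_baseline_s()   # 55/6 s, i.e. about 9.17 s
shortest_run = n_volumes(350)           # 70 acquisitions
longest_run = n_volumes(360)            # 72 acquisitions
```

On this reading, the figure of seventy-two images corresponds to the longest (360 s) run.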
Brain Voyager QX (version 2.3, Brain Innovation, Holland) was used to analyze images. Slice scan time correction, motion correction, spatial smoothing (8 mm FWHM) and linear trend removal were applied to these functional images. These were then registered to the MPRAGE (magnetization prepared rapid acquisition gradient echo) images, and the realigned data then transformed into Talairach space. A random effects multi subject general linear model (GLM) was then computed. This hierarchical analysis entailed a first level analysis in which all experimental conditions for each subject were modeled as separate regressors. Each regressor was convolved with a canonical haemodynamic response function (HRF) peaking 5 s after onset of word presentation (Henson and Friston, 2007). The resulting GLM thus contained 3 regressors per subject: GenReg (generation of regular past tense verbs), GenIrreg (generation of irregular past tense verbs) and Others (for errors). Each regressor was then analyzed at a second level using separate group-level random-effects t-tests: EAHP vs. LAHP, as well as LAHP vs. LALP. The resultant group-level statistical parametric t-maps were corrected for multiple comparisons using cluster-size thresholding, described below (Forman et al., 1995; Goebel et al., 2006).
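To make the first-level modeling step concrete, the sketch below builds one predictor per condition by placing an HRF at each event onset and sampling once per acquisition. It is a generic illustration, not the Brain Voyager implementation: the double-gamma form and its parameters (shapes 6 and 16, undershoot ratio 1/6) are a common textbook choice assumed here because its peak falls close to 5 s, events are treated as impulses, and the onset times are invented.

```python
import math


def double_gamma_hrf(t, a1=6.0, a2=16.0, undershoot_ratio=1 / 6):
    """Generic double-gamma HRF (assumed form); zero at or before onset."""
    if t <= 0:
        return 0.0
    response = t ** (a1 - 1) * math.exp(-t) / math.gamma(a1)
    undershoot = t ** (a2 - 1) * math.exp(-t) / math.gamma(a2)
    return response - undershoot_ratio * undershoot


# Locate the peak on a fine grid to check that it sits near 5 s after onset.
DT = 0.1
grid = [double_gamma_hrf(i * DT) for i in range(320)]
peak_time_s = max(range(len(grid)), key=grid.__getitem__) * DT


def make_regressor(onsets_s, n_volumes, tr_s=5.0):
    """Sum one HRF per event onset, sampled at the start of each acquisition."""
    return [
        sum(double_gamma_hrf(v * tr_s - onset) for onset in onsets_s)
        for v in range(n_volumes)
    ]


# Hypothetical onsets (s) for the three conditions named in the text.
design = {
    "GenReg": make_regressor([0.0, 30.0], n_volumes=12),
    "GenIrreg": make_regressor([15.0, 45.0], n_volumes=12),
    "Others": make_regressor([], n_volumes=12),  # no error trials in this toy run
}
```

Each list in `design` corresponds to one column of the per-subject design matrix; the group-level t-tests then operate on the weights estimated for these columns.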
Each map was initially thresholded at a voxel-wise p-value (p < 0.01, uncorrected) that yielded distinct segregated regions of interest (ROIs). These maps were then subjected to a whole brain (no mask) correction criterion based on the estimate of the map's spatial smoothness (the FWHM was estimated by BVQX to be 1.417 in native voxel resolution for all contrasts) and 1,000 iterations of Monte Carlo simulation to determine the minimum cluster size threshold. The thresholds determined for each contrast are listed, respectively, in the appropriate tables (Tables 3, 4). These cluster-size thresholds were then applied to the group-level statistical t-maps to yield a corrected 5% false positive rate. We selected any voxels that were activated above the indicated threshold (p < 0.05, corrected) and reported the peak for each significantly activated cluster (Tables 3, 4). In addition, z-normalized regressor values (averaged across all ROI voxels) for each condition for each group were extracted and further interrogated by plotting each regressor relative to the fixation baseline (zero). The inclusion of a fixation baseline also allowed the estimation of HRF predictors for each of these conditions of interest for each group of participants.
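The logic of Monte Carlo cluster-size thresholding can be sketched in a deliberately simplified form. The version below works on a small two-dimensional grid of independent noise and omits the smoothing step, whereas the procedure described above operates on whole-brain maps and uses their estimated spatial smoothness; grid size, connectivity rule, seed and function names are all assumptions for illustration, and the resulting number has no bearing on the thresholds reported in Tables 3, 4.

```python
import math
import random


def max_cluster_size(active):
    """Size of the largest 4-connected cluster in a set of (row, col) cells."""
    seen, best = set(), 0
    for cell in active:
        if cell in seen:
            continue
        seen.add(cell)
        stack, size = [cell], 0
        while stack:
            r, c = stack.pop()
            size += 1
            for nb in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                if nb in active and nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        best = max(best, size)
    return best


def cluster_size_threshold(n_rows=32, n_cols=32, voxel_p=0.01,
                           n_iter=1000, alpha=0.05, seed=0):
    """Smallest cluster size reached by chance in at most `alpha` of null maps."""
    rng = random.Random(seed)
    maxima = []
    for _ in range(n_iter):
        # Unsmoothed null map: each cell passes the voxel-wise threshold
        # independently with probability voxel_p (simplifying assumption).
        active = {(r, c) for r in range(n_rows) for c in range(n_cols)
                  if rng.random() < voxel_p}
        maxima.append(max_cluster_size(active))
    maxima.sort()
    idx = min(n_iter - 1, math.ceil((1 - alpha) * n_iter))
    return maxima[idx] + 1  # clusters must exceed the null quantile


threshold = cluster_size_threshold()
```

With spatially smooth noise, as in real data, chance clusters are larger, so the threshold obtained from a smoothness-matched simulation would be correspondingly higher than in this toy version.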
Between-group comparisons for behavioral data were made using independent sample t-tests as appropriate. Within-group comparisons were evaluated using paired t-tests. Statistical significance was accepted at p < 0.05.
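For reference, the two kinds of test mentioned above can be written out as follows. This is a minimal sketch using only the Python standard library; the pooled-variance (equal variances) form of the independent-samples test is an assumption, since the text does not say which variant was used, and the function names are hypothetical.

```python
import math
from statistics import mean, variance


def independent_t(a, b):
    """Pooled-variance t statistic and degrees of freedom for two groups."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / math.sqrt(pooled_var * (1 / na + 1 / nb))
    return t, df


def paired_t(x, y):
    """Paired t statistic and degrees of freedom for within-group scores."""
    diffs = [xi - yi for xi, yi in zip(x, y)]
    n = len(diffs)
    t = mean(diffs) / math.sqrt(variance(diffs) / n)
    return t, n - 1
```

In the paired case the degrees of freedom equal the number of participants minus one, so a group of 12 yields t (11) and a group of 11 yields t (10).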

Behavioral Results
Table 2 shows the mean accuracy rate and standard deviation for each of the three groups in the past tense task. There were no differences in performance between the early and late high proficiency bilinguals. However, low proficiency bilinguals made significantly more errors in the task. Given that participants had to wait till the 2,000 ms silent period (after the word was presented for 3,000 ms) to provide their answer in the scanner, no reaction time data are provided here.
No significant difference in overall accuracy rate was found between the two high proficiency groups (EAHP = 90.94% and LAHP = 90.10%), despite one group learning their L2 significantly later than the other. The LAHP group performed significantly better overall than the LALP group [t (22) = 2.54; p = 0.02]. This difference was attributable to the performance difference in irregular rather than regular verbs. Analysis showed that LAHP performed significantly better on irregular verbs relative to LALP [t (22) = 2.40; p = 0.03] but not on regular verbs (p > 0.05).
All participants performed significantly better when inflecting regular verbs as compared to irregular verbs [EAHP: t (11) = 4.42; p = 0.001; LAHP: t (11) = 3.52; p = 0.005; LALP: t (10) = 5.49; p = 0.0003]. Regular and irregular verbs were controlled for word frequency and phonological complexity and so this finding is not attributable to regular verbs being more familiar or to them being less phonologically complex. Errors in irregular past tense forms tended to be overregularizations (e.g., providing "slided" as the past tense for "slide") or, in a very small number of cases, the wrong tense (e.g., providing the perfect past instead of simple past: "shrunk" for "shrink"). In a small number of other cases (for both regular and irregular verbs), no response was made. Errors and trials on which no response was made were modeled as a dummy variable for the imaging analysis.

Imaging Results
Our imaging findings show that both L2 AoA and proficiency influence brain activation, independently of each other.

EAHP vs. LAHP: How Does L2 AoA Influence Neural Activation in the Bilingual Brain?
To examine AoA effects, we compared brain regions activated by EAHP vs. LAHP bilinguals in three conditions: (i) during regular inflection (see Part 1 in Table 3); (ii) during irregular inflection (Part 2 in Table 3) and (iii) regularity × AoA (Part 3 in Table 3).
A whole brain analysis was applied and significantly activated regions in each of these conditions are reported in Table 3. Only those regions significantly activated above threshold for cluster size (at a corrected p < 0.05) are reported. Regions are arranged anterior to posterior and left regions are reported first. Significantly greater activation was found for LAHP bilinguals relative to EAHP bilinguals in both the regular and irregular conditions, respectively (1a and 2a in Table 3). EAHP bilinguals did not differ significantly from baseline in either condition and so no significantly activated regions are reported for this group. LAHP bilinguals showed activation across the left and right hemispheres for both regular and irregular inflection. During regular past tense generation, the LAHP group (vs. EAHP) showed greater activation in the left post central (2 regions), right middle and medial frontal gyri. During irregular past tense generation LAHPs showed greater activation (vs. EAHP) in the left and right inferior frontal gyrus (BA47), medial frontal gyrus (BA6), left inferior parietal lobe (BA40) and right middle frontal gyrus (BA9).
In the third condition (regularity × AoA), three regions including the left IFG (BA47) were significantly activated (see Part 3 Table 3). The frontal cortex has been acknowledged for its important role in language processing (see Li et al., 2014). We interrogated voxels at the peak of this frontal region to examine the nature of the interaction (Figure 1). The pattern of activation shown in Figure 1 suggests that LAHPs engage left BA47 to a similar extent for both regular and irregular generation (computational process and lexical access) while EAHPs activate this region for regular inflection but not for irregular inflection. When the peak voxels of other significantly activated regions were interrogated, a similar pattern was observed in other frontal regions such as the right medial frontal gyrus (BA6; 8, −4, 65; Figure 2), right IFG (BA47; 42, 41, 1), the right MFG (BA9; 45, 25, 32), and left insula (−37, 16, 0).

LAHP vs. LALP: How Does L2 Proficiency Affect Neural Activation in the Bilingual Brain?
Table 4 identifies regions found during (i) regular inflection only (Part 1); (ii) irregular inflection only (Part 2); and (iii) regularity × proficiency (Part 3). As in section "EAHP vs. LAHP: How Does L2 AoA Influence Neural Activation in the Bilingual Brain?" a whole brain analysis was applied and only regions significantly activated above cluster threshold size (p < 0.05, corrected) in each condition were reported.
Significant differences were found between LAHP and LALP bilinguals in several brain regions for both regular and irregular conditions. In each of these regions, LAHPs showed greater activation compared to LALPs (1a and 2a, Table 4). When compared to the LALP group, LAHP showed extensive right frontal activation (superior BA8; superior BA9; inferior BA 44/45; medial BA6; middle BA6) for regular inflection. Activation was more bilateral for irregular inflection; in the left hemisphere activation was found in frontal regions (superior BA9; medial BA9, inferior BA45; see Figure 3), several left cingulate regions (BA24; BA 31; BA32), two regions in the left precuneus and the left cerebellum. Right frontal regions were also seen (superior BA9; medial BA8), together with right parietal (inferior BA41; superior BA40), post central (BA3) and transverse temporal gyrus (BA41).
Finally, interactions were examined to ascertain whether or not there was a significant difference across high and low proficiency bilinguals when activation differences between regular and irregular generation in each group were compared (Part 3, Table 4). Significantly activated regions were mainly in the left hemisphere, including the left inferior frontal gyrus (BA44), the left middle frontal gyrus (BA9/46) and the left superior temporal gyrus (BA22). As before we interrogated the voxels at the peak in these areas to examine the nature of the interaction. We found that high proficiency bilinguals barely engaged the left IFG (BA44; see Figure 4) during regular or irregular inflection; in fact, this group demonstrated a small amount of deactivation in both conditions. Low proficiency bilinguals on the other hand deactivated this region for irregular generation but activated it for regular generation; this regular vs. irregular difference differed significantly (p < 0.05, corrected) from the absence of such a difference in the LAHP group.

DISCUSSION
This study set out to tease apart the effects of L2 AoA and proficiency on neural activation in bilinguals. We scanned three groups of bilinguals (each with a different combination of L2 AoA and proficiency but all matched for mode of L2 learning) while they generated English (L2) past tense verbs in the scanner. We experimentally manipulated these two variables so that each could be examined in isolation within the same study. We found that both AoA and proficiency influence L2 neural activation independently, replicating the finding in similarly designed studies that both factors play equally crucial roles in the bilingual brain (Wartenburger et al., 2003; Nichols and Joanisse, 2016). We discuss our main findings related to L2 AoA and L2 proficiency effects below, beginning with the former.
Firstly, our study showed that late bilinguals produce greater overall activation compared to early bilinguals when generating both regular and irregular past tense in their L2; this effect was independent of L2 proficiency. This result is consistent with Liu and Cao's (2016) meta-analysis of previous neuroimaging studies, which concluded that late high proficiency bilinguals show greater overall activation than early high proficiency bilinguals when processing their L2.
Our finding of significantly greater activation for late vs. early bilinguals during regular verb inflection (previously argued to reflect computational/grammatical processing; Oh et al., 2011; Ullman, 2016) also agrees with Wartenburger et al.'s (2003) study, which showed the same pattern when participants performed a grammatical judgement task. Interestingly, our findings differ in the realm of "semantic processing" (i.e., irregular verb inflection in our study). Wartenburger et al. (2003) reported that EAHP and LAHP groups did not differ in activation during the semantic judgement task and concluded there are no AoA effects for semantic processing. Our study however suggests that AoA effects are present across extensive brain regions during the production of irregular past tense verbs, which some models interpret as being a semantic or lexical access task (e.g., Ullman's DP model, 2001, 2004). In support of our findings, we note that Nichols and Joanisse (2016), who used a semantic-lexical task, similarly reported increased L2 activity relative to L1 in bilateral IFG as a function of later AoA. It is possible however that the difference between our studies and Wartenburger et al.'s may be due to task differences: while we used single-word stimuli (Nichols and Joanisse's task was a picture-word matching task), Wartenburger et al. (2003) used a semantic judgement task involving sentences and therefore more processes in addition to lexical retrieval. Perhaps it is the case that when sentence-level processing is measured, AoA and proficiency begin to play more specialized roles, such as AoA affecting grammatical processing and proficiency affecting semantic processing. This is a line of investigation that requires further exploration.
There are at least two extant explanations as to why activation differences may exist between early and late language learning. One is to do with automaticity, which has been associated with decreased activity in the inferior frontal cortices and right middle frontal gyrus (Poldrack et al., 2005). Another proposed reason for this observed difference in neural activation is that it reflects maturational changes in neural plasticity when learning of language happens later, as compared to first language acquisition (Perani et al., 1996; Weber-Fox and Neville, 1996; Wartenburger et al., 2003; Mechelli et al., 2004; Abutalebi, 2008; Pakulak and Neville, 2011; Klein et al., 2014; Li et al., 2014). We speculate that our early bilinguals had a greater degree of L2 automaticity by virtue of having learnt it early in life and possibly because the learning coincided with a time of greater plasticity for language learning. We return to this question of L2 automaticity below.
Our second finding was that late bilinguals who are highly proficient are also distinguishable from their early counterparts by how they engage the left prefrontal cortex (PFC) during irregular past tense inflection. As reported above, we observed a significant interaction between AoA and regularity (i.e., regular vs. irregular inflection) in the left IFG (BA47). Interrogation of peak voxels revealed that while both groups of bilinguals engaged the left IFG during regular inflection, only the late bilinguals continued to engage this region during irregular inflection (Figure 1). This was a recurring pattern in other significantly activated frontal regions as well (see EAHP vs. LAHP: How Does L2 AoA Influence Neural Activation in the Bilingual Brain?; Figure 2). We suggest that early high proficiency bilinguals use frontal regions in a more task-specific way, i.e., there is selective activation depending on whether they are inflecting regular (computational process) or irregular (lexical retrieval) verbs. This involvement of the left IFG in regular but not irregular verb inflection for early L2 learners is consistent with the Declarative Procedural model's prediction that early L2 grammar acquisition is similar to L1 grammar acquisition (Ullman, 2001, 2004, 2005). We have previously shown that the left IFG (BA47; −31, 24, 3) was significantly more activated when monolingual native English speakers were generating regular (compared to irregular) past tense verbs (Oh et al., 2011). The pattern shown by our early bilinguals in the current study is therefore similar to that shown by the L1 English speakers in that study.
Our third finding concerns L2 proficiency effects: we found that L2 proficiency affects neural activation independently of the age at which our participants began learning their L2. Late L2 high proficiency bilinguals (LAHP) produced greater overall activation than late L2 low proficiency bilinguals (LALP) in both the regular and irregular generation conditions. LAHP bilinguals activated right frontal areas (superior, middle, medial and inferior frontal) more than their LALP counterparts while generating regular past tense verbs. When generating irregular past tense verbs, LAHP (>LALP) bilinguals showed more bilateral activation (especially in the superior frontal and postcentral gyri); regions such as the left IFG (BA45) were found too (Table 4; Figure 3). This finding fits with other neuroimaging studies, both structural and functional, that have shown increased volume or activation in the left inferior frontal gyrus with better proficiency. For example, various structural studies (Mårtensson et al., 2012; Stein et al., 2012, 2014) have reported that changes in L2 proficiency are positively related to volume increase in the left IFG. Functionally, activation in the left IFG has been seen to increase with increased proficiency, for example in the learning of language-like rules in an artificial grammar (Opitz and Friederici, 2003). Other studies have likewise argued either for decreased activation in less fluent bilinguals or for increased activation in some regions to be positively correlated with increased proficiency, possibly in earlier stages of L2 learning (Hasegawa et al., 2002; Sakai et al., 2004; Xue et al., 2004; although see Saur et al., 2009 for a different view; Li et al., 2014; Nichols and Joanisse, 2016). One of these studies (Sakai et al., 2004) also used the past tense task to examine the functional role of the IFG and its relationship to proficiency in L2 learners. Like us, they found that high (vs. low) proficiency bilinguals showed increased activation in this area (their coordinates were close to ours) during irregular inflection. The inferior parietal lobe (Chou et al., 2006; Booth et al., 2008) is another region in which increased activation has been reported as a function of increased skill and development (Hernandez et al., 2015). Besides increased activation in the IFG, LAHP bilinguals in our study showed increased activation in the right inferior parietal lobe relative to LALP bilinguals.
We might ask why both early L2 high proficiency (EAHP) and late low proficiency (LALP) bilinguals show less extensive activation than LAHP bilinguals. We propose that in the case of EAHP bilinguals, the relatively lower activation is due to the language task being more automatic for them and therefore requiring less neural effort. For LALP bilinguals, the relatively lower activation likely has a different cause: they may have found the task much harder, perhaps too hard, leading them to disengage or not engage adequately, especially given that the task was timed. This possibility requires further study.
Returning to the question of automaticity, Jeon and Friederici (2015) propose that the degree of automaticity could be the critical factor in the functional organization of the prefrontal cortex. According to their account, anterior prefrontal regions (e.g., BA47, BA45) are involved in processes that require more control and have a low degree of automaticity (such as an L2). Conversely, more posterior prefrontal regions (e.g., BA44) are associated with processes that have a high degree of automaticity (e.g., L1). Put another way, adult native language processing, which is considered to be highly automatic, relies on BA44. In children, however (and by extension L2 learners), language learning is still in "development" and there is greater reliance during this period on BA45, a more anterior part of the IFG (Hahne et al., 2004; Vissiennon et al., 2017).
Our findings provide support for such a view. The greater overall activation for LAHP (vs. EAHP) bilinguals in left anterior prefrontal regions (e.g., BA47) suggests that while our LAHP bilinguals were highly proficient and indistinguishable from EAHP bilinguals on past tense accuracy, their late acquired language processing was less automatic than that of their early counterparts. Similarly, we found significant activation in left BA45 (an anterior PFC region) for LAHP bilinguals (this time in relation to LALP bilinguals). We did not, however, find this pattern of activity in the more posterior region of left PFC (BA44). BA44, which showed differences in the LAHP vs. LALP contrast, showed very little activation (relative to baseline activation) for the LAHP group during regular or irregular verb inflection. Our previous findings with this same task in L1 native speakers (Oh et al., 2011) showed that BA44 was significantly activated for both regular and irregular verb inflection conditions. If a more automatic L1 activates this region more than a highly proficient but less automatic L2, then this supports the anterior-posterior automaticity gradient in PFC proposed by Jeon and Friederici (2015).

CONCLUSION
In summary, our study found that L2 AoA and proficiency each independently influence L2 functional activation. We saw L2 AoA affect neural activation in two ways. First, late L2 learning was associated with greater overall activation; this was interpreted as reflecting a lesser degree of automaticity in L2 processing even though high proficiency levels had been attained. Second, early L2 bilinguals used the left IFG (BA47) selectively depending on whether they were generating regular or irregular past tense verbs. The activation patterns of these early L2 learners were similar to those of L1 native speakers generating the past tense (Oh et al., 2011). In terms of L2 proficiency effects, our study confirmed that high L2 proficiency was associated with increased activation in a number of regions, including bilateral frontal regions and the left IPL. This was not always the case in the left IFG, however, in particular in BA44. Increased activation in left BA45 but not BA44 is a pattern that is seen in the "developmental" stage of language learning in children or when processing is controlled rather than automatic (Jeon and Friederici, 2015; Vissiennon et al., 2017). We propose that the pattern of frontal activation we saw in our LAHP bilinguals was a function of the degree of automaticity in L2.

FIGURE 1 | Activation differences in BA47 between the early (EAHP) and late (LAHP) bilinguals for the Regularity × AoA condition. Significant activation is attributed to: (i) the difference between EAHP and LAHP for irregular inflection (left of the graph) and (ii) the difference between regular and irregular inflection for the EAHP bilinguals (see the lighter bars). Activated regions are those activated above the indicated threshold (p < 0.05, corrected).

FIGURE 4 | Activation differences in BA44 between the high proficiency (LAHP) and low proficiency (LALP) late bilinguals in the Regular × Proficiency condition. Activated regions are those activated above the indicated threshold (p < 0.05, corrected).
EAHP, Early high proficiency bilinguals; LAHP, Late high proficiency bilinguals; LALP, Late low proficiency bilinguals. *All but two of the participants in this group were in their early 20s; one participant was 53 years old, the other 35. **All participants were in their early to mid 20s except for one, who was also 53.

TABLE 2 | Behavioral results, participants producing the past tense of presented verbs.

TABLE 3 | Regions activated by the EAHP vs. LAHP groups in the (1) Generate Regular Past Tense, (2) Generate Irregular Past Tense, and (3) Contrast between Irregular and Regular Tense generation tasks, respectively.

TABLE 4 | Regions activated by the LAHP vs. LALP groups in the (1) Generate Regular Past Tense, (2) Generate Irregular Past Tense, and (3) Contrast between Irregular and Regular Tense generation tasks, respectively.
each of the factors affects neural activation is reported below; the first section (EAHP vs. LAHP: How Does L2 AoA Influence Neural Activation in the Bilingual Brain?) describes how early