Language Familiarity and Proficiency Leads to Differential Cortical Processing During Translation Between Distantly Related Languages

In the midst of globalization, English is regarded as an international language, or lingua franca, but learning it as a second language (L2) remains difficult for speakers of other languages. This is especially true for speakers of languages distantly related to English, such as Japanese. Exploring the neural basis of translation between the first language (L1) and L2 is therefore of great interest. A relatively large number of previous studies have revealed brain activation patterns during translation between L1 and English as L2. These studies, which focused on translation between languages with a close or moderate linguistic distance (LD), have suggested that Broca's area (BA 44/45) and the dorsolateral prefrontal cortex (DLPFC; BA 46) may play an important role in translation. However, the neural mechanism of translation between Japanese and English, which have a large LD, has not been clarified. Thus, we used functional near-infrared spectroscopy (fNIRS) to investigate brain activation patterns during word translation between Japanese and English. We also assessed the effects of translation direction and word familiarity. All participants were native speakers of Japanese who were learning English, and their English proficiency was either advanced or elementary. We selected English and Japanese words as stimuli based on their familiarity for Japanese people. Our results showed that brain activation patterns during word translation differed largely depending on English proficiency. The advanced group showed greater activation in the left prefrontal cortex around Broca's area while translating words with low familiarity, but no activation was observed while translating words with high familiarity. In contrast, the elementary group showed greater activation in the left temporal area, including the superior temporal gyrus (STG), irrespective of word familiarity. These results suggest that different cognitive processes may be involved in word translation depending on English proficiency in Japanese learners of English. The differences in brain activation patterns between the advanced and elementary groups may reflect differences in cognitive load that depend on the level of automatization in language processing.

INTRODUCTION
In the midst of globalization, English is regarded as an international language, or lingua franca (Seidlhofer, 2005; Crystal, 2012; Kirkpatrick, 2012), with the number of English speakers worldwide exceeding 2 billion (Crystal, 2008). However, the English proficiency of Japanese learners is fairly low in comparison to that of English learners in other nations, as indicated by scores on English tests including the Test of English for International Communication (TOEIC; Educational Testing Service, 2020b), the International English Language Testing System (IELTS; IELTS Partners, 2020), and the Test of English as a Foreign Language (TOEFL; Educational Testing Service, 2020a).
The difficulty that Japanese speakers have in handling English as a second language (L2) may be associated with linguistic factors. Japanese, their first language (L1), is the most distant from English in terms of linguistic distance (LD), which is mainly based on morphological, phonological, and syntactic elements (Chiswick and Miller, 2005). LD to English ranges from the lowest score (hardest to learn) of 1.00 for Japanese to the highest score (easiest to learn) of 3.00 for Afrikaans, Norwegian, and Swedish. The LD score is determined by the ease or difficulty that Americans have in learning different foreign languages, and it corresponds fairly well with differences in foreigners' ease or difficulty in learning English. Processing an L2 with a large LD is not easy for L1 speakers. However, little is known about the cognitive aspects of processing a distantly related L2. One possible approach is to understand the neural basis of L2 handling by linguistically distant L1 speakers. In particular, we focused on exploring the neural basis of translation because it is an indispensable part of L1 speakers' handling of L2. Before exploring specific aspects of L2 with a large LD, we will first introduce existing models of the word production (output) system underlying translation for L2 speakers in general. We will then review important behavioral and neuroscience experiments on translation conducted for L2 speakers in general. Finally, we will interpret the results of these experiments from a cognitive processing perspective.
The bilingual lexico-semantic system is an analytical cognitive model of L2 speakers' second language acquisition of words themselves and their meanings (Votaw, 1992). The system consists of several distinct elements: how the word looks (orthography), how it sounds (phonology), what it means (semantics), what syntactic properties it has (lemmas), and how it is pronounced (an output system that specifies the pronunciation of word forms) (Patterson et al., 1987; Indefrey and Levelt, 2000; Meyer et al., 2016). The bilingual lexico-semantic system is known to support a variety of linguistic activities such as reading, speaking, and switching between languages in translation in other (second) languages (Votaw, 1992; Price et al., 1999). Particularly, word translation by L2 speakers requires the speaker to generate the translation equivalent of the presented word rather than to merely name it (Green, 1986). In addition, these cognitive operations are assumed to be accomplished by modulating the activation of the language system (Grosjean, 1997; Paradis, 1997) with the inhibition control system, which is described under the scheme of the inhibition control (IC) model (Green, 1998; Ong and Zhang, 2010). This model sets and maintains the target, avoids naming words in L1, and instead produces the equivalent translation as a response. Therefore, it is assumed that the bilingual lexico-semantic system works accurately when the inhibition control system (Green, 1998; Ong and Zhang, 2010) adequately controls language processing. Moreover, psycholinguistic data emphasize two different routes for translation (Kroll and Stewart, 1994; Kroll and De Groot, 2002; Duyck and Brysbaert, 2008): a non-semantic direct route (lexical route) in which the word forms of translation equivalents are linked at the lemma level (Jescheniak and Levelt, 1994) and an indirect route (semantic route) in which they are connected via their meaning (i.e., their lexical concepts).
According to the IC model, word selection along either route involves lemma activation and the inhibition of lemmas with a non-target language tag. The involvement of these two routes is thought to differ depending on the direction of word translation (L1-into-L2 or L2-into-L1) (Jescheniak and Levelt, 1994; Price et al., 1999). In L1-into-L2 translation the semantic route is dominant, whereas in L2-into-L1 translation the lexical route is dominant, reflecting the acquisition of the L2 word in the context of a pre-existing lexical concept-word form link in L1 (Price et al., 1999). In fact, Kroll and Stewart (1994) suggested through experimental studies that L1-into-L2 translation may produce more semantic processing than L2-into-L1 translation does. Thus, it is of great importance to explore the neural basis of translation by examining cortical activation patterns in both directions, L1-into-L2 and L2-into-L1.
Several behavioral experiments have used word translation tasks. De Groot and Poot (1997) examined the performance of balanced bilinguals translating one set of words from L1 (Dutch) into L2 (English) and vice versa. The LD between Dutch and English is known to be close, scored as 2.75 (Chiswick and Miller, 2005). Reaction times for word translation from L1 into L2 were longer than those from L2 into L1, and error rates were high while translating L1 into L2. Kroll et al. (2010) also conducted a similar experiment in which balanced bilinguals translated simple L1 (English) sentences into L2 (French, with a LD of 2.5) and vice versa (Chiswick and Miller, 2005). Their results were mostly in line with those of De Groot and Poot (1997), replicating the longer reaction times and higher error rates while translating L1 into L2 than L2 into L1. These behavioral studies suggest that translation from L1 into L2 is more cognitively demanding than translation from L2 into L1. Moreover, considering that these experiments were conducted with balanced bilinguals, it is also suggested that the mental lexicon in L2 may be smaller than that in L1 regardless of the level of bilingualism (De Groot and Poot, 1997; Kroll et al., 2010).
With advancements in functional brain imaging, many studies have started to focus on brain activation patterns during translations between L1 and L2. Many of these studies recruited balanced bilinguals and examined brain activities during translation between languages with close LDs. Most studies performed thus far used PET (positron emission tomography), which is invasive in terms of the intake of radioactive substances, but is relatively unrestrictive regarding body motion and language-related behaviors, and thus is suitable for functional neuroimaging during translation. On the other hand, probably due to technical constraints, fMRI (functional magnetic resonance imaging) has not yet been applied directly for neuroimaging examination of bidirectional translation between L1 and L2, to our knowledge. Rather, fMRI has been used to reveal the cognitive mechanisms behind more fundamental processes of translation, such as the learning process of unknown L2 words (Mayer et al., 2015) and judging the correctness of translated texts (Lehtonen et al., 2005). Fortunately, Hervais-Adelman et al. (2015) examined the neural basis of translation with a focus on language translation from L1 to L2 only. They aimed to clarify how multilinguals who had a high level of language proficiency in at least three languages exhibited brain activation during simultaneous interpretation of L1 (their most fluent language: English or French) to L2 (9 target languages such as French, Spanish, Italian, and German). As a result, they confirmed the involvement in the translation of the anterior portion of Broca's area (BA 45). This finding cannot be discussed from a LD-based perspective (Chiswick and Miller, 2005) because participants did not necessarily translate English as the L2, but it is important in clarifying the neural basis of translation. 
There are also studies showing that the functional connectivity of the brain differs between L1-into-L2 and L2-into-L1 translation (Zheng et al., 2020). Zheng et al. (2020) demonstrated that functional connectivity between a core semantic hub (the left anterior temporal lobe, ATL) and key nodes of attentional and vigilance networks (left inferior frontal, left orbitofrontal, and bilateral parietal clusters) increased during L1-into-L2 translation, whereas functional connectivity was observed only between the left ATL and the right thalamus, regions implicated in the automatic relaying of sensory information to cortical regions, during L2-into-L1 translation. These results may imply that enhanced functional connectivity between semantic and attentional mechanisms is involved during L1-into-L2 translation (Zheng et al., 2020). The finding in Zheng et al. (2020) is consistent with the assumption in the IC model that two different routes are involved depending on the direction of word translation (L1-into-L2 or L2-into-L1) (Jescheniak and Levelt, 1994; Price et al., 1999).
Some PET studies have examined brain activation during bidirectional language translation between L1 and L2 directly, and we will review them in detail here. Klein et al. (1995) used PET to investigate brain activation patterns during a word translation task between French and English with a close LD of 2.50 (Chiswick and Miller, 2005). Participants whose L1 was English but were also proficient in French (L2) translated L1 into L2 and vice versa. While translating L1 into L2, the left frontal ventrolateral cortex (BA 10/47), the left dorsolateral cortex (BA 8), the left temporal inferotemporal cortex (BA 37/20), the left parietal cortex (BA 7), and the cerebellum (Vermis) were activated. While translating L2 into L1, the left frontal ventrolateral cortex (BA 10/47; BA 9/46), the left dorsolateral cortex (BA 8), the left temporal inferotemporal cortex (BA 37/20), the left parietal cortex (BA 7), the cerebellum (right), and the thalamus/pulvinar were activated. Price et al. (1999) examined brain activities during translation between German (L1) and English (L2), having a close LD of 2.25 (Chiswick and Miller, 2005), on balanced bilinguals using PET. While translating both L1 words into L2 and vice versa, the left anterior cingulate, the left supplementary motor area and the left medial fusiform, the bilateral subcortical structures, the anterior insula, and the cerebellum were activated. Quaresima et al. (2002) examined brain activation while balanced bilinguals of Dutch (L1) and English (L2), with a close LD of 2.75 (Chiswick and Miller, 2005), translated easy sentences from L1 into L2 and vice versa, using fNIRS (functional near infrared spectroscopy), which offers non-invasive hemodynamic assessment in a natural environment, and thus is useful for this purpose. Among the lateral frontal and temporal regions covered in the fNIRS measurement, the left cortical area surrounding Broca's area (BA 44/45) was activated irrespective of translation direction.
In addition, there are a few studies focusing on brain functions during translation for English learners whose L1 is moderately distant from English. In a PET study, Rinne et al. (2000) examined brain activation of professional interpreters during translation between Finnish (L1) and English (L2), which have a moderately close LD score of 2.0 (Chiswick and Miller, 2005). Activation patterns were asymmetric with respect to the direction of translation. While translating L2 into L1, activations of the left ventrolateral frontal cortex (BA 46) and the left premotor cortex (BA 6) were observed. On the other hand, while translating L1 into L2, the left ventrolateral frontal cortex (BA 45), the left inferior temporal cortex (BA 20/28), the left premotor cortex (BA 6), and the cerebellum were activated.
To summarize the major functional neuroimaging studies on translation presented above, various regions were activated while translating from L1 into L2 and vice versa. Moreover, the brain activation patterns differed depending on translation direction. Although activation patterns during translation differed across studies, the area surrounding the left prefrontal cortex, such as the left ventrolateral frontal cortex involved in Broca's area and the left dorsolateral prefrontal cortex (DLPFC), was activated consistently. This applied not only to the studies focusing on language translation with close LDs (Klein et al., 1995; Price et al., 1999; Quaresima et al., 2002) but also to those with moderate LDs (Rinne et al., 2000). Broca's area, in particular, has been reported to be active regardless of the direction of translation (L1-into-L2 and L2-into-L1) in a study focusing on translation in both directions (Quaresima et al., 2002). This region is responsible for retrieving linguistic information (Klein et al., 1995) and is also related to verbal working memory (Paulesu et al., 1993), morphosyntactic processing (Laine et al., 1999), and semantic analysis (Cabeza and Nyberg, 1997). The left DLPFC plays an important role in working memory associated with translation (Klein et al., 1995) and in language encoding and semantic processing (Rinne et al., 2000). These frontal regions are more widely activated during L1-into-L2 translation (Rinne et al., 2000). In addition, left inferior temporal activation was observed in Klein et al. (1995) and Rinne et al. (2000). This region belongs to the so-called 'basal temporal language area', which has been related to word-finding (Lüders et al., 1991; Damasio et al., 1996) and semantic processing (Vandenberghe et al., 1996; Seghier and Price, 2012).
These temporal regions are thought to be primarily responsible for the semantic processing of language during translation (Klein et al., 1995; Rinne et al., 2000).
The functional meaning of these brain regions is consistent with the mental representational model of second language acquisition. That is, these areas are involved in both word production and word perception (Lüders et al., 1991; Indefrey and Levelt, 2000; Indefrey and Levelt, 2004; Hamberger and Cole, 2011), and are therefore likely to be active in common even between languages with a close or moderate LD. On the other hand, the widespread activation including the temporal region during L1-into-L2 translation may reflect the dominance of the semantic route (Jescheniak and Levelt, 1994; Price et al., 1997). Thus, it is likely that the left prefrontal cortex and surrounding area are the regions generally involved in language translation, and that other regions might be differentially recruited depending on differences in LD and on the direction of translation.
Although these findings provided valuable insights into the cognitive processes underlying L2 handling, there are limitations to applying them to the cognitive processes of Japanese speakers handling English, a most distantly related language with a LD of 1.0. First, previous studies have mainly been conducted on balanced bilinguals who could effortlessly translate L1 into L2 and vice versa. Because their performance is not expected to be similar to that of Japanese learners of English, it is unclear whether the brain activation patterns observed in previous studies are also applicable to the language translation process of Japanese learners. Second, those previous studies focused on language translation between English and other languages whose LD is close or moderate. The LD between Japanese and English is the most distant, along with that between Korean and English (Chiswick and Miller, 2005). In fact, it has been shown that differences in LD produce different patterns of brain activation during language processing, such as sentence comprehension (Jeong et al., 2007). Accordingly, the results of previous studies might not be directly applicable to translation between Japanese and English.
Therefore, in the current study, we aimed to investigate brain activation patterns while Japanese learners of English translated Japanese words into English and vice versa. In so doing, we had to take the following issues into consideration. First, the large LD, which entails difficulty in L2 learning, gives rise to a wide range of proficiency levels among Japanese learners of English. Since the level of English acquisition may affect brain activation patterns during translation, we examined both advanced Japanese learners of English, who might easily translate L1 into L2 and vice versa, and elementary learners, who might not. Second, it is often too difficult for elementary-level English learners to translate Japanese sentences into English and vice versa. Thus, we adopted word translation, as vocabulary knowledge is indispensable for acquiring L2 and allows the measurement of individual English skills (Laufer and Nation, 1999). Third, we had to consider the familiarity issue. When adopting L1 and L2 words as stimuli, it might be difficult to distinguish whether the observed cognitive reactions are attributable to qualitative differences between languages or to quantitative differences in cognitive load. Thus, in order to examine the effects of word familiarity, we adopted high- and low-familiarity L1 and L2 words as stimuli.
Language translation is a linguistic activity that is commonly practiced on a daily basis in an environment where a second language is used. Thus, it is desirable to measure brain activation while translating in a less restrictive environment that is as close as possible to normal daily life. Although most previous studies used PET and a large body of linguistic studies used fMRI, participants in those studies performed translation in a rather restricted and unfamiliar environment. In contrast, fNIRS can measure brain activation patterns by simply placing probes on the head under conditions close to everyday life, such as participants having freedom of movement, and was proven to be useful in a pioneering study on translation by Quaresima et al. (2002). fNIRS has been successfully adopted in other language-related studies including language acquisition (Obrig et al., 2010, 2017; Homae et al., 2011; May et al., 2018; Sugiura et al., 2018), speech perception (Minagawa-Kawai et al., 2002, 2004, 2007), and speech comprehension (Lei et al., 2018). Hence, we used fNIRS to measure brain activation during translation of Japanese (L1) and English (L2) words, taking into consideration translation direction and word familiarity as within-subject factors in both high- and low-proficiency English learners.

Participants
Forty-three healthy right-handed Japanese young adults (23 males and 20 females, mean age 20.81 ± 1.37 years, age range 18-25) participated in this study. All participants had taken the TOEIC® Listening and Reading test within the past year. TOEIC® is the most widely used standardized examination, with over two million test takers per year, and its reliability and validity have been reported to be sufficient by Lawson (2008). Participants who received a score of over 730 points were assigned to the advanced group and participants who received a score below 470 points were assigned to the elementary group based on the TOEIC® official standard (Educational Testing Service, 2020b). This official standard indicates that those who scored 730 points or more "have the ability to communicate appropriately in any situation" or "can communicate adequately at a similar level to a native speaker," while those who scored 470 or less "have a minimum level of communication in a daily conversation" or "cannot communicate at all." Among the initial 43 participants, three were excluded from the data analysis. One misunderstood the instructions, another was excluded due to instrumental trouble during the fNIRS experiment, and the third was identified as left-handed based on the Edinburgh inventory (Oldfield, 1971). The remaining participants consisted of 21 in the advanced-level group and 19 in the elementary-level group. The average score was 826.36 ± 67.93 (max: 975, min: 740) in the advanced group and 377.50 ± 69.80 (max: 460, min: 225) in the elementary group.
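The group assignment rule described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' procedure: the function name `assign_group` and the constant names are hypothetical, and treating the boundary scores (exactly 730 or exactly 470) as inclusive is an assumption, because the text uses both "over 730" and "730 points or more".

```python
from typing import Optional

# Cut-offs taken from the text; inclusive boundaries are an assumption.
ADVANCED_MIN = 730
ELEMENTARY_MAX = 470

def assign_group(toeic_score: int) -> Optional[str]:
    """Return 'advanced', 'elementary', or None for scores between the cut-offs."""
    if toeic_score >= ADVANCED_MIN:
        return "advanced"
    if toeic_score <= ELEMENTARY_MAX:
        return "elementary"
    return None  # intermediate scores belong to neither group

# The reported minimum and maximum scores of each group fall on the expected side.
for score in (975, 740, 460, 225, 600):
    print(score, assign_group(score))
```

Scores between the two cut-offs map to `None` here, which reflects that the study recruited only the two extreme proficiency bands.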
The experimental protocols were approved by the Institutional Review Board (IRB) of Chuo University and were in accordance with the Declaration of Helsinki guidelines. Written informed consent was obtained from all participants in advance.

Stimuli and Experimental Design
In this study, participants were asked to perform a word translation task between Japanese and English as quickly as possible. As Figure 1 shows, the stimuli in this experiment were divided into three types of blocks, namely non-translation baseline blocks, English-into-Japanese task blocks, and Japanese-into-English task blocks. There were four task conditions in the task blocks: translation direction (English-into-Japanese/Japanese-into-English) × familiarity (high/low familiarity). All participants were required to answer by typing the spelling of the words in English or in Japanese using Roman letters on a keyboard. Japanese people habitually use Roman letters when typing Japanese words. For this reason, we considered that typing English letters and typing Japanese in Roman letters were comparable. In baseline blocks, participants were asked to transcribe Japanese words written in black into Roman letters without translating them (e.g., " " to "heisei" in Roman letters). In Japanese-into-English task blocks, they were asked to translate Japanese words written in red into the corresponding English words and to type them [e.g., " " in Japanese Kanji character(s) to "car" in English]. In English-into-Japanese task blocks, they were asked to translate English words written in red in Roman letters into the corresponding Japanese words and to type them in Roman letters (e.g., "world" to "sekai" in Roman letters). In all blocks, the participants were asked to press the "SPACE" bar immediately after they produced the translated or control word in their mind, type it on the keyboard, and finally press the "ENTER" key immediately after typing the word. If the participants did not produce the translated word, the next trial stimulus appeared on the computer monitor after 5 s.
In baseline blocks, the number of stimulus presentations was randomized, with words appearing four or five times on the computer monitor, to avoid prediction of the timing of the subsequent trial. In task blocks, the number of stimulus presentations was three. The inter-stimulus interval lasted 2 s.
Response time and response accuracy were obtained while the participants conducted word translation. For answers that were otherwise correct, we scored typing errors of two or more letters in a single word as incorrect, but an error of a single letter as correct (e.g., mistyping "money" as "mooney" would be considered correct). Whether the answers included typing errors was judged through independent visual examination by three raters (KN, KO, and ToT). For stimulus presentation and response recording, we used the Psychophysics Toolbox (Brainard, 1997; Pelli, 1997; Kleiner et al., 2007), which operated in a Matlab (Mathworks, Natick, MA, United States) environment. Responses in the word translation task on the computer were temporally synchronized with the fNIRS recordings through a serial port to record hemodynamic responses. Specific Japanese and English word stimuli were selected based on the following considerations. First, we set the word stimuli to comprise only nouns because verbs, adjectives, or other parts of speech tend to be polysemous, possibly confusing participants in grasping the meanings of the presented words. Second, we set the Japanese word stimuli to be presented visually in Kanji, or Chinese characters, based on consultation with two professional simultaneous interpreters, who suggested that Japanese words have many homonyms and are more likely to cause confusion when presented auditorily. Accordingly, English words were also presented visually.
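The one-letter tolerance for typing errors described above resembles an edit-distance threshold. The following Python sketch illustrates that idea under stated assumptions; it is not the procedure used in the study, where the judgments were made by visual examination by three raters. The function names `edit_distance` and `is_within_tolerance` are hypothetical, and counting a swap of two adjacent letters as two errors follows from using the Levenshtein distance, not from any rule stated in the text.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of single-letter
    insertions, deletions, or substitutions turning a into b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ch_a in enumerate(a, start=1):
        curr = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            curr.append(min(prev[j] + 1,          # delete ch_a
                            curr[j - 1] + 1,      # insert ch_b
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]

def is_within_tolerance(response: str, target: str, max_errors: int = 1) -> bool:
    """True if the typed response differs from the target by at most max_errors letters."""
    return edit_distance(response.strip().lower(), target.strip().lower()) <= max_errors

print(is_within_tolerance("mooney", "money"))   # one extra letter
print(is_within_tolerance("moonney", "money"))  # two extra letters
```

With the example from the text, "mooney" is one insertion away from "money" and is accepted, whereas a response with two extra letters is rejected. A check like this only covers spelling; deciding whether a response is an acceptable translation at all still requires human judgment.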
The stimuli in this study were chosen on the basis of word familiarity for both Japanese and English words. This is because the most frequently used corpus, the British National Corpus (BNC), is based on word frequencies in English produced by British English speakers, and was therefore not suitable as a source of word translation stimuli for Japanese participants (British National Corpus, 2007). Thus, we utilized the NTT Psycholinguistic Databases "Lexical Properties of Japanese" for the Japanese stimuli (Amano and Kondo, 1998) and English word familiarity ratings among Japanese for the English stimuli (Yokokawa et al., 2007). Both databases were based on familiarity ratings by Japanese people, for Japanese and English words, respectively. Word familiarity in both English and Japanese ranges from 1.0 to 7.0, with 7.0 being the most familiar and 1.0 being the least.
Further, we utilized three English-Japanese dictionaries, namely the online Cambridge dictionary (Cambridge University Press, 2020), the OLEX English-Japanese Dictionary (Nomura et al., 2016), and the Genius English-Japanese Dictionary (Konishi and Minamide, 2001), to confirm whether the primary meaning of each selected noun was the same across the three dictionaries. In addition, we arranged for the visually presented Japanese words in Kanji, or Chinese characters, to be included, when necessary, in the specific set of basic Kanji, "Joyo-Kanji," which consists of 2135 characters intended for daily use (Agency for Cultural Affairs, 2010). For English words, the number of syllables was set from one to three. For Japanese words, the number of morae was set from two to six. This was to enable participants to answer the questions (they were asked to translate Japanese/English words and type the spelling) within the limited time. We regarded two morae to be equivalent to one syllable as per Kubozono (1989).
Finally, to select high- and low-familiarity words in both Japanese (L1) and English (L2), we generated composite familiarity scores by adding the familiarity scores from the two databases (Amano and Kondo, 1998; Yokokawa et al., 2007). The 92 words with the highest composite scores and the 92 words with the lowest composite scores were selected as high- and low-familiarity words, respectively. The mean familiarity was 6.19 for high-familiarity words and 4.40 for low-familiarity words, a significant difference [t(182) = 41.93, p < 0.01, d = 3.11]. In addition, we selected 147 relatively common Japanese words as baseline words from Amano and Kondo (1998). Combinations of Kanji with Katakana or Hiragana characters (e.g., the words for "parenting" and "bronze medal") were excluded from the baseline word sets. All baseline words were written in Kanji characters, like the task words. The mean word familiarity of the baseline words was 6.02. No stimulus words overlapped between the baseline and task words.

FIGURE 1 | The structure of the word translation task paradigm. (A) The word translation task paradigm consisted of baseline and task blocks. There were four types of task blocks arising from combinations of translation direction (English into Japanese/Japanese into English) and familiarity (high/low). (B) For each task block, a fixation point was presented for 200 ms. Then, a stimulus word in English or Japanese was presented in the center of the display. When English words were presented, participants were asked to translate them and type the corresponding Japanese words in Roman letters. When Japanese words were presented, participants were asked to translate them and type the corresponding English words.
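The selection by composite familiarity can be illustrated with a minimal sketch. The word list and ratings below are hypothetical toy values, not entries from either database, and the function name `select_by_composite` is introduced here only for illustration.

```python
def select_by_composite(ratings, n):
    """Rank items by the sum of their L1 and L2 familiarity ratings.

    ratings: dict mapping an item label to (l1_familiarity, l2_familiarity),
             each on the 1.0-7.0 scale described in the text.
    Returns the n highest-scoring and the n lowest-scoring items.
    """
    composite = {item: l1 + l2 for item, (l1, l2) in ratings.items()}
    ranked = sorted(composite, key=composite.get, reverse=True)
    return ranked[:n], ranked[-n:]

# Hypothetical ratings for illustration only.
toy_ratings = {
    "water": (6.6, 6.5),
    "school": (6.4, 6.3),
    "bridge": (5.9, 5.2),
    "harvest": (5.5, 4.1),
    "anchor": (5.0, 3.6),
    "quarry": (4.2, 2.9),
}
high, low = select_by_composite(toy_ratings, n=2)
print(high, low)
```

In the study itself, n would be 92 and the ratings would come from the two cited databases.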

Data Acquisition
During the word translation task, we recorded hemodynamic responses using fNIRS. We used a 52-channel continuous-wave system (ETG-4000, Hitachi, Japan). Optical data from individual channels were collected at two wavelengths, 695 and 830 nm, and analyzed using the modified Beer-Lambert law (Delpy et al., 1988). Changes in the oxygenated hemoglobin (oxyHb) and deoxygenated hemoglobin (deoxyHb) signals were calculated in units of millimolar × millimeter (mM × mm) (Maki et al., 1995). The sampling rate was set to 10 Hz. The probes were fixed using one 9 × 34 cm rubber shell over the frontal and temporal areas (Figure 2), following previous studies (Niioka et al., 2018; Kawabata Duncan et al., 2019). The shell held 33 probes in a 3 × 11 array, with 17 emitters and 16 detectors, which allowed us to measure the relative concentration of hemoglobin at 52 channels. We defined the midpoint of each pair of illuminating and detecting probes as a channel location, and we defined channel locations in accordance with the international 10-20 system for EEG (Klem et al., 1999; Jurcak et al., 2007). The fNIRS probes were placed such that Fpz coincided with the sixth probe in the middle column of holders in the 3 × 11 probe holder and the lower line substantially matched the horizontal reference curve, which was determined by a straight line connecting Fpz-T3-T4. The inter-optode distance was 3 cm. For spatial profiling of fNIRS data, we adopted the probabilistic registration method (Singh et al., 2005; Tsuzuki et al., 2007; Tsuzuki and Dan, 2014) to register fNIRS data to Montreal Neurological Institute (MNI) standard brain space, which further allowed us to estimate the macroanatomical locations of the channels (Rorden and Brett, 2000).
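The relation between the 3 × 11 probe array and the 52 channels can be checked with a short sketch. It assumes that emitters and detectors alternate in a checkerboard pattern with emitters at the corners; the text gives only the counts (17 and 16), so this arrangement is an assumption. Each channel is taken as the midpoint of an adjacent emitter-detector pair, 3 cm apart.

```python
def probe_layout(rows=3, cols=11, spacing_cm=3.0):
    """Return emitter positions, detector positions, and channel midpoints (cm)."""
    emitters, detectors = [], []
    for r in range(rows):
        for c in range(cols):
            # Assumption: checkerboard pattern, emitters where (r + c) is even.
            if (r + c) % 2 == 0:
                emitters.append((r, c))
            else:
                detectors.append((r, c))
    detector_set = set(detectors)
    channels = []
    for r, c in emitters:
        for dr, dc in ((0, 1), (0, -1), (1, 0), (-1, 0)):
            neighbor = (r + dr, c + dc)
            if neighbor in detector_set:
                # A channel is the midpoint of an adjacent emitter-detector pair.
                channels.append(((r + neighbor[0]) / 2 * spacing_cm,
                                 (c + neighbor[1]) / 2 * spacing_cm))
    return emitters, detectors, channels

emitters, detectors, channels = probe_layout()
print(len(emitters), len(detectors), len(channels))  # prints: 17 16 52
```

The count follows from the grid geometry: 3 × 10 horizontally adjacent pairs plus 2 × 11 vertically adjacent pairs gives 52.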

fNIRS Data Analysis
We used MATLAB 2007b (The Mathworks, Inc., Natick, MA, United States) for fNIRS data analysis, with several in-house toolboxes implementing the procedures described hereafter. Since the oxyHb signal is the most sensitive indicator of regional cerebral hemodynamic response (Huppert et al., 2006; Homae et al., 2007; Cui et al., 2010), we analyzed oxyHb signal changes. Individual timeline data for the oxyHb signal of each channel were preprocessed in the following way. First, we applied a 5 s moving average to the raw data. Then, channels with a signal variation of 10% or less were considered defective measurements and excluded from analysis. To remove the influence of measurement noise, such as breathing and cardiac movement, from the remaining channels, we applied the wavelet minimum description length (Wavelet-MDL) method (Jang et al., 2009). After preprocessing the oxyHb timeline data for each individual on each channel, we conducted General Linear Model (GLM) analysis with regression to the hemodynamic response function (HRF). The regressors were created by convolving (Equation 2) the boxcar function N(t_p, t) with the HRF shown in Equation 1 (Friston et al., 1998).
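Equations 1 and 2 are not reproduced in this text. Assuming the commonly used two-gamma form of the canonical HRF, in which the undershoot peaks at t_p + t_d, they would read as follows. The exact expressions and the argument list of N are assumptions based on that convention, not the authors' confirmed formulas.

```latex
\begin{align}
h(t_p, t) &= \frac{t^{t_p} e^{-t}}{(t_p)!}
           - \frac{t^{t_p + t_d} e^{-t}}{A\,(t_p + t_d)!} \tag{1} \\
f(t_p, t) &= h(t_p, t) \otimes N(t_p, t) \tag{2}
\end{align}
```

Here \otimes denotes convolution, so f is the predicted response for one block type.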
Following conventional usage, we set the first peak delay, t_p, to 6 s, the second peak delay, t_d, to 10 s, and A, the amplitude ratio between the first and second peaks, to 6. The first and second derivatives were included to further reduce the influence of noise in individual data. The specific design matrix is shown in Figure 3. Columns 1, 2, and 3 in Figure 3 respectively represent the HRF of the baseline block and its first and second derivatives. Columns 4, 5, and 6 respectively represent the HRF of the English-into-Japanese/high-familiarity task block and its first and second derivatives. Columns 7, 8, and 9 respectively represent the HRF of the English-into-Japanese/low-familiarity task block and its first and second derivatives. Columns 10, 11, and 12 respectively represent the HRF of the Japanese-into-English/high-familiarity task block and its first and second derivatives. Columns 13, 14, and 15 respectively represent the HRF of the Japanese-into-English/low-familiarity task block and its first and second derivatives. Column 16 represents the constant. We used the β value as an indicator of the oxyHb signal for each regressor. Among the 16 β values, the four β values representing the task blocks (β4, β7, β10, β13) were used for further statistical analyses, while the others were regressed out. β4 was the indicator of brain activity during the English-into-Japanese/high-familiarity task period, and β7 was the indicator of brain activity during the English-into-Japanese/low-familiarity task period. Similarly, β10 was the indicator of brain activity during the Japanese-into-English/high-familiarity task period, and β13 was the indicator of brain activity during the Japanese-into-English/low-familiarity task period. A one-sample t-test against zero was performed on β values for each task block and channel at the group level. Family-wise errors for the p-values were corrected using the Bonferroni correction.
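The construction of one task regressor and the estimation of its β can be sketched as follows. This is a reduced illustration, not the authors' in-house MATLAB code: it assumes a two-gamma HRF with the undershoot peaking at t_p + t_d, uses hypothetical block timing, and fits only one task regressor plus a constant, omitting the derivative columns and the other blocks of the 16-column design.

```python
import math

def double_gamma_hrf(t, tp=6.0, td=10.0, amp_ratio=6.0):
    """Assumed two-gamma HRF: main peak near tp, undershoot near tp + td."""
    peak = t ** tp * math.exp(-t) / math.gamma(tp + 1)
    undershoot = (t ** (tp + td) * math.exp(-t)
                  / (amp_ratio * math.gamma(tp + td + 1)))
    return peak - undershoot

def block_regressor(onsets_s, duration_s, n_samples, fs=10.0, hrf_len_s=32.0):
    """Convolve a boxcar (1 during task blocks, 0 elsewhere) with the HRF."""
    boxcar = [0.0] * n_samples
    for onset in onsets_s:
        start = int(round(onset * fs))
        stop = min(n_samples, int(round((onset + duration_s) * fs)))
        for i in range(start, stop):
            boxcar[i] = 1.0
    hrf = [double_gamma_hrf(k / fs) for k in range(int(hrf_len_s * fs))]
    regressor = []
    for n in range(n_samples):
        acc = 0.0
        for k in range(min(n + 1, len(hrf))):
            acc += hrf[k] * boxcar[n - k]
        regressor.append(acc / fs)  # dividing by fs approximates the integral
    return regressor

def fit_beta(regressor, signal):
    """Least-squares fit of signal = beta * regressor + constant."""
    n = len(signal)
    mean_x = sum(regressor) / n
    mean_y = sum(signal) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(regressor, signal))
    sxx = sum((x - mean_x) ** 2 for x in regressor)
    beta = sxy / sxx
    return beta, mean_y - beta * mean_x

# Hypothetical timing: two 20 s task blocks in a 120 s run sampled at 10 Hz.
reg = block_regressor(onsets_s=[20.0, 70.0], duration_s=20.0, n_samples=1200)
synthetic_oxyhb = [0.8 * x + 0.1 for x in reg]  # noise-free toy signal
beta, constant = fit_beta(reg, synthetic_oxyhb)
```

On this noise-free toy signal the fit recovers the amplitude used to generate it. In the full design, the same logic extends to a 16-column matrix solved by multiple regression.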
With the Bonferroni method, the statistical significance level (α) is divided by the number of channels, which makes the test highly conservative. The present study is the first to focus on Japanese-English translation, which involves a large LD and thus entails difficulty in L2 learning, and it was necessary to avoid type II errors, that is, missing channels that were truly activated. Therefore, provided that sufficient effect sizes were obtained, we discuss as "activated channels" not only significant channels (α = 0.05) but also marginally significant channels (α = 0.10).
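The two thresholds can be made concrete with a small sketch. With 52 channels, the Bonferroni-corrected levels are 0.05/52 (about 0.00096) and 0.10/52 (about 0.0019). The function name and the toy p-values are illustrative assumptions.

```python
def bonferroni_labels(p_values, n_tests, alpha=0.05, marginal_alpha=0.10):
    """Label uncorrected p-values against Bonferroni-corrected thresholds."""
    significant_threshold = alpha / n_tests
    marginal_threshold = marginal_alpha / n_tests
    labels = []
    for p in p_values:
        if p < significant_threshold:
            labels.append("significant")
        elif p < marginal_threshold:
            labels.append("marginal")
        else:
            labels.append("n.s.")
    return labels

# Hypothetical uncorrected p-values for three channels out of 52.
labels = bonferroni_labels([0.0005, 0.0015, 0.03], n_tests=52)
print(labels)  # prints: ['significant', 'marginal', 'n.s.']
```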
Further, we conducted a three-way mixed analysis of variance (ANOVA) with group (advanced/elementary) as the between-subject factor and direction (English-into-Japanese/Japanese-into-English) and familiarity (high/low) as the within-subject factors on β values for each task block. β values were averaged across channels corresponding to the same anatomical label, for channels activated in the one-sample t-test against zero. A simple main effect test was performed when an interaction between factors was significant. Statistical significance was set a priori at p < 0.05 for all comparisons.

FIGURE 3 | An example of a design matrix, X. Rows indicate time, running from top to bottom. The first to third columns indicate the canonical HRF (hemodynamic response function) and its first and second derivatives, respectively, for baseline trials. The fourth to sixth columns indicate the canonical HRF and its first and second derivatives, respectively, for task trials (English into Japanese/high familiarity). The seventh to ninth columns indicate the canonical HRF and its first and second derivatives, respectively, for task trials (English into Japanese/low familiarity). The 10th to 12th columns indicate the canonical HRF and its first and second derivatives, respectively, for task trials (Japanese into English/high familiarity). The 13th to 15th columns indicate the canonical HRF and its first and second derivatives, respectively, for task trials (Japanese into English/low familiarity). The 16th column indicates the constant.

Behavior Data Analysis
We used IBM SPSS Statistics 25 for behavioral data analyses. First, we averaged reaction time (RT) and accuracy at the individual level for each of the four task blocks. Then, at the group level, we conducted a three-way mixed ANOVA with group (advanced/elementary) as the between-subject factor and direction (English-into-Japanese/Japanese-into-English) and familiarity (high/low) as the within-subject factors on RTs and accuracy for each task block. A simple main effect test was performed when an interaction between factors was significant. When a three-way interaction was significant, a two-way interaction contrast was tested for each group to confirm how the familiarity contrasts differed depending on translation direction (English-into-Japanese/Japanese-into-English). Thus, for each group, we first calculated the contrast between translation directions (English-into-Japanese minus Japanese-into-English) under each familiarity condition to generate two contrasts: the English-into-Japanese minus Japanese-into-English contrast for high-familiarity words and for low-familiarity words, respectively. From these, we further generated a two-way interaction contrast for each group to represent the difference between high- and low-familiarity words, namely, [English-into-Japanese minus Japanese-into-English for high-familiarity words] minus [English-into-Japanese minus Japanese-into-English for low-familiarity words]. For each group, a one-sample t-test against zero was performed on the obtained contrast. Statistical significance was set a priori at p < 0.05 for all comparisons.
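The two-way interaction contrast and its one-sample t-test can be sketched as follows. The per-participant accuracy values are hypothetical and the function names are introduced only for illustration; the original analysis was run in SPSS.

```python
import math
import statistics

def interaction_contrast(ej_high, je_high, ej_low, je_low):
    """[EJ - JE for high familiarity] minus [EJ - JE for low familiarity]."""
    return (ej_high - je_high) - (ej_low - je_low)

def one_sample_t(values, popmean=0.0):
    """Return the t statistic and degrees of freedom for a test against popmean."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD (n - 1 in the denominator)
    return (mean - popmean) / (sd / math.sqrt(n)), n - 1

# Hypothetical accuracies (%) per participant:
# (EJ/high, JE/high, EJ/low, JE/low), where EJ = English-into-Japanese.
participants = [
    (90, 80, 60, 62),
    (88, 79, 58, 61),
    (92, 85, 65, 63),
    (85, 78, 55, 59),
]
contrasts = [interaction_contrast(*p) for p in participants]
t_value, df = one_sample_t(contrasts)
```

A positive contrast means the English-into-Japanese advantage is larger for high-familiarity words than for low-familiarity words.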

Cortical Activation Patterns
By integrating the statistical analysis, the spatial registration of the channels, and the subsequent macroanatomical labeling, we describe the cortical activation patterns observed in the current study below. For the advanced group, there was no [...]

FIGURE 4 | The results of the group analysis for the advanced group. Family-wise errors due to multichannel measurement were corrected using the Bonferroni method. Significant t-values for MNI-registered channels are indicated by the color scale.

[...] For the opposite translation direction, when the elementary group translated Japanese (L1) words with high familiarity into English (L2), four channels were significantly or marginally significantly activated; these were registered at Brodmann area (BA) 2, the primary somatosensory cortex; BA 22, the superior temporal gyrus; and BA 43, the subcentral area. When the elementary group translated Japanese (L1) words with low familiarity into English (L2), one channel registered at BA 22, the superior temporal gyrus, was significantly activated. These results show that the advanced and elementary groups recruited different brain areas during word translation. In the advanced group, the frontal area (English-into-Japanese) or the frontal to left temporal areas (Japanese-into-English) were recruited only during low-familiarity word translation. The results suggest that these regions were involved in the cognitive mechanism of word translation for the advanced group. On the other hand, the results suggest that activation of the left temporal region was related to translation in the elementary group, regardless of translation direction and word familiarity. A detailed functional description of these areas is given in the "Discussion" section.

Comparison Between the Advanced and Elementary Groups
We conducted a three-way mixed ANOVA with group (advanced/elementary) as the between-subject factor and direction (English-into-Japanese/Japanese-into-English) and familiarity (high/low) as the within-subject factors to compare brain activation between the advanced and elementary groups (Table 3). Before this, β values were averaged across channels corresponding to the same anatomical label, for channels activated in the one-sample t-test against zero (BA 2: channel 20; BA 10: channels 36, 46, 47, 48, and 49; BA 22: channels 41 and 42; BA 40: channel 10; BA 43: channel 30; BA 44/45: channels 39, 40, and 50; BA 46: channels 25 and 35).
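The averaging step can be sketched with the channel-to-label mapping listed above. The β values in the example are hypothetical placeholders, and the function name is an illustrative assumption.

```python
# Channel groups per anatomical label, as listed in the text.
BA_CHANNELS = {
    "BA 2": [20],
    "BA 10": [36, 46, 47, 48, 49],
    "BA 22": [41, 42],
    "BA 40": [10],
    "BA 43": [30],
    "BA 44/45": [39, 40, 50],
    "BA 46": [25, 35],
}

def average_by_label(channel_betas, label_map=BA_CHANNELS):
    """Average one participant's β values over the channels of each label."""
    return {
        label: sum(channel_betas[ch] for ch in chans) / len(chans)
        for label, chans in label_map.items()
    }

# Hypothetical β values for channels 1-52 of one participant and one task block.
toy_betas = {ch: 0.01 * ch for ch in range(1, 53)}
region_betas = average_by_label(toy_betas)
```

Each participant then contributes one β per region and task block to the ANOVA.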
In a channel corresponding to the left primary somatosensory cortex (BA 2), there was no significant main effect of group (advanced/elementary), direction (English-into-Japanese/Japanese-into-English), or familiarity (high/low). On the other hand, the interaction between group and familiarity was significant [F(1,38) = 9.27, p < 0.01, ηp² = 0.20]. Simple main effect tests showed that, in the advanced group, activation was larger for low-familiarity words than for high-familiarity words (p < 0.05).
In channels corresponding to the frontopolar area (BA 10), there was a significant main effect of direction [Japanese-into-English > English-into-Japanese; F(1,38) = 14.58, p < 0.001, ηp² = 0.27]. The interaction between group and familiarity was significant [F(1,38) = 8.39, p < 0.01, ηp² = 0.18]: for low-familiarity words, activation was larger in the advanced group than in the elementary group (p < 0.01). The interaction between group and direction was also significant [F(1,38) = 7.67, p < 0.01, ηp² = 0.17]: in the advanced group, activation was larger for the Japanese-into-English direction than for the English-into-Japanese direction (p < 0.001), and in the Japanese-into-English direction, activation was larger in the advanced group than in the elementary group (p < 0.05).

In channels corresponding to the left superior temporal gyrus (BA 22), there was no significant main effect of group, direction, or familiarity. The interaction between group and familiarity was significant [F(1,38) = 6.19, p < 0.05, ηp² = 0.14]: in the advanced group, activation was larger for high-familiarity words than for low-familiarity words (p < 0.05).

In a channel corresponding to the left Wernicke's area (BA 40), there was no significant main effect of group, direction, or familiarity. The interaction between group and familiarity was significant [F(1,38) = 6.29, p < 0.05, ηp² = 0.14]: in the elementary group, activation was larger for high-familiarity words than for low-familiarity words (p < 0.05). Also, activation was larger in the advanced group than in the elementary group for [...]

[...] ηp² = 0.14]. In the elementary group, activation was larger for high-familiarity words than for low-familiarity words (p < 0.05). Also, for low-familiarity words, activation was larger in the advanced group than in the elementary group (p < 0.05).

In channels corresponding to the right DLPFC (BA 46), there was a significant main effect of direction [Japanese-into-English > English-into-Japanese; F(1,38) = 8.97, p < 0.01, ηp² = 0.19]. The interaction between group and familiarity was significant [F(1,38) = 11.01, p < 0.01, ηp² = 0.23]. In the advanced group, activation was larger for low-familiarity words than for high-familiarity words (p < 0.05), whereas in the elementary group, activation was larger for high-familiarity words than for low-familiarity words (p < 0.05). Also, for low-familiarity words, activation was larger in the advanced group than in the elementary group (p < 0.01).
These results suggest that language direction and word familiarity had different effects on brain activation between the advanced and elementary groups, with significant interactions in the six regions (the left primary somatosensory cortex: BA 2, the frontopolar area: BA 10, the left superior temporal gyrus: BA 22, the left Wernicke's area: BA 40, the left Broca's area: BA 44/45, and the right dorsolateral prefrontal cortex: BA 46). On the other hand, no main effect or interaction was observed for the activation in the left subcentral area (BA 43), which does not support different activation between the two groups.
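The reported effect sizes can be related to the F statistics through the standard identity ηp² = F·df_effect / (F·df_effect + df_error). The sketch below applies it to two of the reported tests; because the F values are themselves rounded, recomputed values may differ from the published ones in the second decimal place.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return f_value * df_effect / (f_value * df_effect + df_error)

# Group x familiarity interaction in BA 2: F(1,38) = 9.27.
eta_ba2 = partial_eta_squared(9.27, 1, 38)
# Group x familiarity interaction in BA 22: F(1,38) = 6.19.
eta_ba22 = partial_eta_squared(6.19, 1, 38)
print(round(eta_ba2, 2), round(eta_ba22, 2))  # prints: 0.2 0.14
```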

Behavioral Data
The averaged reaction times (RTs) and accuracy (ACC) for each group are shown in Figure 6. The three-way mixed ANOVA on RTs (Table 3) [...] The simple main effect of familiarity was larger for high-familiarity words than for low-familiarity words in the advanced group (p < 0.001). Also, the simple main effect of familiarity was larger for high-familiarity words than for low-familiarity words in the elementary group (p < 0.001). The interaction between group, direction, and familiarity was significant [F(1,37) = 18.85, p < 0.001, ηp² = 0.34]. All simple main effects of familiarity at each level of direction, and all simple main effects of direction at each level of familiarity, were larger for the advanced group than for the elementary group: for high-familiarity words in the English-into-Japanese direction (p < 0.01), for high-familiarity words in the Japanese-into-English direction (p < 0.001), for low-familiarity words in the English-into-Japanese direction (p < 0.001), and for low-familiarity words in the Japanese-into-English direction (p < 0.001). Since a significant three-way interaction was observed for ACC, a two-way interaction contrast was examined for each group. For the advanced group, the mean value of the contrasts was −0.95 with a standard deviation of 2.72, which did not differ significantly from zero [t(19) = −1.56, n.s.]. On the other hand, for the elementary group, the mean value of the contrasts was 2.63 with a standard deviation of 2.41, which was significantly larger than zero [t(18) = 4.76, p < 0.001]. Further probing this interaction contrast in the elementary group, we found that, for high-familiarity words, the contrast English-into-Japanese minus Japanese-into-English was larger than zero (p < 0.001). Conversely, for low-familiarity words, the contrast English-into-Japanese minus Japanese-into-English was smaller than zero (p < 0.05).
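The reported t statistics for the interaction contrasts follow directly from the reported means and standard deviations, since t = mean / (SD / √n) for a one-sample test against zero. The sketch below takes n as the degrees of freedom plus one (20 and 19) and assumes the reported standard deviations are sample standard deviations.

```python
import math

def t_from_summary(mean, sd, n):
    """One-sample t statistic against zero from summary statistics."""
    return mean / (sd / math.sqrt(n))

t_advanced = t_from_summary(mean=-0.95, sd=2.72, n=20)   # reported t(19) = -1.56
t_elementary = t_from_summary(mean=2.63, sd=2.41, n=19)  # reported t(18) = 4.76
print(round(t_advanced, 2), round(t_elementary, 2))  # prints: -1.56 4.76
```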
To summarize, RTs did not differ between the advanced and elementary groups, whereas accuracy did: the advanced group responded significantly more accurately than the elementary group. The slower RTs and lower accuracy for low-familiarity words suggest that low-familiarity words are more difficult to translate than high-familiarity words, regardless of translation direction, for both groups. However, only the elementary group showed a significant two-way interaction between word familiarity and translation direction in ACC, suggesting that the sources of difficulty differed between the advanced and elementary groups. For high-familiarity words, the elementary group answered more accurately during English-into-Japanese translation, whereas for low-familiarity words, they answered more accurately during Japanese-into-English translation.

DISCUSSION
We found that brain activation patterns while Japanese learners of English translated Japanese (L1) words into English (L2), and vice versa, differed depending on their English proficiency. Specifically, the advanced group elicited greater activation in the left prefrontal cortex around Broca's area while translating words with low familiarity, but no activation was observed while translating words with high familiarity. On the other hand, the elementary group evoked greater activation in the left temporal area, including the superior temporal gyrus (STG), irrespective of word familiarity. These results suggest that different cognitive processes could be involved in word translation depending on English proficiency in Japanese learners of English. Hereafter, we discuss the activation patterns observed in the current study macroanatomically, with reference to previous neuroimaging studies.

Interpretation of Results
Consistent Activation in Broca's Area (BA 44/45)
In the current study, we observed activation in language-related regions that were also reported in previous studies.
First of all, activation in Broca's area (BA 44/45) during translation was consistently observed in previous studies (Klein et al., 1995; Rinne et al., 2000; Quaresima et al., 2002; Kovelman et al., 2008a), in which balanced bilinguals translated between languages with close or moderate distances. It has been suggested that the left prefrontal cortex, including the pars opercularis and the pars triangularis of Broca's area, is related to language comprehension and semantic processing (Devlin et al., 2003). These areas have also been shown to be involved in the understanding and retrieval of semantically ambiguous words (Rodd et al., 2005). In our study, the advanced group elicited greater activation in Broca's area when translating words with low familiarity, which should demand higher cognitive loads. Broca's area is thus likely to play an important role in language processing under high cognitive loads. Considering the results of the previous studies (Klein et al., 1995; Rinne et al., 2000; Quaresima et al., 2002; Kovelman et al., 2008a), it is likely that even balanced bilinguals experience considerable cognitive loads when translating between languages with close or moderate LDs. This should be all the more so for advanced English learners translating words between languages with a large LD. For the elementary group, it was difficult to translate words with low familiarity, as shown by their low accuracy (Figure 6). Because of this difficulty, they could not translate words with low familiarity and gave up answering correctly. In other words, the elementary group was not able to perform well in word perception itself, which is necessary for word production (Lüders et al., 1991; Indefrey and Levelt, 2000, 2004; Hamberger and Cole, 2011). Thus, we interpret that the elementary group either did not experience cognitive load or experienced a different kind of cognitive load than the advanced group, and therefore failed to recruit Broca's area (BA 44/45).

Consistent Activation in the Dorsolateral Prefrontal Cortex (BA 46)
The right dorsolateral prefrontal cortex (R-DLPFC: BA 46) was activated in some previous studies (Klein et al., 1995; Rinne et al., 2000; Kovelman et al., 2008a), and the advanced group in the current study also elicited significant activation in this region while translating Japanese (L1) words with low familiarity into English (L2). The DLPFC is related to verbal working memory (Salmon et al., 1996; Zurowski et al., 2002), which plays an important role in keeping information in mind and processing it simultaneously over a short time (Baddeley, 2003). This region has also been consistently activated during tasks requiring effortful retrieval, maintenance, or control of semantic information (Cabeza and Nyberg, 1997). Activation of the right DLPFC was also observed in some previous studies (Klein et al., 1995; Rinne et al., 2000; Kovelman et al., 2008b) focusing on balanced bilinguals. In the present study, the behavioral results showed that the advanced group processed the stimuli more accurately during translation than did the elementary group. Based on the function of the right DLPFC, we consider that such high performance in the advanced group was made possible by their ability to make good use of their verbal working memory. To sum up, the left Broca's area and the right DLPFC were consistently activated not only in balanced bilinguals whose L1 is closely or moderately related to English, but also in the advanced Japanese learners of English. Therefore, we conclude that these areas are involved in word translation regardless of LD.

[...] 1994; Kroll and de Groot, 1997; Duyck and Brysbaert, 2008). In the elementary group, semantic route processing seems to have been dominant, regardless of translation direction and word familiarity. In the elementary group, the word concept was processed via the semantic route because a sufficient amount of vocabulary was not stored.
Accordingly, the semantic route may have elicited activation of the STG, but not Wernicke's area, associated with vocabulary storage (Paulesu et al., 1993;Aboitiz et al., 2010;Kekang, 2019). On the other hand, in the advanced group, because of the relatively rich vocabulary storage, lexical route processing (Kroll and Stewart, 1994;Kroll and de Groot, 1997;Duyck and Brysbaert, 2008) for English (L2)-into-Japanese (L1) translation similar to bilingual second language processing (e.g., Green, 1998) may have taken place, resulting in activation in Wernicke's area.
The frontopolar area (BA 10) has been reported to serve a function in the processing of cognitive branching (Koechlin and Hyafil, 2007), in which we maintain in working memory a primary goal, while at the same time processing tasks related to a secondary goal (Ramnani and Owen, 2004). This region was activated when the advanced group translated Japanese (L1) words with low familiarity into English (L2). As in the case of BA 22, we suggest that BA 10 activation is another indicator of the large cognitive loads that advanced English learners have when translating unfamiliar L1 words into L2.

Limitations and Perspectives of This Experiment
Although we identified the brain activation patterns of Japanese learners of English during word translation with a large LD, there are some limitations regarding the investigation of the mechanism by which Japanese learners acquire English. First, our study did not make clear how the brain activation patterns of the elementary group change into those of the advanced group. It is unclear whether the change would be continuous or discrete. In the future, examining brain activation patterns of Japanese learners of English at an intermediate level would allow us to clarify the transition of cognitive mechanisms with increasing English proficiency. Alternatively, longitudinal studies on how elementary learners become advanced would provide clearer evidence for the differential activation. Second, we did not investigate brain activation patterns of Japanese learners of English who are balanced bilinguals. Thus, the cortical activation patterns of Japanese learners who have completely acquired English remain uncertain. In our study, we recruited an advanced group whose TOEIC® scores were above the average score of Japanese learners. However, few Japanese learners in the advanced group would be considered balanced bilinguals. Therefore, to fully understand the mechanism by which Japanese learners acquire English across a large LD, we need to examine brain activation patterns of balanced bilinguals whose L1 is Japanese and whose L2 is English. Finally, we measured only the frontal and temporal regions with multichannel fNIRS because of the inherent spatial limitations of the fNIRS setup. With this limitation in mind, we carefully selected the measurement areas based on previous results related to language translation (e.g., Klein et al., 1995; Price et al., 1999; Quaresima et al., 2002).
Though we have these limitations to consider, we present significant findings that brain activation patterns for Japanese learners of English vary depending on the level of acquired English and cognitive loads of translation tasks. This study provides the first evidence revealing the cognitive mechanisms during word translation between languages at a large LD from a functional neuroimaging perspective. Furthermore, our study may serve to provide an effective cognitive strategy for Japanese learners of English at the elementary level. Our results show that cortical activation on the left STG was observed for the elementary group, while Wernicke's area was activated for the advanced group. These results may reflect whether the semantic or lexical route was dominant when English learners processed words such as during translation. However, since our data were not longitudinal and we have yet to provide definitive evidence for proving this hypothesis, we still need to verify that the differences in performance and cortical activation between the advanced and elementary groups reflect the improvement of English proficiency as a second language. There has been a lot of discussion about cognitive strategies in language acquisition. The depth of lexical knowledge is related to word perception (Ouellette, 2006). For processing with the lexical route, it is necessary to improve the mental lexicon for the second language and to increase accessibility to it (Talamas et al., 1999;Kroll and Tokowicz, 2001;Ouellette, 2006). It will be interesting to incorporate these plausible factors in future studies to examine the relationship between cortical activation in Japanese learners of English at the elementary level during word translation and cognitive strategies. Together with the current findings, such an integrated examination may provide insight into effective cognitive strategies for second language acquisition.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by Ethics Committee of Chuo University. The patients/participants provided their written informed consent to participate in this study.

AUTHOR CONTRIBUTIONS
KS and ID devised the idea of this study. KS and TaT selected and created the experimental stimuli. KS, KN, TaT, KO, and ToT prepared the fNIRS measurements. KN, TaT, KO, and ToT collected and performed the data analyses. KN created the experimental programming. YK reviewed the statistical procedures. KS, KN, TaT, and ID wrote the manuscript. All the authors have read the final version of the manuscript and agreed to its publication.

FUNDING
This study was partly supported by the JSPS KAKENHI Grant No. 19K23389 to KN.