Fast Brain Plasticity during Word Learning in Musically-Trained Children

Children learn new words every day and this ability requires auditory perception, phoneme discrimination, attention, associative learning and semantic memory. Based on previous results showing that some of these functions are enhanced by music training, we investigated learning of novel words through picture-word associations in musically-trained and control children (8–12 year-old) to determine whether music training would positively influence word learning. Results showed that musically-trained children outperformed controls in a learning paradigm that included picture-sound matching and semantic associations. Moreover, the differences between unexpected and expected learned words, as reflected by the N200 and N400 effects, were larger in children with music training compared to controls after only 3 min of learning the meaning of novel words. In line with previous results in adults, these findings clearly demonstrate a correlation between music training and better word learning. It is argued that these benefits reflect both bottom-up and top-down influences. The present learning paradigm might provide a useful dynamic diagnostic tool to determine which perceptive and cognitive functions are impaired in children with learning difficulties.


INTRODUCTION
Learning new words is a specifically human faculty that mobilizes several perceptual and cognitive abilities: sound perception and discrimination, attention, associative learning, and semantic memory (Perfetti et al., 2005;Davis and Gaskell, 2009). Here, we investigated the temporal dynamics of word learning in a novel word learning paradigm using both behavioral measures and Event-Related Potentials (ERPs). The main question was whether musically-trained children would outperform controls in terms of novel word learning and semantic association skills and whether this would be reflected in dynamic brain plasticity measures (ERPs).
The ERP methodology is well-suited to examine the temporal dynamics of word learning and brain plasticity as reflected by modulations of ERP components. Previous research on word learning has shown that the N400, a negative-going component that typically develops between 300 ms and 600 ms post-stimulus onset (Kutas and Hillyard, 1980), increased in amplitude when meaningless items acquire meaning and then decreased with further repetitions. This effect has been demonstrated in infants (from 9-24 months old; Friedrich and Friederici, 2008;Torkildsen et al., 2009;Junge et al., 2012;Borgström et al., 2015;Friedrich et al., 2015) and in adults (see below). However, to our knowledge, it has not yet been investigated in children and the present study was intended to fill this gap by testing children between 8 and 12 years old.
The increase in N400 amplitude with word learning is typically very fast. This effect has been observed in native adult English-speakers after 14 h of learning the meaning of novel French words (McLaughlin et al., 2004), after less than 1 or 2 h of learning novel word-picture associations (Dobel et al., 2009, 2010), after 45 min of learning the meaning of rare words (e.g., "clowder"; Perfetti et al., 2005) and even after a single exposure if a novel word or pseudoword was presented in a strongly constrained and meaningful context (Mestres-Missé et al., 2007; Borovsky et al., 2010, 2012; Batterink and Neville, 2011). Moreover, this fast increase in N400 amplitude is typically largest over fronto-central brain regions (FN400), as demonstrated in language-learning tasks, and it possibly reflects speech segmentation and the building-up of novel word meaning (De Diego Balaguer et al., 2007; Mestres-Missé et al., 2007), two processes that may develop in parallel (François et al., 2017). While the scalp distribution of ERP components does not necessarily reflect the activation of directly underlying brain structures, it is nevertheless interesting that fronto-central brain regions are also found to be involved in the maintenance of novel information in working or short-term memory, in the formation of new associations (Hagoort, 2014) and/or the construction of word representations in episodic memory (Wagner et al., 1998; Rodríguez-Fornells et al., 2009). Further exposures then allow for the integration of these novel items into existing lexical networks (Dumay and Gaskell, 2007), with recent results emphasizing the role of sleep in these processes (Tamminen et al., 2010, 2013; Friedrich et al., 2015). Once these "novel" representations are stabilized, the N400 is largest over centro-parietal regions as typically found for the N400 to already known words (Kutas et al., 1988).
Recently, we conducted an experiment in young adults, professional musicians and non-musicians (Dittinger et al., 2016) to test the hypothesis that music training would positively influence novel word learning. This hypothesis builds on the results reviewed above and on recent behavioral and brain imaging results suggesting that music and language involve not only common sensory-perceptual processes (both at sub-cortical and cortical levels; Kraus and Chandrasekaran, 2010; Asaridou and McQueen, 2013) but also attentional (Patel, 2008; Tervaniemi et al., 2009; Strait et al., 2010, 2015; Perruchet and Poulin-Charronnat, 2013) and short-term memory resources (Ho et al., 2003; George and Coch, 2011) as well as executive functions (Pallesen et al., 2010; Moreno et al., 2011; Rogalsky et al., 2011; Zuk et al., 2014) that are involved in novel word learning. Results showed that adult musicians outperformed non-musicians in the most difficult task that required participants to map newly acquired words onto semantic associates. Moreover, the shift in N400 scalp distribution from frontal sites (FN400) during meaning acquisition, to parietal sites when word meaning is stabilized (N400) was faster in musicians than in non-musicians. Interestingly, results also showed that the amplitude of both the N200 and N400 components in the semantic task was larger to unrelated than to related words in musicians but not in non-musicians. The aim of the present experiment was to determine whether these results would replicate when comparing children with and without music training. The general procedure, inspired by Wong and Perrachione (2007), and the specific hypotheses are described below (see Figure 1).
Children were first asked to categorize unfamiliar Consonant-Vowel syllables with respect to voicing, vowel length and aspiration. For this purpose, we used Thai monosyllabic words. Thai is a tonal and a quantitative language in which both tonal (i.e., 5 tones) and vowel length contrasts are linguistically relevant for understanding word meaning (e.g., /pa1/ low tone with a short vowel means ''to find'' and /pa:1/ low tone with a long vowel means ''forest''; Gandour et al., 2002). We hypothesized that if music training reinforces auditory perception and attention, children with music training should make fewer errors in the phonological categorization tasks than control children especially when the task is the most difficult, that is, for the phonological contrasts that do not belong to the French phonemic repertory (e.g., aspiration; Dobel et al., 2009). After the phonological categorization task, the same children were then asked to learn the meaning of novel words through picture-word associations. Based on the previous results reviewed above in adults and in infants, we hypothesized that both an N200 and an N400 over frontal regions (FN400) would develop during word learning in all children but that this effect would develop faster in children with music training as was recently shown in adult musicians compared to nonmusicians.
Following picture-word learning, children were then tested for the efficiency of learning using two tasks: a matching task in which they decided whether the picture-word associations matched or mismatched those seen in the word learning phase and a semantic task in which new pictures were presented that were semantically related or unrelated to the newly-learned words. Based on previous results (Meyer and Schvaneveldt, 1971), we expected mismatching and semantically unrelated words to be associated with higher error rates (ERRs) and/or slower Reaction Times (RTs) than matching and related words. Such typical semantic priming effects were predicted in all children, showing that they all learned the picture-word associations and that learning generalized to new pictures.
FIGURE 1 | Experimental design. Children performed a series of tasks: First, in the phonological categorization tasks (A), six natural Thai mono-syllabic words had to be categorized based on voicing (Task 1), vowel length (Task 2), or aspiration contrasts (Task 3). Second, in the word learning phase 1 (B), each word was paired with its respective picture. Third, in the word learning phase 2 (C), two of the pictures were presented simultaneously on the screen, together with an auditory word that matched one of the two pictures. Fourth, in the matching task (D), the auditory words were presented with one of the pictures, either matching or mismatching the previously learned associations. Fifth, in the semantic task (E), the auditory words were presented with novel pictures that were either semantically related or unrelated to the words.
Moreover, we expected these behavioral effects to be accompanied by larger N400s in all children to mismatching and semantically unrelated words than to matching/related words over parietal brain regions, as typically reported for the N400 to already known words in children (Holcomb et al., 1992;Juottonen et al., 1996;Hahne et al., 2004) and in adults (Kutas and Federmeier, 2011, for review). Finally, based on our previous results with adults (Dittinger et al., 2016), we predicted lower error rates and larger N200 and N400 effects (the difference between mismatching/semantically unrelated words and matching/related words) in children with music training than in controls.

Participants
A total of 32 children, all native speakers of French without known hearing or neurological deficits, participated in this experiment: 16 children were involved in extracurricular music training (MUS) and 16 children were not involved in music training (NM), except for obligatory music lessons at school. However, all participants in the NM group took part in at least one extracurricular activity that was not related to music (e.g., sports, painting, dance), which suggests that both groups of children benefitted from stimulating extracurricular environments.
Three children (2 with and 1 without music training) were excluded based on their level of performance in the matching and semantic tasks (i.e., percentage of error ± 2 standard deviations away from the mean) and six children (2 with and 4 without music training) because of too many artifacts in the electrophysiological data, leading to an attrition rate of 28% which is not uncommon in ERP studies with children (De Boer et al., 2005). The final group of musically-trained children (MUS) comprised six boys and six girls with three left-handers (mean age = 134.0 months, SD = 13.5) and the group of control children (NM) six boys and five girls with one left-hander (mean age = 124.5 months, SD = 20.0; F (1,21) = 1.84, p = 0.19).
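The behavioral exclusion rule described above (error percentage more than 2 standard deviations from the mean) can be sketched as follows. This is a minimal illustration under stated assumptions: the text does not say whether the mean was computed per group or over all children, nor whether the sample or population standard deviation was used, so the function below takes a single list of scores and uses the sample standard deviation. The function name and the example values are hypothetical and are not data from the study.

```python
from statistics import mean, stdev

def flag_outliers(error_rates, n_sd=2.0):
    """Return the indices of scores lying more than n_sd sample standard
    deviations away from the mean of error_rates (assumed criterion)."""
    m = mean(error_rates)
    sd = stdev(error_rates)
    return [i for i, e in enumerate(error_rates) if abs(e - m) > n_sd * sd]

# Hypothetical error percentages, for illustration only.
example = [20, 22, 25, 18, 24, 21, 23, 19, 26, 60]
print(flag_outliers(example))  # the last child (index 9) is flagged
```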
In the music group, children had practiced music for an average of 4.9 years (4-7 years; 5 children played the piano, 2 the trumpet, 2 the trombone, 2 the violin, and 1 the saxophone). None of the children was bilingual and all children had similar socio-economic backgrounds ranging from middle to low social class as determined from the parents' profession and according to the criteria of the National Institute of Statistics and Economic Studies (MUS: 4.4 and NM: 4.3). The protocol was approved by the local Ethical Review Committee of Aix-Marseille University, and the study was conducted in accordance with local norms and guidelines for the protection of human subjects. All children agreed to participate in the experiment once the procedure had been explained to them. Children were also told that they could stop the experiment at any time if they felt uncomfortable (none did). Finally, at least one parent accompanied each child to the laboratory and signed an informed consent form in accordance with the Declaration of Helsinki before the experiment. Children were given presents at the end of the session to thank them for their participation. The experiment lasted for 2.5 h, which included the fitting of the electrode cap.

Musical Aptitude
Children performed two musicality tests (adapted from the MBEA battery; Peretz et al., 2003) that consisted of judging whether pairs of piano melodies were the same or different, based either on melodic or on rhythmic information.

Visual Stimuli
For the learning phase, six pictures representing familiar objects (i.e., bear, flower, key, chair, bell, eye) were selected from the standardized set of 260 pictures (matched for name and image agreement, familiarity and visual complexity) built by Snodgrass and Vanderwart (1980). The same pictures as in the learning phase were presented in the matching task. For the semantic task, 36 new pictures that the children had not seen before in the experiment and that were semantically related or unrelated to the meaning of the newly-learned words were chosen from the internet by two of the authors (JC and MB). Semantic relatedness between new and old pictures (that is, those presented in the semantic task and those previously presented during the word learning phase) was confirmed by results of pre-tests with pilot children.

Experimental Tasks
Children were tested individually in a quiet experimental room while they sat in a comfortable chair at about 1 m from a computer screen. Auditory stimuli were presented through HiFi headphones (Sennheiser, HD590). Visual and auditory stimuli presentation, as well as the collection of behavioral data, were controlled by the ''Presentation'' software (NeuroBehavioral Systems, Version 11.0). Children performed six concatenated tasks (see Figure 1).

Phonological Categorization Task
Children performed three different phonological categorization tasks that lasted for 2.5 min each (see Figure 1A). All six Thai monosyllabic words were presented in each task using a 2500 ms Stimulus-Onset-Asynchrony (SOA). Children were asked to categorize them based upon different features in each task: (1) voicing contrast (e.g., /ba1/ vs. /pa1/); (2) vowel length (e.g., short: /ba1/ vs. long: /ba:1/); and (3) aspiration contrast (e.g., /pa1/ vs. /pʰa1/). For each task, the contrast was visually represented on the screen (see Figure 1A). Response side and task order were counterbalanced across children. Children were asked to press as quickly and as accurately as possible the left or right hand response button according to whether the auditory words matched the visual representation on the left or right side of the screen. Each monosyllabic word was presented 10 times in a pseudorandomized order with the constraints of no immediate repetition of the same word, and no more than four successive same responses.
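The two ordering constraints (no immediate repetition of the same word, no more than four successive identical responses) can be illustrated with a short sketch. The study used the Presentation software, and its actual randomization routine is not described, so the constructive algorithm below (weighted random choice with restart on dead ends) is an assumption, as are the word labels `w1` to `w6` and the mapping of three words to each response side.

```python
import random

def pseudo_randomize(items, response_of, reps=10, max_run=4, seed=0,
                     max_restarts=1000):
    """Build a trial order with no immediate item repetition and at most
    max_run identical responses in a row; restart when a dead end is hit."""
    rng = random.Random(seed)
    total = reps * len(items)
    for _ in range(max_restarts):
        remaining = {it: reps for it in items}
        seq, run_resp, run_len = [], None, 0
        while len(seq) < total:
            candidates = [
                it for it, n in remaining.items()
                if n > 0
                and (not seq or it != seq[-1])
                and not (response_of[it] == run_resp and run_len >= max_run)
            ]
            if not candidates:
                break  # dead end: discard this attempt and restart
            # Favor items with more remaining copies to avoid late dead ends.
            weights = [remaining[it] for it in candidates]
            choice = rng.choices(candidates, weights=weights, k=1)[0]
            remaining[choice] -= 1
            if response_of[choice] == run_resp:
                run_len += 1
            else:
                run_resp, run_len = response_of[choice], 1
            seq.append(choice)
        else:
            return seq  # loop finished without a dead end
    raise RuntimeError("no valid sequence found")

def longest_response_run(seq, response_of):
    """Length of the longest streak of identical responses in seq."""
    longest = run = 1
    for a, b in zip(seq, seq[1:]):
        run = run + 1 if response_of[a] == response_of[b] else 1
        longest = max(longest, run)
    return longest

# Hypothetical labels and response mapping (three words per response side).
words = ["w1", "w2", "w3", "w4", "w5", "w6"]
side = {"w1": "left", "w2": "left", "w3": "left",
        "w4": "right", "w5": "right", "w6": "right"}
order = pseudo_randomize(words, side)
```

With 6 words presented 10 times each at a 2500 ms SOA, such a sequence has 60 trials and lasts 150 s, which matches the reported 2.5 min per task.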

Word Learning Phase 1
Children were asked to learn the meaning of each word previously presented in the phonological categorization task through picture-word associations. No behavioral response was required, but children were asked to remember the words for subsequent tests. The picture was presented first, and then followed after 800 ms by one of the six words. For instance, a drawing of an eye was followed by the auditory presentation of the word /pa1/ and thus /pa1/ was the word for eye in our ''foreign'' language (see Figure 1B). Two different lists were built so that across children different pictures were associated with different words. Total trial duration was 2800 ms. Each of the six picture-word pairs was presented 20 times, resulting in 120 trials that were pseudo-randomly presented (i.e., no immediate repetition of the same association) in two blocks of 3 min each. To closely follow the brain dynamics involved in word learning, ERPs in each block were further divided into two sub-blocks for a total of four sub-blocks (i.e., Block 1: trials 1-30; Block 2: trials 31-60; Block 3: trials 61-90 and Block 4: trials 91-120).
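The assignment of trials to the four ERP sub-blocks, and the cumulative exposure time at the end of each, follows directly from the numbers given above (120 trials, 30 trials per sub-block, 2800 ms per trial). The helper names below are illustrative only.

```python
def sub_block(trial_number, trials_per_sub_block=30, n_trials=120):
    """Map a 1-based trial number (1-120) to its sub-block (1-4)."""
    if not 1 <= trial_number <= n_trials:
        raise ValueError("trial_number out of range")
    return (trial_number - 1) // trials_per_sub_block + 1

def elapsed_seconds(trial_number, trial_ms=2800):
    """Time elapsed at the end of a given trial, in seconds."""
    return trial_number * trial_ms / 1000

for last_trial in (30, 60, 90, 120):
    print(sub_block(last_trial), elapsed_seconds(last_trial))
```

The end of the second sub-block (trial 60) therefore falls at 168 s, that is, a little under 3 min of learning and, on average, 10 presentations of each of the six picture-word pairs.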

Word Learning Phase 2
To consolidate learning, children performed a task in which two different pictures were simultaneously presented on the left and right sides of the screen, followed after 500 ms by one of the six words (see Figure 1C). Children were asked to press the left response key if the word matched the picture on the left side of the screen or the right key if the word matched the right-side picture (half of the stimuli in each condition). Visual feedback regarding response correctness was given, followed by the presentation of the correct picture-word pair to strengthen the association. Total trial duration was 6000 ms. Each of the six picture-word pairs was presented 10 times, resulting in 60 trials that were pseudo-randomly presented (i.e., no immediate repetition of the same association and no more than four successive same responses), within two blocks of 3 min each. Behavioral data were analyzed but not ERPs because the procedure was complex and comprised too many events.

Matching Task
One of the six pictures was presented, followed after a 750 ms delay by an auditory word that matched or mismatched the associations previously learned. For instance, while the drawing of an eye followed by /pa1/ was a match, the drawing of a flower followed by /pa1/ was a mismatch (see Figure 1D). Children gave their responses by pressing one out of two response keys as quickly and as accurately as possible. Some examples were given before starting the task. Response hand was counterbalanced across children. At the end of the trial, a row of X's appeared on the screen. Children were asked to blink during this time period (1500 ms; total trial duration: 4750 ms) in order to minimize eye movement artifacts during word presentation. Each word was presented 20 times, half in the match and half in the mismatch conditions. The total of 120 trials was pseudorandomly presented (i.e., no immediate repetition of the same association and no more than four successive same responses) within two blocks of 5 min each.

Semantic Task
One of the new pictures was presented, followed after 1500 ms by a semantically related or unrelated word. For instance, while the picture of glasses was semantically related to the previously learned word /pa1/ (i.e., ''eye''), the picture of a watering can was semantically unrelated to /pa1/ (see Figure 1E). Children were asked to decide as quickly and as accurately as possible if the auditory word was semantically related to the new picture. Responses were given by pressing one of two response keys. Response hand was counter-balanced across participants and some examples were given before starting the task. At the end of the trial a row of X's appeared on the screen, and children were asked to blink during this time period (1500 ms; total trial duration: 7000 ms). Each word was presented 12 times but none of the new pictures were repeated, so that on each trial the words were always associated with a different related or unrelated picture. Half of the picture-word pairs were semantically related and half were semantically unrelated. A total of 72 trials was presented pseudo-randomly (i.e., no immediate repetition of the same association and no more than four successive same responses) within two blocks of 4.2 min each.
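As a consistency check, the reported block durations can be compared with the product of trial count and trial duration for each task. The sketch below uses only values stated in the task descriptions; the dictionary layout and function name are illustrative.

```python
# name: (number of trials, trial duration in ms, reported total duration in min)
TASKS = {
    "phonological categorization (per task)": (60, 2500, 2.5),
    "word learning phase 1": (120, 2800, 6.0),   # two blocks of 3 min
    "word learning phase 2": (60, 6000, 6.0),    # two blocks of 3 min
    "matching task": (120, 4750, 10.0),          # two blocks of 5 min
    "semantic task": (72, 7000, 8.4),            # two blocks of 4.2 min
}

def duration_min(n_trials, trial_ms):
    """Total task duration in minutes."""
    return n_trials * trial_ms / 60000

for name, (n, ms, reported) in TASKS.items():
    print(f"{name}: {duration_min(n, ms):.1f} min computed, "
          f"{reported} min reported")
```

The computed values (2.5, 5.6, 6.0, 9.5 and 8.4 min) agree with the reported durations exactly or to within rounding of the block lengths.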

EEG Data Acquisition
The Electroencephalogram (EEG) was continuously recorded at a sampling rate of 512 Hz with a band-pass filter of 0-102.4 Hz by using a Biosemi amplifier system (Amsterdam, BioSemi Active 2) with 32 active Ag-Cl electrodes (Biosemi Pintype) located at standard positions according to the International 10/20 System (Jasper, 1958). EEG recordings were referenced on-line to a common electrode (CMS) included in the headcap (next to Cz). Two additional electrodes were placed on the left and right mastoids and data were re-referenced off-line to the average activity of the left and right mastoids, then filtered with a bandpass filter from 0.1-40 Hz (slope of 12 dB/oct). The electrooculogram (EOG) was recorded from flat-type active electrodes placed 1 cm to the left and right of the external canthi, and from an electrode beneath the right eye. Electrode impedance was kept below 5 kΩ. EEG data were analyzed using the Brain Vision Analyzer software (Version 1.05.0005; Brain Products, GmbH). Independent component analysis (ICA) and inverse ICA were used to identify and remove components associated with vertical and horizontal ocular movements. Finally, baseline correction, DC-detrend and removal of artifacts above a gradient criterion of 10 µV/ms or a max-min criterion of 100 µV over the entire epoch were applied, resulting in an average of 12% of rejected trials. For each child, ERPs were time-locked to word onset, and segmented (including a 200 ms baseline) into 1200 ms epochs in the phonological categorization tasks and into 1700 ms epochs in the other tasks (i.e., word learning phase 1, matching and semantic tasks). To increase the signal-to-noise ratio, all responses were included when computing the individual averages.
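The off-line steps described above (mastoid re-referencing, baseline correction and the two rejection criteria) can be sketched with NumPy. This is an illustrative reimplementation under assumptions, not the Brain Vision Analyzer procedure: the function names are hypothetical, epochs are assumed to be arrays of shape (channels, samples) in µV that start 200 ms before word onset, and the gradient is taken as the absolute difference between successive samples divided by the sample interval.

```python
import numpy as np

FS = 512  # sampling rate in Hz, as reported

def rereference_to_mastoids(eeg, left_mastoid, right_mastoid):
    """Subtract the mean of the two mastoid signals from every channel.
    eeg: (n_channels, n_samples); mastoids: (n_samples,)."""
    return eeg - (left_mastoid + right_mastoid) / 2.0

def baseline_correct(epoch, baseline_ms=200, fs=FS):
    """Subtract each channel's mean over the pre-stimulus baseline."""
    n_base = int(round(baseline_ms * fs / 1000))
    return epoch - epoch[:, :n_base].mean(axis=1, keepdims=True)

def is_artifact(epoch, fs=FS, max_gradient_uv_per_ms=10.0, max_range_uv=100.0):
    """True if any channel exceeds the gradient criterion (µV/ms) or the
    max-min criterion (µV) anywhere in the epoch."""
    ms_per_sample = 1000.0 / fs
    gradient = np.abs(np.diff(epoch, axis=1)) / ms_per_sample
    peak_to_peak = epoch.max(axis=1) - epoch.min(axis=1)
    return bool((gradient > max_gradient_uv_per_ms).any()
                or (peak_to_peak > max_range_uv).any())

# Synthetic 1700 ms epoch (870 samples), 9 channels, slow 10 µV sine wave.
clean = np.tile(10 * np.sin(np.linspace(0, 2 * np.pi, 870)), (9, 1))
step = clean.copy()
step[0, 400:] += 150.0   # large DC step: violates both criteria
spike = clean.copy()
spike[0, 400] += 30.0    # single-sample spike: violates the gradient criterion
print(is_artifact(clean), is_artifact(step), is_artifact(spike))
```

At 512 Hz one sample spans about 1.95 ms, so the 10 µV/ms gradient criterion corresponds to a jump of roughly 19.5 µV between successive samples.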

Statistical Analysis
Analyses of Variance (ANOVAs) were computed using the Statistica software (Version 12.0, StatSoft Inc., Tulsa). For errors (ERRs) and RTs in each task, ANOVAs included Group (MUS vs. NM) as a between-subject factor. Additional within-subject factors were Task (voicing vs. vowel length vs. aspiration) in the phonological categorization task and Condition (match vs. mismatch or related vs. unrelated, respectively) in the matching and semantic tasks.
ERPs in the phonological categorization task were analyzed by computing N100 maximum amplitudes in the 90-160 ms latency band. During the word learning phase 1, as well as in the matching and semantic tasks, ERPs were analyzed by computing the mean amplitudes of the N200 and N400 components in specific latency bands defined from visual inspection of the traces and from previous results in the literature. ANOVAs included Group (MUS vs. NM) as a between-subject factor, and Laterality (left: F3, C3, P3; midline: Fz, Cz, Pz; right: F4, C4, P4) and Anterior/Posterior (frontal: F3, Fz, F4; central: C3, Cz, C4; parietal: P3, Pz, P4) as within-subject factors. Additional within-subject factors were Task (voicing vs. vowel length vs. aspiration) in the phonological categorization task, Block (1 vs. 2 vs. 3 vs. 4) in the word learning phase 1, and Condition (match vs. mismatch or related vs. unrelated, respectively) in the matching and semantic tasks. Post hoc Tukey tests (reducing the probability of Type I errors) were used to determine the origin of significant main effects and interactions. Finally, to examine the relationship between musical aptitude (i.e., ERRs in the musicality tasks) and word learning (i.e., ERRs in the semantic task), a linear regression model was fitted, with level of word learning as the dependent variable and level of musical aptitude as the predictor.
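The two amplitude measures mentioned above (peak amplitude in a latency band for the N100, mean amplitude in a latency band for the N200 and N400) can be sketched as follows. The sketch assumes an averaged ERP sampled at 512 Hz whose epoch begins 200 ms before word onset, and it takes the "maximum amplitude" of a negative component to be the most negative value in the window. Function names are hypothetical, and the latency bands for the N200 and N400 are not specified in this section, so they are left as parameters.

```python
import numpy as np

def window_indices(t_start_ms, t_end_ms, fs=512, baseline_ms=200):
    """Convert post-onset latencies (ms) to sample indices in an epoch
    that starts baseline_ms before word onset."""
    start = int(round((t_start_ms + baseline_ms) * fs / 1000))
    end = int(round((t_end_ms + baseline_ms) * fs / 1000))
    return start, end

def mean_amplitude(erp, t_start_ms, t_end_ms, fs=512, baseline_ms=200):
    """Mean voltage in the latency band, computed along the last axis."""
    s, e = window_indices(t_start_ms, t_end_ms, fs, baseline_ms)
    return erp[..., s:e + 1].mean(axis=-1)

def negative_peak(erp, t_start_ms, t_end_ms, fs=512, baseline_ms=200):
    """Most negative voltage in the latency band (assumed peak measure)."""
    s, e = window_indices(t_start_ms, t_end_ms, fs, baseline_ms)
    return erp[..., s:e + 1].min(axis=-1)

# Synthetic 1200 ms epoch with a -5 µV deflection about 100 ms after onset.
erp = np.zeros(614)
erp[154] = -5.0
n100 = negative_peak(erp, 90, 160)
```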

Relationship between Musical Aptitude and Word Learning
A highly significant correlation was found between musical aptitude and word learning (R² = 0.31, F(1,21) = 10.94, p = 0.003), which reflected the fact that children with fewer errors in the musical aptitude task (i.e., musically-trained children) achieved higher levels of word learning (i.e., fewer errors in the semantic task; see Figure 7). As screening measures showed a trend towards group differences in nonverbal intelligence (PM47, p = 0.07), and as age differences, although not significant (p = 0.19), may influence word learning performance, two separate partial correlations were computed, each controlling for one of these variables. The partial correlation between musical aptitude and word learning remained significant when controlling for PM47 (r = 0.51, p = 0.01) or for age (r = 0.51, p = 0.02).
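For readers who wish to reproduce this type of analysis, the sketch below shows a simple linear regression with its R², adjusted R² and F statistic, and a partial correlation obtained by correlating residuals. It is a generic illustration with hypothetical function names, not the Statistica procedure used in the study, and the small data set in the usage example is invented.

```python
import numpy as np

def simple_regression(x, y):
    """Least-squares fit of y on a single predictor x.
    Returns slope, intercept, R², adjusted R² and F with (1, n - 2) df."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = x.size
    slope, intercept = np.polyfit(x, y, 1)
    resid = y - (slope * x + intercept)
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    r2 = 1.0 - ss_res / ss_tot
    return {
        "slope": slope,
        "intercept": intercept,
        "r2": r2,
        "adj_r2": 1.0 - (1.0 - r2) * (n - 1) / (n - 2),
        "F": r2 / (1.0 - r2) * (n - 2),
    }

def partial_corr(x, y, z):
    """Correlation between x and y after regressing z out of both."""
    x, y, z = (np.asarray(v, float) for v in (x, y, z))
    rx = x - np.polyval(np.polyfit(z, x, 1), z)
    ry = y - np.polyval(np.polyfit(z, y, 1), z)
    return float(np.corrcoef(rx, ry)[0, 1])

def r2_from_f(f, df_resid):
    """Recover R² from an F statistic with (1, df_resid) degrees of freedom."""
    return f / (f + df_resid)

# Invented data, for illustration only.
fit = simple_regression([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
r2_reported_f = r2_from_f(10.94, 21)
adj_r2_reported_f = 1 - (1 - r2_reported_f) * 22 / 21
```

Applied to the reported F(1,21) = 10.94, `r2_from_f` gives about 0.34, with an adjusted value of about 0.31. The reported R² = 0.31 may therefore be an adjusted R²; this is an inference from the arithmetic, not something stated in the text.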

DISCUSSION
This series of experiments revealed three main findings. First, ERPs recorded in the word learning phase showed that the temporal dynamics of novel word learning, as reflected by significant modulations of N200 and FN400 amplitudes after only a few minutes of picture-word associative learning, was faster in children with music training than in control children. Second, while all children were able to learn the meaning of new words, music training was associated with more efficient learning of picture-word associations as reflected by both behavioral and electrophysiological data in the matching and semantic tasks. Finally, a fronto-parietal network was involved in word learning with a shift of the distribution of the N200 and N400 components from frontal regions in the learning phase to parietal regions during the test phase (matching and semantic tasks). These findings are discussed below.

Fast Brain Plasticity in the Word Learning Phase
Recording ERPs during the four blocks of the learning phase allowed us to precisely follow the temporal dynamics of the learning process. ERPs averaged across all children clearly showed that large changes in brain activity occurred very rapidly during the acquisition of word meaning (see Figure 4A). As hypothesized based on previous results in adults (Mestres-Missé et al., 2007; Borovsky et al., 2012; Dittinger et al., 2016; François et al., 2017) and in infants (Torkildsen et al., 2006; Friedrich and Friederici, 2008), results showed an increased long-lasting negativity from Block 1 to Block 2 over fronto-central sites, comprising both an N200 and an FN400 component, taken to reflect learning of novel picture-word associations (see Figures 4A,B, topographic maps). It is notable that the N200 component is more clearly visible than in previous experiments, possibly due to auditory rather than visual word presentation. Moreover, the overall amplitude of the N200 and FN400 components is much larger, and the FN400 component is longer-lasting, in children than in adults (Dittinger et al., 2016; see Figure 4). The differences between Block 1 and Block 2 were localized over fronto-central regions. This scalp distribution is very similar to previous results in word segmentation experiments (François et al., 2014) and is compatible with previous findings suggesting that prefrontal and temporal brain regions are associated with the maintenance of novel information in working memory (Hagoort, 2014) and with the acquisition of word meaning (Rodríguez-Fornells et al., 2009). What is most remarkable is that these amplitude modulations were observed after only 3 min of learning novel word meanings (that is, after only 10 repetitions of each picture-word association), thereby showing clear evidence for fast mapping (Carey, 1978) as reflected by fast changes in brain activity.
Importantly, and as previously found in adults (Dittinger et al., 2016), these effects were significant in musically-trained children but not in children without music training. Thus, in line with our hypothesis, these results showed evidence for faster encoding of novel word meaning in musically-trained children. Interestingly, and strikingly similar to previous results in word segmentation experiments (François et al., 2014), FN400 amplitude had already decreased from Block 2 to Block 3 (i.e., after 3-4 min), possibly due to repetition effects (Rugg, 1985) that contribute to learning. Cunillera et al. (2009) interpret their similar findings in light of the time-dependent hypothesis (Raichle et al., 1994; Poldrack et al., 1999), according to which increased activation (as reflected by the FN400) is only found during the initial learning period and quickly decreases when words have been identified, or, as in our experiment, when meaning has been attached to the auditory word-form. Finally, no differences were found between Block 3 and Block 4, possibly because all children had reached a learning threshold.
Turning to the N200 component, and in contrast to what was previously found in adult non-musicians (Dittinger et al., 2016), the differences between Block 1 and Block 2 were not significant in children with no specific music training. Insofar as the N200 reflects categorization processes (Friedrich and Friederici, 2008), it may be that these children had not yet learned to categorize the correct word with the correct picture. Alternatively, and based on recent results by Du et al. (2014) showing enhanced N200 amplitude when Chinese compound words are repeated in priming experiments, it may be that adult non-musicians were more sensitive to the repetition of words in the learning phase than children with no music training.
FIGURE 7 | Linear regression model. Musical aptitude (ERRs in the musicality task) against word learning performance (ERRs in the semantic task). Children with music training are illustrated in red (MUS) and children without music training in black (NM).

Testing Novel Word Learning in the Matching and Semantic Tasks
All children were able to learn the six picture-word associations within a short learning phase (around 12 min total time for both word learning phases 1 and 2) as shown by the low percentage of errors in the active learning phase (<21% in both groups) and in the matching and semantic tasks (between 13% and 38% across groups). Importantly, the level of performance in both tasks and in both groups was above chance level (50%) and far from ceiling or floor effects, thereby showing that the tasks were neither too easy nor too difficult. In line with previous findings in the literature and with findings in adults using a similar design (Dittinger et al., 2016), results in the matching task showed clear matching effects with lower error rates and faster RTs to matching than to mismatching words, thereby showing that all children had learned the picture-word associations presented in the word learning phase. Moreover, this learning effect generalized to new pictures in the semantic task, as revealed by faster RTs to auditory words semantically related to new pictures than to unrelated words (Meyer and Schvaneveldt, 1971; Dittinger et al., 2016). However, the semantic priming effect was not significant on error rates. While surprising, this finding possibly reflects a response bias towards rejection: when children were not certain whether pictures and words were semantically related (e.g., "honey" and "bear"), they tended to respond that they were unrelated. This interpretation is in line with the adult results showing that participants made significantly fewer errors for unrelated than for related words.
Finally, and perhaps most importantly, while the matching (on both errors and RTs) and semantic priming effects (on RTs) were significant in both groups (no Group by Condition interaction), musically-trained children made significantly fewer errors than controls in both the matching and the semantic tasks (main effect of Group), suggesting that they had learned the meaning of novel words more efficiently than controls. Importantly, the effect of musicianship in children was very similar to what was found in adults (Dittinger et al., 2016). In the semantic task, adult musicians outperformed adult non-musicians. Moreover, the level of performance was similar in children and adults (musicians: 23.1% in children and 23.6% in adults; non-musicians: 33.9% in children and 30.5% in adults), thereby showing that the level of task difficulty was similar for children (learning 6 novel words) and adults (learning 9 novel words). In the matching task, adult musicians made fewer errors than adult non-musicians but, in contrast to children, this difference did not reach significance, possibly because the matching task was too easy to reveal a between-group difference in adults.
Comparison of the electrophysiological data in the matching task between children and adults also revealed interesting differences. While children without music training showed a typical N400 effect over central electrodes (N400 larger to mismatch than match words), adults without music training showed a reversed N400 effect over frontal electrodes (see Figure 5 of Dittinger et al., 2016) that we interpreted as showing that they had not yet fully integrated the meaning of novel words into pre-existing semantic networks. Following this interpretation, non-musician children, by showing typical N400 effects, were faster in integrating the meaning of novel words than non-musician adults. However, this speculation needs further support to be convincing since adult non-musicians performed as well as adult musicians in the matching task but children with music training outperformed control children. Finally, as in the word learning phase, the N200 effect was significant in adults but not in children with no music training, again possibly because adults were more sensitive to word repetition (Du et al., 2014) than children. Perhaps more interestingly, children without music training showed an N400 effect without an N200 effect in the matching task, thereby supporting the hypothesis that both components reflect independent processes (e.g., Du et al., 2014; see Hofmann and Jacobs, 2014, for a detailed discussion of this issue).
Turning to the semantic task and in line with the behavioral results showing that children with music training learned the meaning of novel words more efficiently than control children, N200 and N400 amplitudes were significantly larger for unrelated than for related words over parietal regions in children with music training but not in non-musician children (see Figure 6). Again, these results are very similar to previous results in adults (Dittinger et al., 2016) showing significant N200 and N400 semantic priming effects over parietal regions in adult musicians but not in non-musicians. Only the N200 effect was significant in adult non-musicians, again pointing to the independence of these two components (Du et al., 2014). By contrast, it is striking that, similar to adult non-musician results in the matching task, reversed N400 effects (larger N400 to related than to unrelated words) were found in the semantic task, both in children without music training, over left central sites (see Figure 6B), and in non-musician adults, over frontal sites (see Figure 6 of Dittinger et al., 2016). Below we propose an interpretation of these surprising results that showed up in two independent samples.
Results in the word learning literature have shown that the N400 is larger for semantically unrelated than for related words in both lexical decision tasks (Borovsky et al., 2012) and semantic priming experiments (Mestres-Missé et al., 2007). This is taken as evidence that novel words are processed differently based on previously learned associations and that, with training, the meaning of novel words is rapidly integrated into semantic memory networks (Mestres-Missé et al., 2007; Batterink and Neville, 2011; Borovsky et al., 2012). Based on this interpretation, the different N400 effects for children with and without music training in the semantic task suggest that while musically-trained children had already integrated the meaning of the novel words into semantic memory, as reflected by typical N400 effects, this was not yet the case for control children (reversed N400 effects). In other words, while all children were able to retrieve the specific picture-word associations that were stored in episodic memory during the word learning phase, as reflected by typical N400 effects in the matching task, generalization of learning as seen through priming effects from new pictures semantically related to the novel words could possibly take longer for control children than for musically-trained children. In sum, differences between musically-trained and untrained participants (both children and adults, Dittinger et al., 2016) were larger when the task required retrieving general information from semantic memory in the semantic task than retrieving specific picture-word associations in the matching task.
Finally, in contrast to the frontally-distributed N400 component during the early stages of learning discussed above, the N400 effect in the test phase was clearly centroparietally distributed. Thus, when the meaning of words was already learned, as in the matching and semantic tasks (see Figures 5B, 6B), and as in typical N400 experiments with known words (Kutas et al., 1988), the N400 showed a more parietal scalp distribution that possibly reflects access to the meaning of words already stored in semantic memory or the integration of novel word meanings into existing semantic networks (Batterink and Neville, 2011). In sum, by recording ERPs both in the word learning phase and in the matching and semantic tasks from the same participants, we found a clear fronto-parietal shift in N400 scalp distribution with learning (compare Figures 4, 5, 6). Importantly, this shift in N400 distribution from the acquisition to the consolidation of novel word meaning was also found in adults (Batterink and Neville, 2011; Dittinger et al., 2016).

The Cascade and Multi-Dimensional Interpretations
We previously proposed two complementary bottom-up and top-down interpretations to account for the advantage of musician compared to non-musician adults in novel word learning (Dittinger et al., 2016). Following the "cascade" interpretation (bottom-up), increased auditory sensitivity is the driving force behind enhanced word learning in musicians. According to this view, enhanced auditory perception and attention in musicians (Kraus and Chandrasekaran, 2010; Besson et al., 2011; Strait et al., 2015) allow one to build clear and stable phonological representations (Anvari et al., 2002; Corrigall and Trainor, 2011) that are more easily discriminable and consequently easier to associate with specific meanings and to store in semantic memory. Previous reports provided clear evidence that music training improves sensitivity of auditory-related brain regions (Schneider et al., 2002; Elmer et al., 2013; Kühnis et al., 2014) and fosters the ability to focus and maintain attention on auditory stimuli (Magne et al., 2006; Moreno et al., 2009; Tervaniemi et al., 2009; Strait et al., 2010, 2015; Corrigall and Trainor, 2011).
In line with these results, the level of performance in the three phonological categorization tasks (voicing, vowel length and aspiration) was significantly higher in musically-trained children than in controls (see Figure 2A). This supports the hypothesis that music training is associated with clearer and more stable phonological representations. This, in turn, may facilitate the learning of new picture-word associations in the word learning phase. However, independently of music training, the N100 amplitude was largest to the unfamiliar, non-native aspiration contrast (see Figure 3). This result differs from previous ones in adults showing larger N100s to the aspiration contrast only in professional musicians (Dittinger et al., 2016). It may be that the differential sensitivity to familiar and unfamiliar phonetic contrasts decreases from childhood to adulthood and that music training helps to maintain this sensitivity. This interpretation needs to be further tested in future experiments.
Following the multi-dimensional interpretation, music training not only improves auditory sensitivity but also other functions that are relevant for novel word learning. For instance, there is evidence that music training enhances short-term memory (Ho et al., 2003; George and Coch, 2011) and executive functions (Pallesen et al., 2010; Moreno et al., 2011; Rogalsky et al., 2011; Zuk et al., 2014). In line with this interpretation, the present results showed that music training influenced associative learning and semantic integration as reflected by larger modulations of the N200 and N400 components in the matching and semantic tasks (McLaughlin et al., 2004; Perfetti et al., 2005; Mestres-Missé et al., 2007). Moreover, there is also evidence from at least one longitudinal intervention study that 1 year of music training is enough to enhance verbal and performance IQ as compared to drama lessons (Schellenberg, 2004). Consistent with these findings, musically-trained children in our study showed a trend for higher nonverbal IQs (as measured with the PM47, p = 0.07) than controls. It is thus possible that children with music training performed better in the matching and semantic tasks not because music training enhanced auditory perception or different aspects of language processing but because, in general, increased cognitive abilities improved word learning (Banai and Ahissar, 2013; Zatorre, 2013). This is a difficult issue. One could indeed try to match children's level of performance on several cognitive abilities (e.g., working and short-term memory, general intelligence). However, this might result in a selection bias against musically-trained children if superior cognitive abilities were a direct consequence of music training.
Coming back to our results, it is notable that children scoring higher in the musicality tests also performed better in the most difficult semantic task (see Figure 7), and that this correlation remained highly significant when controlling for the influence of nonverbal general intelligence. Thus, while music training is likely to influence high-level cognitive functions that could facilitate word learning, the results found in the present experiment do not seem to be mediated by nonverbal intelligence. In sum, facilitated word learning in children with music training probably results from the strong interplay between improved auditory perception and higher cognitive functions so that the cascade and multi-dimensional interpretations are best considered as complementary.
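For readers unfamiliar with the logic of "controlling for" a third variable, the following is a minimal sketch of a first-order partial correlation, assuming Pearson correlations among three measures. The function names and all numerical values are hypothetical and serve for illustration only; they are not the data of the present study, and this is not necessarily the exact procedure used for the analysis reported above.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """First-order partial correlation between x and y, controlling for z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical scores (NOT data from this study), one value per child.
musicality = [55, 60, 62, 70, 75, 80, 85, 90]         # musicality test score
semantic_accuracy = [60, 58, 66, 70, 72, 79, 80, 88]  # % correct, semantic task
nonverbal_iq = [28, 30, 27, 31, 29, 33, 30, 32]       # nonverbal IQ raw score

r_raw = pearson(musicality, semantic_accuracy)
r_partial = partial_corr(musicality, semantic_accuracy, nonverbal_iq)
print(f"raw r = {r_raw:.2f}, partial r (controlling for IQ) = {r_partial:.2f}")
```

If the partial coefficient stays close to the raw coefficient, the covariate accounts for little of the association, which is the pattern described above for nonverbal intelligence.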

CONCLUSION
Our results showed that all children were able to learn the meaning of novel words and that, similar to previous results found in adults (Batterink and Neville, 2011; Dittinger et al., 2016), word learning was associated with a fronto-parietal shift of the topographical distribution of the N400 and N200 components that developed with learning. Importantly, a few years of music training (4.5 years on average) was found to positively correlate with word learning: musically-trained children performed better than controls in both the matching and the semantic tasks and the electrophysiological markers of word learning (the N200 and N400 effects) were larger in children with than without music training. To our knowledge, this is the first report of a relationship between music training and the semantic aspects of language processing in children. Importantly, these results extend previous findings showing that music training enhanced phonological awareness, reading development, word comprehension and syntactic processing (Anvari et al., 2002; Jentschke and Koelsch, 2009; Corrigall and Trainor, 2011). These results also support the hypothesis that second language learning is facilitated by musical training (Slevc, 2012; Chobert and Besson, 2013; Moreno et al., 2015) and taken together they provide strong evidence for the importance of music classes in primary school.

Limitations and Perspectives
The first limitation of the present study is the small number of children in each group. Although we tested a relatively large group of 32 children with the aim of having 16 participants in each group, several children had to be discarded for technical reasons. Nevertheless, two main arguments support the robustness of our findings. First, even with a small sample size, the effects of main interest were significant (and therefore statistically valid, Friston, 2012) in musically-trained children and not significant in control children, thereby showing clear between-group differences. Second, as discussed above, the main effects found for children in the different experiments described here are remarkably similar to those previously found using a very similar paradigm with musician and non-musician adults (Dittinger et al., 2016). Thus, the correlation between music training and better novel word learning (both in behavior and ERPs) was replicated in two independent samples of participants.
The second limitation is that, while we would like to attribute the reported differences between musically-trained and untrained children to music training, the present experiment does not allow us to rule out the possibility that factors other than music training accounted for the observed between-group differences. The only way to demonstrate the causal role of music training is to conduct a longitudinal study with non-musician children trained with music and to compare results with another group of non-musician children trained with an equally interesting non-musical activity. However, before conducting such longitudinal studies to ascertain the origins of the differences, it is of primary importance to first demonstrate differences between musically-trained and control children in novel word learning, and this was the aim of the present study.
Finally, the series of experiments used in this paradigm allowed us to test for auditory perception of linguistic and non-linguistic sounds (musicality tests), for auditory attention and for associative and semantic memory. Thus, an interesting perspective would also be to use this paradigm as a diagnostic tool to determine which specific computations and cognitive functions are impaired in children with learning difficulties or in patients with degenerative disorders.

AUTHOR CONTRIBUTIONS
MB, JCZ and JC designed and supervised the research; JC collected EEG data and ED analyzed the EEG data; MB and ED wrote the manuscript, and JCZ and JC contributed to the manuscript.