Linguistic Skills in Bilingual Children With Developmental Language Disorders: A Pilot Study

The current pilot study compared the linguistic characteristics of a cohort of simultaneous bilingual children (Italian, L1; German L2) with developmental language disorders (DLDs) and those of bilingual peers with typical language development (TLD). Importantly, the two groups were balanced for a number of environmental variables (e.g., age of first exposure to the L2, acquisition contexts, degree of exposure to both languages) known to affect linguistic development in both TLD and DLDs. The analyses included the assessment of the participants’ phonological short-term memory. Their lexical, grammatical and narrative abilities were analyzed in both languages by administering the Italian and German equivalent forms of the Battery for the assessment of language in children aged 4 to 12 – BVL_4-12 (Marini et al., 2015). The children with DLDs had reduced phonological short-term memory and lexical skills that, in turn, contributed to the reduced levels of local coherence and informativeness of their narratives. Such difficulties were found at similar levels in their two languages. These results suggest that reduced phonological short-term memory and lexical selection skills may reflect a core symptom in both mono- and bilingual children with developmental language disorders.

INTRODUCTION
Recent estimates suggest that approximately 7,000 languages are currently spoken worldwide in barely 196 countries (Simons and Fennig, 2018). This suggests that the vast majority of the world's population is consistently exposed to two or more languages and can be considered bilingual (Baker, 2006; Marian and Shook, 2012; Jonak, 2015). Bilingualism has many faces. For example, if the focus is on the age of acquisition and the child is adequately exposed to his/her languages since birth, then (s)he can be considered a simultaneous bilingual. If (s)he is exposed to a second language after the second/third year, then (s)he is an early or late sequential bilingual depending on whether L2 acquisition begins in early or late childhood/adulthood (De Houwer, 2009; Meisel, 2009; Paradis et al., 2011). According to the level of proficiency in their two languages, bilinguals can be dominant (if they are more fluent in one language than the other) or balanced (if they master the languages with the same proficiency) (e.g., Peal and Lambert, 1962). In this article we will refer to a pragmatic definition of bilingualism, namely the regular use of two (or more) languages independently of other variables such as age of acquisition or proficiency level (Grosjean, 1992; Uljarević et al., 2016).
A particularly delicate issue regards the identification of children who are exposed to different languages and have developmental language disorders (DLDs) (Kay-Raining Bird et al., 2016). The label DLDs has recently been endorsed to refer to children who have language disorders not associated with a known medical etiology, even if they may co-occur with other neurodevelopmental disorders (Snowling et al., 2017). It replaces previous labels such as Specific or Primary Language Impairments. That said, the condition of bilingualism poses serious difficulties to clinicians. Children with typical language development (TLD) exposed to two or more languages may be incorrectly diagnosed with language impairments. Conversely, individuals with DLDs exposed to a bilingual context may not be correctly diagnosed. The former error is often referred to as overdiagnosis or mistaken identity; the latter as underdiagnosis or missed identity (Salameh et al., 2002; Genesee et al., 2004; De Jong, 2008; Armon-Lotem, 2012; Grimm and Schulz, 2014). Both over- and underdiagnosis stem from the observation that children with typical development exposed to two or more languages might experience difficulties (e.g., limited verb accuracy, reduced lexical repertoire, and morphosyntactic impairments; Michael and Gollan, 2005; Bialystok et al., 2010; Vender et al., 2016) that resemble those observed in children with DLDs (e.g., Paradis and Crago, 2000; Håkansson, 2001; Paradis, 2005; Paradis et al., 2008). However, converging evidence suggests that in children with typical development who have been adequately exposed to two languages the milestones of both first and second language development may be similar to those observed in monolingual peers (e.g., Paradis et al., 2011; Marini et al., 2016). Indeed, the exposure to a bilingual context is not a risk factor for lexical development (e.g., Marini et al., 2017).
Differences in the age of first exposure to the L2 (simultaneous, early or late sequential), acquisition contexts (e.g., at home or at school), and degree of exposure to both L1 and L2 might significantly affect linguistic development (Schwartz, 2004; Meisel, 2007; Paradis et al., 2010; Armon-Lotem, 2012; Cattani et al., 2014). A further complication relates to the additional effects potentially exerted by other cognitive functions. For example, increasing evidence suggests that phonological short-term and working memory play a major role in language learning and processing (e.g., Kormos and Sáfár, 2008; Engel de Abreu and Gathercole, 2012; Verhagen and Leseman, 2016; Riva et al., 2017) and significantly affect children's performance on lexical and grammatical production tasks (Marini et al., 2014). This supports the need to consider both environmental and cognitive factors when dealing with children exposed to more than one language. Only such a comprehensive account of the child's bilingual experience and cognitive profile may allow clinicians to correctly interpret their performance on standardized linguistic testing. Crucially, such preliminary information is necessary but not sufficient. An accurate assessment of a bilingual's two languages requires the administration of equivalent tests in the child's two languages with norm-referenced measures for each language across the different levels of linguistic processing (Kohnert, 2010; Jonak, 2015). The adaptation process should consider the specific characteristics of the languages involved (e.g., phonology, lexical frequencies, derivational and/or inflectional morphology, etc.) in order to ensure an accurate comparison of the linguistic competence of the children across their two languages.
The majority of the studies focusing on bilinguals with DLDs have not included all the information needed to correctly evaluate the participants' linguistic performance. Nonetheless, the available evidence highlights that the exposure to a bilingual context does not complicate the process of linguistic acquisition in children with DLDs (Kohnert, 2010). Simultaneous and early sequential bilinguals with DLDs tend to have similar patterns of impairment as monolinguals with DLDs (Håkansson et al., 2003; Paradis et al., 2003, 2006; Rothweiler et al., 2012). For example, in Paradis et al. (2003) three groups of 7-year-old children with DLDs (monolingual English- and French-speaking and simultaneous bilinguals speaking both French and English) produced similar morphosyntactic errors on a linguistic production task. Namely, all of them produced more errors (and had similar accuracy levels) when using morphemes carrying information about verbal tense than when using the other inflectional morphemes. Similarly, Rothweiler et al. (2012) investigated verbal morphology in German in 6 monolingual German-speaking children with DLDs, 6 early sequential bilinguals with DLDs (Turkish L1 and German L2), and 6 early sequential bilinguals with TLD (Turkish L1 and German L2). The bilinguals had been first exposed to their L2 between 2;09 and 4;04 years. Impairments in subject-verb agreement in German were similar for bilingual and monolingual children with DLDs, suggesting that such morphosyntactic difficulties might reflect a clinical marker of DLDs in German for both mono- and bilinguals. This finding was also supported by a subsequent study by Clahsen et al. (2014), where the same 18 participants as in Rothweiler et al. (2012) did not differ in another morphosyntactic feature of the German language, i.e., past participle inflection.
Even if they did not control for the potential impact of phonological short-term and working memory on the morphological difficulties of these children, studies such as those by Rothweiler et al. (2012) are particularly interesting as they include a comparison between bilinguals with and without DLDs (see also Gutiérrez-Clellen and Simon-Cereijido, 2007). Indeed, as even children with typical development might experience difficulties in their L2s, comparing their L2 linguistic (i.e., lexical and grammatical) performance with that of bilingual children with DLDs may be misleading from a clinical point of view suggesting the presence of a potential impairment in typically developing L2 learners (e.g., Windsor and Kohnert, 2004).
Bilingual children with DLDs have difficulties in both of their languages (Restrepo and Kruth, 2000; Peña et al., 2001; Cleave et al., 2010; Kohnert, 2010; but see Jacobson and Livert, 2010) and learn them at a slower pace than bilingual peers with typical development (Håkansson et al., 2003). Over the past 10 years, some studies have begun to investigate the linguistic performance of bilingual children with DLDs also by using procedures of narrative assessment (Cleave et al., 2010; Squires et al., 2014; Rezzonico et al., 2015). These have proved particularly informative with respect to traditional standardized testing. For example, Cleave et al. (2010) compared the narrative skills in English of 14 monolinguals and 12 bilinguals with DLDs (who had English as L1 and different languages as L2). Based on parent reports, the bilinguals were exposed to their L2 in the home for at least 25% and spoke that language at least 10% of their daily time. Unfortunately, the two groups were not balanced for socio-economic status and monolinguals had parents with higher levels of education. For this reason, the authors used maternal education level as a covariate in their analyses. On two tasks assessing expressive morphosyntax [i.e., the word structure subtest of the Clinical Evaluation of Language Fundamentals-Preschool 2 (CELF-P2; Wiig et al., 2004) and the Structured Photographic Expressive Language Test-Preschool 2 (SPELT-P2; Dawson et al., 2005)] monolingual children with DLDs outperformed bilinguals. However, when asked to perform a narrative story production task and a story retelling task, the two groups of children were no longer different in any narrative or linguistic measure. Most importantly, they had similar difficulties also on morphological and grammatical measures. Unfortunately, in this study the children's performance on the two narrative production tasks was presented as a composite score. This does not allow readers to discern whether one of these two tasks is more informative than the other, nor their respective impact on the composite indexes used for the analyses. This is a significant gap in the literature, as story retelling and story generation tasks place different cognitive and linguistic demands on children. The former rely heavily on memory, whereas the latter rely on executive (i.e., phonological short-term and working memory, inhibition, monitoring and planning) and discourse selection and organization skills. In this regard, a previous study compared the linguistic skills of nine bilingual children with DLDs aged between 6 and 13 years and exposed to Friulian L1 and Italian L2. The assessment focused on lexical comprehension, sentence comprehension, sentence repetition, semantic fluency, and the ability to produce a story on a picture-description task. In this study almost all participants experienced similar difficulties across the two languages. Most interestingly, the picture descriptions had similar levels of productivity (in terms of words and mean length of utterance), contained similar amounts of different words (i.e., types) and similar percentages of phonological, semantic, and morphological errors across the two languages.
In summary, the available evidence suggests that in order to adequately describe the linguistic profile of bilingual children with DLDs it is necessary to gather comprehensive information about their bilingual history and cognitive profile. Furthermore, their linguistic skills should be analyzed by using equivalent tests with the inclusion of narrative production tasks. This pilot study was designed to identify the linguistic characteristics of a cohort of 11 simultaneous and early bilingual children (Italian, L1; German L2) with DLDs in each of their languages by using equivalent forms of the same battery of tests, which also includes a narrative generation task (the Battery for the assessment of language in children aged 4 to 12, BVL_4-12; Marini et al., 2015), and comparing their performance with that of bilingual children with TLD. The two groups were matched for a number of environmental variables (i.e., language exposure, socio-economic status of their families, level of reading in families, etc.) that are known to exert a significant impact on language development. As children with DLDs usually have difficulties in phonological short-term and working memory (e.g., Vugs et al., 2014) that might affect their linguistic performance (Marini et al., 2014), we also included in the assessment a task aimed at controlling for this potentially confounding variable. We hypothesized that (1) the participants with typical development would outperform the bilinguals with DLDs on phonological short-term memory; (2) phonological short-term memory would be significantly correlated with measures of lexical and grammatical production (Marini et al., 2014); (3) after controlling for the effect of phonological short-term memory, children with typical development would have higher scores on tasks assessing lexical and grammatical skills; and (4) children with DLDs would have similar skills in both languages.

Participants
Twenty-two Italian-German bilingual children were included in this study. Eleven participants had previously received a diagnosis of DLDs using standardized testing in both languages in rehabilitation centers in Bolzano's area. The remaining eleven children had typical cognitive and language development (TLD). All participants came from families living in Bolzano's area, a bilingual region in north-eastern Italy where both Italian and a variety of German (i.e., the Austrian-Bavarian dialect) are currently spoken. Children with TLD were recruited in mainstream schools, whereas children with DLDs were recruited in rehabilitation centers. Parents provided family demographic information and children's daily language usage and exposure at home through a questionnaire. Importantly, all the participants had been consistently exposed also to Standard German at school since the age of 3 years. For all children both parents released their written and informed consent to the participation of their children to the study and to data processing. The study was approved by the Ethical Committee of the Hospital of Bolzano.
As shown in Table 1, no group-related differences were found for chronological age [t(20) ...].
[Table 1 residue, layout not recoverable: questionnaire items "The children are more exposed to" (Both: 2, 18, 1, 9), "The mother talks to her child in", "The child answers to his/her mother in".]
Finally, the two groups did not differ on the Raven's Progressive Matrices (Raven, 1938) [t(20) = 1.586; p = 0.129] but differed on a task of phonological short-term memory, i.e., the forward digit recall subtest of the Wechsler Scales (Wechsler, 1993) [t(20) = 3.210; p < 0.004] (see Table 1).

Procedures of Linguistic Assessment
The participants' linguistic skills in the two languages were assessed by administering the Italian and the German versions of the Batteria per la Valutazione del Linguaggio in bambini dai 4 ai 12 anni (Battery for the assessment of language in children aged 4 to 12, BVL_4-12; Marini et al., 2015). The BVL_4-12 has been designed to assess production, comprehension, and repetition abilities in children aged 4 through 12 years. The Italian version of the BVL_4-12 has recently been standardized. The Battery is currently under adaptation and standardization in Slovenian, Spanish, Russian, and German. For the adaptation to German, particular care was paid to the selection of language-specific items for each task (i.e., by considering lexical frequencies as well as the grammatical characteristics of German). The order of language administration (Italian vs. German) was randomized across subjects. We will report the performance of these children only on tasks assessing lexical, grammatical and discourse skills.

Assessment of Lexical Skills
The participants' lexical abilities were assessed by administering the naming, discourse production, and lexical comprehension subtests of the BVL_4-12. Their ability to select and produce words in German and Italian was explored by administering a naming and a discourse production task. In the former, children were asked to name a series of black and white pictures (67 for Italian and 67 for German). For each language, the images elicited a maximum of 67 correct answers that had been controlled for frequency of use and semantic category. In order to further control for the adequacy of this adapted task, the participants were also administered a naming task that has already been standardized in German, i.e., the naming task of the Neuropsychologisches Screening für 5-11-jährige Kinder (Kaufmann et al., 2008). Additional information about the participants' lexical production skills was obtained by administering a discourse production task in which children were asked to describe the story portrayed in one vignette made of six pictures provided in the correct order on the same page (the "Nest Story" by Paradis, 1987). Each narrative sample was recorded and transcribed. The transcriptions included potential phonological fillers, pauses, false starts, phonetic/phonological errors, neologisms, and extraneous utterances. Two independent raters analyzed the transcripts following the multilevel analysis of discourse production outlined in Marini et al. (2011). Acceptable inter-rater reliability was set at Cohen's k ≥ 0.80. Lexical production skills in each language were assessed by calculating a percentage of semantic paraphasias, i.e., words that replaced semantically related target words (e.g., "flower" in the sentence "They are looking at the flower," where the speaker meant "tree"). The percentage of semantic paraphasias was calculated by using the following formula: [(semantic paraphasias/words) * 100].
The participants' lexical comprehension skills were assessed by administering the Italian and German versions of the lexical comprehension task of the BVL_4-12. They were asked to identify which, among four pictures, best represented the meaning of a series of 41 target words uttered by the examiner. The target words were selected according to their frequency of occurrence (i.e., high, medium, and low) in German and Italian, respectively. One of the pictures represented the target word (e.g., "cat"), whereas the remaining three pictures portrayed a semantic (e.g., "dog"), a phonological (e.g., "car"), and an unrelated distractor (e.g., "table"). Children received 1 point for each correct answer. The maximum score for lexical comprehension was 41 for each language.
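The inter-rater reliability criterion stated above (Cohen's k ≥ 0.80) can be illustrated with a minimal sketch. The function name `cohens_kappa`, the category labels, and the two rating lists are hypothetical and are not taken from the study; the code only shows how agreement between two raters coding the same transcript items could be checked against the threshold.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labelled the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Proportion of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical codes assigned by two raters to ten words of one transcript.
rater_1 = ["ok", "ok", "paraphasia", "ok", "paraphasia",
           "ok", "ok", "ok", "paraphasia", "ok"]
rater_2 = ["ok", "ok", "paraphasia", "ok", "ok",
           "ok", "ok", "ok", "paraphasia", "ok"]

kappa = cohens_kappa(rater_1, rater_2)
meets_threshold = kappa >= 0.80  # criterion reported in the text
```

In this illustrative sample the raters disagree on one item out of ten, which is already enough to fall below the 0.80 criterion because the chance-expected agreement is high when one label dominates.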

Assessment of Grammatical Skills
The participants' grammatical comprehension abilities were investigated with the German and Italian versions of the grammatical comprehension task of the BVL_4-12. This evaluates the ability of the children to understand sentences of varying syntactic complexity. After hearing each stimulus sentence (e.g., "the girl pushes the boy"), children were shown a sheet with four pictures: one represented the meaning of the target sentence, whereas the remaining three pictures represented grammatical distractors (e.g., "the girl pushes the boys," "the girls push the boy," or "the girls push the boys," respectively). For each correct answer, the children received 1 point, with a maximum score of 40 for each language. In order to further control for the adequacy of this adapted task, the participants were also administered a grammatical comprehension task that has already been standardized in German, i.e., the one in the Neuropsychologisches Screening für 5-11-jährige Kinder (Kaufmann et al., 2008).
The participants' grammatical production skills were determined by calculating a percentage of paragrammatic errors and one of complete sentences produced during the generation of the Nest Story (Marini et al., 2011). Paragrammatic errors are scored whenever a child replaces a correct function word or bound morpheme with a wrong one. The percentage of paragrammatic errors was calculated by dividing the number of such errors by the number of utterances {i.e., [(Paragrammatic errors/utterances) * 100]}. The percentage of Complete Sentences was derived by dividing the number of grammatically complete sentences by the utterances {i.e., [(complete sentences/utterances) * 100]}. Utterances were identified according to the criteria outlined in Marini et al. (2011). A sentence was considered complete if it contained all the arguments necessarily required by the verb and no omissions or substitutions of free or bound morphemes.

Assessment of Discourse Skills
The macrolinguistic analysis included two measures of discourse organization (i.e., percentages of errors of local and global coherence) and one of informative content (percentage of lexical informativeness). The % of Local Coherence Errors measured the extent to which each utterance was conceptually related to the preceding one and included both missing referents and topic shifts. Here the notion of topic shift is used in its general meaning. Consequently, topic shifts were scored whenever an utterance was interrupted and the following one introduced a new topic without completing the one left suspended, as in the following example: "/ They are going to ... / the man is on the ground /." A missing referent was identified whenever the referents in a sentence were ambiguous or incorrect. The % of Local Coherence Errors was calculated as follows: [(local coherence errors/utterances) * 100].
Global coherence errors include the production of utterances that: (1) repeat previously introduced concepts (repetitive utterances, as the second utterance in the following example: /the woman asks him to take the bird / she wants him to take it /); (2) do not provide any additional information (filler utterances, as the last two utterances in the following example: /they are calling him / what else? / let me think /); (3) derail from the flow of discourse (tangential utterances, as in /They are under a tree / This tree looks like the one I have in my garden / In spring it will be full of flowers /); (4) include ideas that are conceptually incongruent with the stimulus (conceptually incongruent utterances, as in /He is going up there / a neighbor calls him / [in the story there is no neighbor calling him]). The percentage of global coherence errors was calculated as follows: [(global coherence errors/utterances) * 100].
The informative content of each storytelling was calculated in terms of a percentage of Lexical Information Units (LIUs) to words [(lexical information units/words) * 100]. LIUs included those words that were appropriately used in the text. Those words that had been scored as errors of any kind (including words embedded in filler, repeated, incongruent or tangential utterances) were excluded from the count of LIUs.
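The percentage formulas given for the lexical, grammatical, and discourse measures can be gathered in one short sketch. The class `TranscriptCounts`, the function names, and the counts in the example are hypothetical placeholders chosen for illustration; only the ratios themselves (errors or units divided by words or utterances, times 100) come from the text.

```python
from dataclasses import dataclass


@dataclass
class TranscriptCounts:
    """Raw counts from one narrative sample (hypothetical container)."""
    words: int
    utterances: int
    semantic_paraphasias: int
    paragrammatic_errors: int
    complete_sentences: int
    local_coherence_errors: int
    global_coherence_errors: int
    lexical_information_units: int


def percentage(numerator, denominator):
    """Return numerator/denominator expressed as a percentage."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return 100.0 * numerator / denominator


def narrative_measures(c):
    """Apply the formulas reported in the text to one transcript."""
    return {
        # Word-based measures
        "pct_semantic_paraphasias": percentage(c.semantic_paraphasias, c.words),
        "pct_lexical_informativeness": percentage(c.lexical_information_units, c.words),
        # Utterance-based measures
        "pct_paragrammatic_errors": percentage(c.paragrammatic_errors, c.utterances),
        "pct_complete_sentences": percentage(c.complete_sentences, c.utterances),
        "pct_local_coherence_errors": percentage(c.local_coherence_errors, c.utterances),
        "pct_global_coherence_errors": percentage(c.global_coherence_errors, c.utterances),
    }


# Invented counts for a single story; not data from the study.
sample = TranscriptCounts(
    words=80, utterances=10, semantic_paraphasias=2, paragrammatic_errors=1,
    complete_sentences=6, local_coherence_errors=2, global_coherence_errors=1,
    lexical_information_units=60,
)
measures = narrative_measures(sample)
```

Keeping the denominators explicit makes it easy to see that lexical measures are normalized by words while grammatical and coherence measures are normalized by utterances.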

Analysis of Lexical Skills
A group-related difference was found on the Forward Digit Span task. For this reason, the presence of differences on lexical measures (i.e., naming, % semantic errors, and lexical comprehension) was investigated considering the potentially confounding role of phonological short-term memory (see Table 2). The relationship between performance on the Forward Digit Span task and the measures tapping lexical skills was investigated using the Pearson product-moment correlation coefficient on the whole sample of participants after summing the scores obtained in the two languages. A significant positive correlation was found only between forward digit span and naming (r = 0.640; p < 0.001). For this reason, the group-related difference on naming was analyzed by performing one repeated-measures ANCOVA with group (i.e., DLDs vs. TLD) as between-subject factor, language (Italian vs. German) as within-subject factor, the participants' performance on the Forward Digit Span task as covariate, and the naming scores as dependent variable. For the remaining two variables no covariation was needed, and two repeated-measures ANOVAs were performed with group (i.e., DLDs vs. TLD) as between-subject factor, language (Italian vs. German) as within-subject factor, and the % semantic errors and lexical comprehension as dependent variables. The alpha level was set at p < 0.017 (0.05/3 dependent variables) after Bonferroni correction for multiple comparisons. Children with DLDs performed worse than controls on the naming tasks [F(1,19) = 10.818; p < 0.004; partial η² = 0.363]. Furthermore, they produced more semantic errors [F(1,20) = 7.315; p < 0.014; partial η² = 0.268] and understood fewer words on the lexical comprehension subtest of the BVL_4-12 [F(1,20) = 14.985; p < 0.001; partial η² = 0.428]. Importantly, no within-subject effects were found for any of these three measures (p = 0.529, p = 0.210, and p = 0.056, respectively).
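The Bonferroni-corrected threshold used above can be checked with a few lines of code. This is only an arithmetic sketch: the helper name `bonferroni_alpha` is hypothetical, and the dictionary holds the upper bounds of the group-effect p-values as reported in the text (each given in the form "p < value"), not exact p-values.

```python
def bonferroni_alpha(family_alpha, n_tests):
    """Per-test alpha after Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return family_alpha / n_tests


# 0.05 divided by the three lexical dependent variables.
alpha = bonferroni_alpha(0.05, 3)

# Upper bounds of the reported group effects ("p < bound" in the text).
reported_p_bounds = {
    "naming": 0.004,
    "semantic errors": 0.014,
    "lexical comprehension": 0.001,
}

# If the reported bound is below alpha, the exact p-value is too.
significant = {name: bound < alpha for name, bound in reported_p_bounds.items()}
```

With three dependent variables the corrected threshold is roughly 0.0167, which the text rounds to 0.017; all three reported bounds fall below it.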

Analysis of Grammatical Skills
Among the measures assessing grammatical skills (i.e., % paragrammatic errors, % complete sentences and grammatical comprehension; see Table 3), only the % of complete sentences correlated significantly with the Forward Digit Span task (r = 0.542; p < 0.009). For this reason, the group-related difference on this variable was analyzed by performing one repeated-measures ANCOVA with group (i.e., DLDs vs. TLD) as between-subject factor, language (Italian vs. German) as within-subject factor, and the participants' performance on the Forward Digit Span task as covariate. For the remaining two variables no covariation was needed, and two repeated-measures ANOVAs were performed with group (i.e., DLDs vs. TLD) as between-subject factor, language (Italian vs. German) as within-subject factor, and the % of paragrammatic errors and the score on the grammatical comprehension task as dependent variables. Alpha level was set at p < 0.017 (0.05/3 dependent variables) after Bonferroni correction. No group-related difference was found in the % of paragrammatic errors [F(1,20) = 1.671; p = 0.211; partial η² = 0.077], with a significant within-subject effect (p < 0.005) which, however, did not determine a significant Language * Group interaction (p = 0.839). No group-related differences were found for the % of complete sentences [F(1,20) = 3.131; p = 0.092; partial η² = 0.135], with no significant within-subject effect (p = 0.427). Finally, no group-related difference was found for grammatical comprehension [F(1,19) = 0.789; p = 0.385; partial η² = 0.040], again with no significant within-subject effect (p = 0.484).
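The effect sizes reported above can be related to the F statistics through the standard identity partial η² = (F × df_effect) / (F × df_effect + df_error). The sketch below applies that identity to the F values and degrees of freedom as printed in the text; the function name `partial_eta_squared` is a hypothetical helper, and the identity is assumed here rather than stated by the authors.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Recover partial eta squared from an F statistic and its dfs."""
    if f_value < 0 or df_effect <= 0 or df_error <= 0:
        raise ValueError("F must be non-negative and dfs positive")
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (F, df_effect, df_error) for the group effects as reported in the text.
reported_tests = {
    "paragrammatic errors": (1.671, 1, 20),
    "complete sentences": (3.131, 1, 20),
    "grammatical comprehension": (0.789, 1, 19),
}

effect_sizes = {
    name: round(partial_eta_squared(f, df1, df2), 3)
    for name, (f, df1, df2) in reported_tests.items()
}
```

A check of this kind is a quick way to confirm that a reported effect size is consistent with its F statistic and degrees of freedom.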

Analysis of Discourse Production Skills
As no significant correlation was found between the scores at the forward digit span and discourse production (% errors of local coherence, % errors of global coherence, % lexical informativeness), group-related differences in these measures were investigated with three repeated measures' ANOVAs with group (i.e., DLDs vs. TLD) as between-subject factor, language (Italian vs. German) as within-subject factor and the three narrative measures as dependent variables (see Table 4). Alpha level was set at p < 0.017 (0.05/3 dependent variables) after Bonferroni correction.

Further Inspection of the Reliability of the Adapted Tasks
In order to control for the reliability of the adapted versions of the naming and the grammatical comprehension subtests of the BVL_4-12, the participants also received two tasks assessing the same skills that have already been standardized in German. These are the naming and the grammatical comprehension subtests of the Neuropsychologisches Screening für 5-11-jährige Kinder (Kaufmann et al., 2008). The relationship between performance on the Naming and Grammatical comprehension tasks in German of the BVL_4-12 and of the Neuropsychologisches Screening für 5-11-jährige Kinder was investigated using the Pearson product-moment correlation coefficient. This analysis showed significant robust correlations between the two naming (r = 0.810; p < 0.001) and grammatical comprehension tasks (r = 0.758; p < 0.001).
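The comparison between an adapted subtest and an already standardized one rests on the Pearson product-moment correlation. A minimal sketch of that computation follows; the function name `pearson_r` and both score lists are hypothetical and do not reproduce the study's data.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant scores")
    return sxy / math.sqrt(sxx * syy)


# Invented naming scores for six children on the two instruments.
adapted_task_scores = [40, 45, 50, 55, 60, 38]
standardized_task_scores = [20, 24, 25, 29, 31, 18]

r = pearson_r(adapted_task_scores, standardized_task_scores)
```

A high positive r on real data would indicate that children rank similarly on the two instruments, which is the sense in which the text treats the correlation as support for the adapted tasks.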

DISCUSSION
In the current pilot study we investigated language skills in a cohort of simultaneous bilingual Italian-German speaking children with and without developmental language impairments. The participants were balanced for a number of environmental variables that are known to excert a major influence on bilingual development. Their linguistic performance (in terms of lexical, grammatical, and narrative abilities) was assessed by administering a selection of the subtests of the BVL 4-12 (Marini et al., 2015) and its adapted version to German. Overall, the results showed that: the participants with DLDs had reduced phonological short-term memory (confirmation of Hypothesis 1); that phonological memory correlated with Naming and % of Complete Sentences (confirmation of Hypothesis 2); that children with DLDs had reduced lexical skills that, in turn, likely contributed to the reduced levels of local coherence and informativeness of their narratives (partial confirmation of Hypothesis 3); and that these difficulties were found at similar levels in their two languages (in line with previous findings by Restrepo and Kruth, 2000;Peña et al., 2001;Cleave et al., 2010;Kohnert, 2010;. The only exception was the production of paragrammatic errors that were language dependent in both groups (partial confirmation of Hypothesis 4). A final consideration concerns the significant positive correlations observed for the Naming and Grammatical comprehension tasks in German of the BVL_4-12 and the same tasks of the Neuropsychologisches Screening für 5-11-jährige Kinder, suggesting the reliability of the adapted versions of these two tasks. As a first point, this study confirms previous investigations reporting phonological short-term memory difficulties in monolingual (e.g., Ellis Weismer et al., 1999;Montgomery, 2006;Marini et al., 2014) and sequential bilingual children with DLDs (Engel de Abreu et al., 2014) and extends this observation also to simultaneous bilinguals. 
It supports the possibility that reduced phonological short-term memory is a characteristic feature of all children with DLDs (including both mono- and bilinguals). According to the phonological storage deficit hypothesis, difficulties in keeping track of phonological and/or lexical items in short-term memory and eventually processing them in working memory might result in slowed vocabulary acquisition (e.g., Gathercole and Baddeley, 1990; Marini et al., 2017) and affect children's linguistic development (Bishop, 2006; Archibald and Gathercole, 2007; Graf Estes et al., 2007). In line with this hypothesis, phonological short-term memory correlated with performance on a task assessing lexical selection in both languages (i.e., Naming) and with the % Complete Sentences measure derived from the multilevel analysis of discourse, which assesses the ability to generate complete sentences during the production of a narrative discourse. The specific contributions of phonological short-term memory to these linguistic skills will be discussed later.
The cohort of children with DLDs produced fewer correct lexical items on the naming task, more semantic errors on the narrative production task and understood fewer words on the lexical comprehension test. Such findings suggest that in this cohort of simultaneous bilinguals with DLDs lexical selection is an area of major weakness. According to an influential model of message production (e.g., Levelt, 1989; Levelt et al., 1999; Indefrey and Levelt, 2000), during the process of lexical selection the target word is activated in part through the co-occurring inhibition of semantically related competitors. In this process, not only short-term memory but also additional executive skills such as inhibition, attention and planning play a critical role. In a recent investigation, a group of 15 Portuguese-Luxembourgish bilinguals from Luxembourg with a diagnosis of DLDs did not manifest the same advantages in selective attention and interference suppression as 33 typically developing Portuguese-Luxembourgish bilingual peers living in Luxembourg. However, they did not lag behind another group of 33 typically developing Portuguese-speaking monolinguals from Portugal in these domains of executive functioning (Engel de Abreu et al., 2014). Based on our findings and those by Engel de Abreu et al. (2014), we hypothesize that the process of lexical selection is not affected by the condition of bilingualism (as the controls were also simultaneous bilinguals), the specific language in use (as such skills were similarly impaired in both languages), or other cognitive limitations (as the group-related differences survived the covariation for phonological short-term memory and were also significant in two measures that did not correlate with this cognitive variable). Rather, the reduced lexical selection skills may reflect a core symptom of DLDs. 
Indeed, the production of semantic errors is usually interpreted as a failure in this process: the selection of a semantically wrong word is likely related to a difficulty in selecting the word that best matches the non-verbal concept selected in the previous phase of prelinguistic conceptual message formulation. Importantly, the lexical difficulties observed in the group of participants with DLDs were also related to macrolinguistic difficulties, as highlighted by the production of a number of words with no clear referents that reduced the levels of both local coherence (see also Rezzonico et al., 2015) and lexical informativeness. This further supports the need to extend the linguistic analysis to macrolinguistic aspects of language processing in children with DLDs using discourse generation tasks (see also Cleave et al., 2010; Marini et al., 2014).
The participants' grammatical skills were assessed in terms of production of paragrammatic (i.e., morphologic and morphosyntactic) errors, percentage of complete sentences produced in the narrative generation task, and grammatical comprehension. Interestingly, the participants with DLDs did not have more difficulties than bilingual controls on any of these three measures. Indeed, they produced a similar number of paragrammatic errors and a similar percentage of complete sentences in the narrative production task, and understood a similar number of sentences in the grammatical comprehension task. At first sight, these results appear to be at odds with those of previous investigations where impairments in inflectional morphology and morphosyntax have been considered core symptoms of DLDs in both monolingual (e.g., Rice, 2003) and bilingual children (Jacobson and Livert, 2010; Rothweiler et al., 2012). For example, a previous study by Rothweiler et al. (2012) suggested that Subject-Verb agreement in German may be impaired in both monolinguals and early sequential bilinguals with DLDs and that such morphosyntactic difficulties might reflect a clinical marker of DLDs in German. However, our findings are consistent with previous observations showing that impaired grammatical morphology is not necessarily a core feature of DLDs (Thordardottir, 2016). Furthermore, there is evidence that in narrative discourse production and retelling tasks such skills may not appear impaired (e.g., Cleave et al., 2010). In line with Cleave et al. (2010), we hypothesize that the absence of significant differences in morphosyntactic skills might be related to the use of different ways to assess such skills. Indeed, as also reported in other investigations, narrative production tasks are more adequate than decontextualized tasks to describe the actual linguistic profile of persons with DLDs (e.g., Marini et al., 2008). 
An alternative possibility is that the two groups did not differ on such skills because even children with typical development might experience difficulties in their L2s. Therefore, comparing their L2 grammatical performance with that of bilingual children with DLDs may be misleading from a clinical point of view, suggesting the presence of a potential difficulty in typically developing L2 learners (e.g., Windsor and Kohnert, 2004). The lack of groups of monolinguals is among the limitations of this study, as this omission did not allow us to determine whether the group of bilingual participants with TLD had grammatical skills similar to those of Italian-speaking or German-speaking monolinguals with TLD. Another interesting finding concerns the language-related effect found in the production of Paragrammatic errors. As already noted, the children with DLDs and those with typical development produced a similar number of such morphological and morphosyntactic errors. However, both groups produced more such errors in German than in Italian. This further highlights the need to have equivalent tests adapted (and not simply translated) in the languages spoken by bilingual children.
Discourse production skills were assessed in terms of percentages of errors of both local and global coherence and lexical informativeness. Participants with DLDs were less informative (for a similar finding on a story retelling task see Rezzonico et al., 2015) and had more difficulties than controls in establishing local coherence among the produced utterances. In contrast, no group-related differences were found in the production of errors of global coherence. This was expected, as difficulties of global coherence likely reflect planning and monitoring difficulties that are usually observed in patients with different disorders such as Williams Syndrome (Marini et al., 2010) or Autism Spectrum Disorders (Ferretti et al., 2018; Marini et al., 2019). These findings further support the possibility that the two languages were similarly processed also at the macrolinguistic level in both groups. Importantly, a qualitative inspection of the local coherence errors produced by the children with DLDs showed that they did not produce any topic shift. Rather, as already discussed, their difficulties were likely related to an increased production of words with no clear referents. A lexical difficulty observed in children with DLDs in both languages thus affected their overall communicative skills, confirming the presence of significant interactions between micro- (i.e., lexical and grammatical) and macrolinguistic (i.e., pragmatic and discourse) levels of processing.
The current investigation has some limitations that we would like to address here. The first is the small number of participants: as this was a pilot study, only 11 participants per group were included. This obviously reduces the generalizability of the current results to the general population. However, we would like to stress that this weakness is at least partially counterbalanced by the accurate matching of the two groups of participants on a number of environmental variables that are known to significantly affect the trajectory of language development in bilinguals. A second limitation concerns the aforementioned absence of cohorts of Italian-speaking and German-speaking monolinguals. Future investigations should consider the inclusion of larger cohorts of bilingual and monolingual participants. This might also allow researchers to run more accurate statistical analyses exploring the potential causality among the considered variables (e.g., phonological short-term memory and lexical skills) while also controlling for a potential bilingual effect on children with typical development.

CONCLUSION
In conclusion, the results of the current study have both theoretical and clinical implications. From a theoretical point of view, they support the idea that impaired lexical selection is among the core symptoms of DLDs not only in monolinguals but also in simultaneous bilingual children. Furthermore, they suggest that such difficulty is independent of the specific language in use. Rather, it apparently affects all of the languages known by such bilinguals. From a clinical point of view, these results support the need for an accurate assessment of a bilingual's linguistic competence, which should be performed by using equivalent tests in the two respective languages. They also suggest that the assessment procedure should carefully consider the role of environmental (e.g., age of first exposure to the languages, degree of exposure, etc.) and cognitive (e.g., phonological short-term memory) factors and their potential repercussions on language development and processing. Finally, they support the usefulness of multilevel procedures of narrative analysis, which significantly contribute to drawing a comprehensive linguistic profile that will be pivotal to rehabilitation (Thordardottir, 2010; Thordardottir et al., 2015).

ETHICS STATEMENT
This study was carried out in accordance with the recommendations of the "Comitato Etico dell'Ospedale di Bolzano". All subjects gave their written informed consent in accordance with the Declaration of Helsinki. The protocol was approved by the "Comitato Etico dell'Ospedale di Bolzano."

AUTHOR CONTRIBUTIONS
AM planned the study, ran the statistical analyses and wrote the manuscript. PS and IR supervised the adaptation of the BVL_4-12 to German and the recruitment of the participants. CS and FA co-supervised the recruitment procedure and supervised the administration of the tasks to the children. All authors contributed comments on the interpretation of the results.