Grammatical skills of Dutch children with 22q11.2 Deletion Syndrome in comparison with children with Developmental Language Disorder: Evidence from spontaneous language and standardized assessment

Background Virtually all children with 22q11.2 Deletion Syndrome (22q11DS) experience language difficulties, next to other physical and psychological problems. However, the grammatical skills of children with 22q11DS are relatively unexplored, particularly in naturalistic settings. The present research filled this gap, including two studies with different age groups in which standardized assessment was complemented with spontaneous language analysis. In both studies, we compared children with 22q11DS to children with Developmental Language Disorder (DLD), for whom the origin of language difficulties is unknown. Methods The first study included 187 preschool children (n = 44 with 22q11DS, n = 65 with DLD, n = 78 typically developing; TD). Standardized assessment consisted of grammar and vocabulary measures in both expressive and receptive modality. Spontaneous language during a play session was analyzed for a matched subsample (n = 27 per group). The second study included 29 school-aged children (n = 14 with 22q11DS, n = 15 with DLD). We administered standardized tests of receptive vocabulary and expressive grammar, and elicited spontaneous language with a conversation and narrative task. In both studies, spontaneous language measures indexed grammatical accuracy and complexity. Results Spontaneous language analysis in both studies did not reveal significant differences between the children with 22q11DS and peers with DLD. The preschool study showed that these groups produced less complex and more erroneous utterances than TD children, who also outperformed both groups on the standardized measures, with the largest differences in expressive grammar. The children with 22q11DS scored lower on the receptive language tests than the children with DLD, but no differences emerged on the expressive language tests. Discussion Expressive grammar is weak in both children with 22q11DS and children with DLD. 
Skills in this domain did not differ between the groups, despite clear differences in etiology and cognitive capacities. This was found irrespective of age and assessment method, and highlights the view that there are multiple routes to (impaired) grammar development. Future research should investigate if interventions targeting expressive grammar in DLD also benefit children with 22q11DS. Moreover, our findings indicate that the receptive language deficits in children with 22q11DS exceed those observed in DLD, and warrant special attention.

1. Introduction
The 22q11.2 Deletion Syndrome (22q11DS) is a genetic condition, which leads to multiple physical and psychological problems, including congenital heart defect and low intellectual functioning (McDonald-McGinn et al., 2015). Although phenotypic expression is heterogeneous, speech and/or language problems are reported in 95% of the children with 22q11DS (Solot et al., 2019), making this one of the most common features of the syndrome. The language problems in children with 22q11DS have, however, almost exclusively been described with standardized tests. Very few studies have analyzed children's spontaneous language, even though this is a more ecologically valid way to evaluate language development and can be used to set therapy goals (Klatte et al., 2022). The current study aimed to fill this gap.
In addition, we compared the language abilities of children with 22q11DS to children with Developmental Language Disorder (DLD). Similar to children with 22q11DS, children with DLD have severe difficulties with learning language. However, their language difficulties exist in the absence of the challenging physical and cognitive conditions that we see in 22q11DS. As of yet, there are no direct, large-scale comparative studies of children with 22q11DS and children with DLD. Such comparisons are meaningful to determine whether interventions for children with DLD may also be suited for children with 22q11DS. Moreover, given the etiological differences between the groups, it can enhance our understanding of the mechanisms underlying language impairment. We therefore conducted two studies, comparing the spontaneous language of both preschool and school-aged children with 22q11DS to peers with DLD. Moreover, we analyzed the results of a number of standardized language tests. In the study with preschool children, we also included a typically developing (TD) control group. In both studies, we focused on the domain of grammar, as this is a hallmark deficit in DLD, while relatively unexplored in 22q11DS.
1.1. 22q11.2 Deletion Syndrome
22q11DS is caused by a microdeletion on the long arm ('q') of chromosome 22, with the name thus referring to its genetic cause. The syndrome was previously also called Velo-Cardio-Facial, DiGeorge or Shprintzen syndrome, but we now know that these conditions are all due to the same genetic deletion: 22q11DS (McDonald-McGinn et al., 2015). It is the most frequently occurring genetic syndrome after Down syndrome, with an incidence of 1 in 2148 live births (Blagojevic et al., 2021). Despite the relatively uniform etiology, individuals with 22q11DS differ greatly in symptom expression. Over 180 manifestations have been associated with the syndrome (McDonald-McGinn et al., 2015). Congenital heart defects are the most common physical symptom, estimated to occur in up to 75% of the population. Palatal abnormalities, such as cleft palate and velopharyngeal insufficiency, are also frequently observed. In addition, cognitive and psychiatric problems are part of the syndrome. Many individuals with 22q11DS have borderline intellectual functioning or mild intellectual disability (Fiksinski et al., 2022). Moreover, 22q11DS is associated with elevated rates of psychopathology, including attention deficit hyperactivity disorder, autism spectrum disorder, anxiety disorder and psychotic disorder (Schneider et al., 2014).

1.2. Language impairment in children with 22q11DS
Next to the symptoms mentioned above, speech-language problems are observed in virtually all children with 22q11DS (Solot et al., 2019) and do not appear to be related to other manifestations of the syndrome, such as congenital heart defect and palatal abnormalities (Gerdes et al., 1999; Solot et al., 2001). In early childhood, it is reported that the first words and sentences emerge relatively late (e.g., Gerdes et al., 1999; Solot et al., 2000; Roizen et al., 2007), with some children even remaining nonverbal until the age of 4 years (Solot et al., 2001). During the preschool age, both expressive and receptive language abilities of children with 22q11DS are significantly weaker in comparison to TD children, as indicated by lower scores on standardized language tests (Gerdes et al., 1999, 2001; Solot et al., 2001; Everaert et al., 2022). A recent study (Everaert et al., 2022), using the same preschool sample as the current study, for example showed that Dutch children with 22q11DS between 3 and 6.5 years old scored, on average, 2 standard deviations below the normed mean on a composite measure of expressive language. For receptive language, this was 1.5 standard deviations below the normed mean. The significant difference in the severity of the expressive and receptive language impairment is in line with what is reported in other research with preschoolers (Gerdes et al., 1999; Solot et al., 2001). Next to composite measures, Everaert et al. (2022) also examined subtest outcomes of the standardized assessment and observed pervasive difficulties across language domains, with the lowest scores on expressive morphosyntactic skills. With the exception of Scherer et al. (1999), who showed low lexical diversity in the spontaneous language of 4 children with 22q11DS between 0;6 and 2;6 years old, an investigation of the spontaneous language of preschool children with 22q11DS has not yet been undertaken.
Research on school-age children with 22q11DS also used standardized language assessment and indicates that language impairment in 22q11DS is persistent, both in production and comprehension (Moss et al., 1999; Solot et al., 2001; Glaser et al., 2002; Rakonjac et al., 2016; Van den Heuvel et al., 2018). Language impairment even goes beyond what is expected based on children's level of intellectual functioning (Glaser et al., 2002; Persson et al., 2006; Van den Heuvel et al., 2018), in agreement with what is found for preschoolers (Gerdes et al., 1999; Scherer et al., 1999). However, in contrast to preschool children, school-age children with 22q11DS are reported to have weaker receptive than expressive language and relatively strong expressive morphosyntactic abilities (Glaser et al., 2002; Van den Heuvel et al., 2018). These contrasting findings may reflect unique developmental trends for different language modalities and domains, although more research is needed to confirm this.
Next to reporting standardized test scores, a number of studies with school-age children with 22q11DS have examined children's language profile in more detail. Van den Heuvel et al. (2018) conducted a fine-grained error analysis of two standardized tests of expressive syntax. Difficulties interpreting and using contextual cues were found to characterize the errors of their 6-13-year-old participants with 22q11DS on these tasks. In addition, three studies reported weak narrative abilities of children with 22q11DS at the macrolevel, gauging story structure and information transfer (Persson et al., 2006; Van den Heuvel et al., 2017; Selten et al., 2021). Persson et al. (2006) also analyzed the microstructural narrative production abilities of their 19 participants between 5 and 8 years old. Grammatical errors were not highly prevalent in the narrative samples, but low grammatical complexity, as indicated by short sentences and few subordinate clauses, was found to be characteristic of the stories that these children told. Van den Heuvel et al. (2017) also reported a reduced sentence length of their 6-13-year-old participants with 22q11DS in comparison with TD peers.

1.3. 22q11DS and Developmental Language Disorder
Given the severe language impairment of children with 22q11DS, which cannot be (fully) explained by cognitive or physical features of the syndrome, it is not surprising that parallels have been drawn with children with DLD. DLD is a neurodevelopmental disorder which primarily affects the ability to learn a native language (Bishop et al., 2017), estimated to occur in 3-7% of the child population (Tomblin et al., 1997; Norbury et al., 2016; Calder et al., 2022). The language difficulties of children with DLD cannot be explained by an obvious cause, such as a biomedical condition, hearing impairment, or intellectual disability. Instead, DLD is thought to arise from the interaction between multiple genetic and environmental risk factors (Bishop, 2009). These risk factors may differ from child to child, making the etiology of DLD heterogeneous. On the phenotypic level, diverse language problems in all language domains can be observed (for an overview, see Leonard, 2014; Gerrits et al., 2017). However, morphosyntactic difficulties, in Germanic languages particularly those related to verbs, are seen as a hallmark deficit and have been proposed as clinical markers that support the identification of DLD (see Leonard, 2014). Such difficulties can be observed in performance on standardized tests or other elicitation probes (e.g., Riches, 2012; Krok and Leonard, 2015; Boerma et al., 2017), but are also often shown in children's spontaneous language. Low grammatical accuracy and complexity in the spontaneous language of Dutch children with DLD is for example reflected by frequent tense and agreement errors, difficulties with argument structure, the overuse of root infinitives, a short sentence length, and the use of few complex sentences (e.g., Bol and Kuiken, 1988; De Jong, 1999; Wexler et al., 2004; Verhoeven et al., 2011; Zwitserlood et al., 2015).
As DLD per definition precludes a known biomedical condition, children with 22q11DS cannot be diagnosed with DLD. Instead, they may have a so-called 'language disorder associated with X' (Bishop et al., 2017). Despite the different labels, there appears to be substantial clinical overlap between the groups. Children with 22q11DS are often seen and treated by the same professionals that provide treatment for children with DLD. It is, however, unclear whether the two groups can be differentiated based on their language profile. Previous research comparing children with DLD and children with 22q11DS is scarce. In their discussion section, Persson et al. (2006) indirectly compared the results from their 22q11DS sample with the results from a different study including children with DLD. They observed similarities across the two groups with respect to sentence length and the production of subordinate clauses, but noticed differences in grammatical accuracy, with lower accuracy for the children with DLD compared to the children with 22q11DS. Three studies directly compared children in the two groups. Kambanaros and Grohmann (2017) conducted a longitudinal case study of a boy with 22q11DS, testing him at age 6 and age 10, and compared him to children with DLD. At the age of 6, the boy produced longer sentences relative to peers with DLD, but at age 10 he scored worse on the comprehension of subject relative clauses. Other measures, including a wide range of standardized tests and experimental tasks, did not differentiate the boy from the children with DLD, neither at age 6 nor at age 10. In addition, Selten et al. (2021), using the same school-aged sample as the current study, examined narrative comprehension and production at the macrolevel of 6-10-year-old children with 22q11DS and children with DLD. They did not find a significant difference on any of the narrative measures between the two groups.
Using fMRI data from the same children, Van Steensel et al. (2021) even reported comparably reduced brain activation during language processing in both groups.

1.4. The current study
Previous research showed that language impairment is a common feature of 22q11DS. Children with 22q11DS experience severe language difficulties across all language domains and in both receptive as well as expressive modality. However, our knowledge of the language profile of children with 22q11DS is almost exclusively based on standardized test performance. While such tests give important information on whether language abilities are age-appropriate, they also have a number of limitations. For example, standardized language assessment does not provide insight into grammatical production skills in real-life situations, some aspects of grammar are difficult to reliably test in a standardized way, and some children may not comply with the necessary behavioral restrictions of standardized testing (Costanza-Smith, 2010; Doedens and Meteyard, 2022; Klatte et al., 2022). The latter may also hold for young children with 22q11DS, as indicated by the task completion rates reported in the study of Everaert et al. (2022). Ideally, standardized language assessment is complemented with the analysis of spontaneous language, which is ecologically valid, can be used with all children, and is considered to be the gold standard for setting therapy goals in the domain of grammar (Heilmann, 2010; Price et al., 2010).
The current study therefore investigated the spontaneous language of children with 22q11DS, aiming to further our knowledge on the syndrome's language profile. In view of the contrasting findings of previous work between preschool and school-age children, we conducted a study with each age group. We complemented spontaneous language analysis with standardized measures and, in the study with preschool children, included a TD control group. In addition, in both studies, we compared the children with 22q11DS to age-matched peers with DLD. This is the first large-scale comparison of a group of children with language problems associated with 22q11DS, a known biomedical condition accompanied by physical and cognitive challenges, and a group of children experiencing language difficulties that are not associated with such challenges. An open question is whether those two groups can be differentiated at the phenotypic level, which may have important implications for both our understanding of the required conditions for language acquisition as well as for clinical care. We focused on grammar, as weaknesses in this domain are characteristic of DLD. At the same time, relatively little is known about the grammatical skills of children with 22q11DS, especially in naturalistic settings.
Based on previous research (Persson et al., 2006; Kambanaros and Grohmann, 2017), we expected that the grammatical complexity of children with 22q11DS and children with DLD would be comparably low. Moreover, grammatical errors could be more prevalent in the group of children with DLD in comparison with the children with 22q11DS, although the evidence base for this prediction is very limited. For the preschool children, we predicted that both children with 22q11DS and children with DLD would perform below TD peers on all measures, although grammatical accuracy of the children with 22q11DS could be on par with the control group. Finally, although we expected roughly similar results in the preschool and school-age study, we reckoned with the possibility that school-age children with 22q11DS would have relatively stronger grammatical skills than preschoolers, given the previous contrasting findings on expressive morphosyntactic abilities in these age groups (preschool: Everaert et al., 2022; school-age: Glaser et al., 2002; Van den Heuvel et al., 2018).

Notes to Table 1. a This information is based on a wide variety of standardized, age-appropriate measures (M = 100, SD = 15). In the full sample, scores were missing for one TD child and two children with 22q11DS. In the subsample, this was the case for one child with 22q11DS. b Parental education is the average education level of both parents, measured on a nine-point scale (1 = no education, 9 = university degree). In the full sample, information was missing for one TD child and two children with DLD. In the subsample, this was the case for one child with DLD. c This score of global language ability is a standardized composite (M = 100, SD = 15) of three language tests from the CELF-Preschool-2-NL. In the full sample, scores were missing for eight children with 22q11DS and two children with DLD. In the subsample, this was the case for four children with 22q11DS.

2.1.1. Participants
The children in the preschool study participated in a prospective cohort study ("3T project") which examined development in the domains of behavior, cognition and language. Participants were recruited between November 2018 and November 2019. All children were between 3 and 6.5 years of age, grew up monolingually, and had no hearing impairment. The latter two criteria were verified through a telephone interview with parents. The first group, children with 22q11DS (see Everaert et al., 2022), had a genetically confirmed diagnosis of 22q11DS. They were recruited via the 22q11DS expertise center at University Medical Center Utrecht in the Netherlands and via the Dutch patient support association. The second group, children with DLD, had been diagnosed with DLD before and independent of the 3T project by licensed professionals. They obtained an overall score of 2 standard deviations (SD) below the mean on a standardized language test battery or a score of 1.5 SD below the mean on two out of four language domains which were tested with at least two measures (for the full protocol, see Stichting Siméa, 2017). Moreover, next to the absence of hearing impairment, they had a non-verbal intelligence of 70 or above. The children with DLD were recruited via organizations that provide care and education services for children with communication difficulties, including Royal Kentalis, Royal Auris, VierTaal and NSDSK. At the time of the study, they all received speech-language therapy at day care or school. Finally, the third group, TD children, did not have documented developmental delays and no family history of language disorders or dyslexia. They were recruited via regular day care centers or elementary schools. Three TD children were excluded, because they obtained a score of more than 1 SD below the mean on standardized language assessment that was administered for the purpose of the 3T project. 
The final sample included 44 children with 22q11DS, 65 children with DLD and 78 TD children. The demographic characteristics of this sample are presented in Table 1. For a description of the prevalence of physical symptoms in our 22q11DS sample and the percentage of children receiving speech-language therapy, we refer to Everaert et al. (2022).
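The language criterion for the DLD group described above can be sketched as a simple decision rule. This is only an illustrative reading of the text, not the Stichting Siméa (2017) protocol itself: the function name, the use of z-scores, the "at or below" interpretation of the thresholds, and the domain labels are all assumptions, and the requirement that each domain be tested with at least two measures is not modeled.

```python
def meets_dld_language_criterion(overall_z, domain_z_scores):
    """Return True if either threshold described in the text is met.

    overall_z: overall test-battery score in SD units relative to the norm mean.
    domain_z_scores: mapping from (hypothetical) domain label to score in SD units.
    Assumption: "2 SD below the mean" is read as "at or below -2.0".
    """
    overall_low = overall_z <= -2.0
    # Count the domains that fall at or below -1.5 SD.
    low_domains = [d for d, z in domain_z_scores.items() if z <= -1.5]
    return overall_low or len(low_domains) >= 2


# Hypothetical child: overall score not low enough, but two of four domains are.
example = {"speech": -0.4, "grammar": -1.8, "vocabulary": -1.5, "pragmatics": -0.9}
print(meets_dld_language_criterion(-1.2, example))
```

The additional inclusion criteria mentioned in the text (no hearing impairment, non-verbal intelligence of 70 or above) would be checked separately.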
The three groups of children did not differ in age in months [F(2, 184) = 0.97, p = 0.38, ηp² = 0.01]. However, there were significant differences in sex [χ²(2, N = 187) = 19.6, p < 0.001, V = 0.32], with relatively more boys in the group with DLD than in the other two groups (in line with what is known on DLD; Tomblin et al., 1997, but see Calder et al., 2022). Intellectual functioning, obtained from medical/school records or assessment by the current researchers, also differed significantly between the groups [F(2, 181) = 58.04, p < 0.001, ηp² = 0.39]. The TD children obtained the highest scores, followed by the children with DLD and, finally, the children with 22q11DS (all p < 0.001). The average education level of both parents, measured with an online questionnaire, was also higher for the TD children in comparison with the 22q11DS and DLD groups [H(2) = 38.0, p < 0.001, η² = 0.20], but did not differ significantly between the latter two groups. The same pattern was observed for global language ability [F(2, 174) = 142.2, p < 0.001, ηp² = 0.62], assessed with the Core Language Index Score of the CELF-Preschool-2-NL (Wiig et al., 2012).
As can be observed in Table 1, a subsample of 27 children in each of the three groups was selected to allow for individual matching on age in months and sex, making the groups as comparable as possible [age in months: F(2, 78) = 0.005, p = 0.995, ηp² < 0.01; sex: χ²(2, N = 81) = 0.00, p = 1.00, V = 0.00]. Spontaneous language was analyzed for this subsample. A child with 22q11DS was matched to a child with DLD and a TD child of the same sex who were at most 3 months older or younger. Moreover, only TD children who scored in the average range (between 85 and 115) on the Core Language Index were selected. For one matched TD child, the quality of the language sample recording proved to be too poor. We therefore had to replace this child with another, who did have the right sex and age but who scored above average on global language ability (i.e., 120). Similar to the full sample, the TD children in the subsample obtained higher core language scores than children in the other two groups [F(2, 74) = 50.8, p < 0.001, ηp² = 0.58], which, in turn, did not differ from each other. We did not match on intellectual functioning, as differences between the groups are inherent [F(2, 77) = 22.8, p < 0.001, ηp² = 0.37]. In the subsample, intellectual functioning of the children with DLD and the TD children no longer differed significantly (p = 0.082), and both groups scored higher than the children with 22q11DS (all p < 0.001). Finally, parental education differences between the three groups remained significant [H(2) = 9.5, p = 0.009, η² = 0.10]. This effect was driven by differences between the DLD and TD groups (p = 0.003).
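The individual matching rule (same sex, at most 3 months age difference) can be illustrated with a small sketch. The text does not specify how matches were actually selected, so the greedy nearest-age strategy below, the `Child` record, and all identifiers are assumptions for illustration only; the Core Language Index filter for TD children is not included.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Child:
    child_id: str      # hypothetical identifier
    group: str         # "22q11DS", "DLD" or "TD"
    sex: str
    age_months: int


def find_match(target, candidates, used, max_gap=3):
    """Pick the unused same-sex candidate closest in age, within max_gap months."""
    eligible = [
        c for c in candidates
        if c.child_id not in used
        and c.sex == target.sex
        and abs(c.age_months - target.age_months) <= max_gap
    ]
    if not eligible:
        return None
    return min(eligible, key=lambda c: abs(c.age_months - target.age_months))


def match_triplets(children, max_gap=3):
    """Greedily build (22q11DS, DLD, TD) triplets; unmatched children are dropped."""
    by_group = {g: [c for c in children if c.group == g]
                for g in ("22q11DS", "DLD", "TD")}
    used, triplets = set(), []
    for target in by_group["22q11DS"]:
        dld = find_match(target, by_group["DLD"], used, max_gap)
        td = find_match(target, by_group["TD"], used, max_gap)
        if dld is not None and td is not None:
            used.update({dld.child_id, td.child_id})
            triplets.append((target, dld, td))
    return triplets
```

A greedy pass like this does not guarantee the largest possible matched subsample; an optimal assignment method could yield more triplets from the same pool.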

2.1.2.1. Standardized language measures
Standardized language measures were used to assess children's abilities in the domains of expressive and receptive grammar. To determine whether grammatical skills are a relative strength or weakness, we also included measures of expressive and receptive vocabulary. Scores of the children with 22q11DS on these tests have been reported in Everaert et al. (2022).
Subtests of the Preschool version of the Clinical Evaluation of Language Fundamentals, CELF-Preschool-2-NL (Wiig et al., 2012), evaluated expressive grammar, receptive grammar and expressive vocabulary. All subtests were administered following the official manual and have a normed mean of 10 (SD = 3). Expressive grammar was measured with two subtests, one on word level and one on sentence level. During the subtest Word Structure, children saw one or two pictures and were asked to complete a sentence uttered by the researcher, thereby eliciting the production of verbs, adjectives, plurals, pronouns and diminutives. The second subtest of expressive grammar was Recalling Sentences, which is a sentence repetition task with items that increase in length and complexity. This type of task is considered to test syntactic skills (Polišenská et al., 2015). Receptive grammar was measured with the subtest Sentence Structure. Children saw four pictures and were asked to point to the picture that best matched a sentence uttered by the researcher. The test assesses children's understanding of different grammatical structures, including passives, relative clauses, negation and prepositional phrases. Finally, expressive vocabulary was evaluated with the Expressive Vocabulary subtest. Children saw a picture of an object or action and had to label the picture.
Receptive vocabulary skills were assessed with the Peabody Picture Vocabulary Test (PPVT-III-NL; Schlichting, 2005). The test was administered in accordance with the official manual and quotient scores with a mean of 100 (SD = 15) are reported. Children saw four pictures and heard a target word. They were asked to point to the picture which corresponded to the target word.

2.1.2.2. Spontaneous language samples
Spontaneous language of children was collected during a play session of ∼15-20 min. The play break followed a standardized protocol and was divided into three parts. In the first part, children played alone with a fixed set of toys, including the Playmobil city life petting zoo set and a number of plastic fruits/vegetables. After a few minutes, or sooner if the child did not speak during this part, the researcher brought a tractor and joined the child. In this second part, the child and researcher played together, but the child remained in charge of what was happening. The researcher was instructed to follow the child, only taking initiative when the child had clear difficulty playing with the toys. After around 10 min, the final part of the play break began, in which both the child and researcher colored with crayons. If the child did not speak much, the researcher would ask open-ended questions.

2.1.3. Procedure
The 3T project was approved by the Medical Research Ethics review board of the University Medical Center Utrecht (CCMO registry nr. NL63223.041.17). Parents of participating children signed an informed consent form. The researchers who worked with the children had a background in linguistics or psychology and were trained using a standardized protocol. Children were individually tested in a quiet room at day care or school. Standardized language tests, cognitive tasks and the play break were administered in a fixed order during two sessions of ∼45 min each. The two test sessions were on separate days and were always administered by the same researcher. The play break was in the second session. It was video-recorded with a GoPro HERO camera and, for adequate audio recordings, a Samson Go Mic portable USB microphone was used. The standardized tests for expressive language were recorded with the same USB microphone and also scored by a second researcher. Discrepancies were discussed and resolved by consensus.
The language samples of the 27 children in each of the three groups were transcribed according to the Codes for the Human Analysis of Transcripts (CHAT) conventions (part of CHILDES; MacWhinney, 2000), by trained researchers with a background in linguistics. The T-unit was used as the basic unit of analysis, defined as a main clause with subordinate clauses attached to it (Hunt, 1970). Quality checks were done by the first and senior author to guarantee that the conventions were accurately followed. Moreover, the transcripts were annotated on a separate tier for grammatical accuracy and complexity (see Data analysis). For the sake of reliability, the annotations of nine transcripts (three from each group; 11%) were compared with annotations from a second researcher. Annotation agreement was reached in 94.6% of the T-units.

2.1.4. Data analysis
The analyses were performed in Computerized Language Analysis Software (CLAN, part of CHILDES; MacWhinney, 2000) and SPSS version 28 (IBM Corp, 2021). Univariate ANOVAs were done to compare the three groups on the five standardized language measures.
As the groups significantly differed in SES and sex, while these differences are not inherent to the groups, we also conducted univariate ANCOVAs. The inclusion of the covariates SES and sex did not change the results. Intellectual functioning differences are inherent to the groups and intellectual functioning was therefore not included as a covariate in the analyses (Miller and Chapman, 2001; Dennis et al., 2009). All analyses were done for the full sample as well as the subsample. Results for the subsample did not differ from the results of the full sample and are therefore not reported. As an additional analysis, we conducted paired samples t-tests in the DLD and 22q11DS groups to investigate whether there was a discrepancy between expressive grammar (measured with subtests "word structure" and "recalling sentences") and the other language domains. For this analysis, quotient scores of the receptive vocabulary task were transformed to CELF-scores. The analyses of the spontaneous language samples focused on grammatical accuracy and grammatical complexity, and were based on the work of Zwitserlood et al. (2015). The main outcome parameters of both categories are presented in Table 2 (see the Appendix for examples of errors and complex utterance categories). All outcome parameters exclude interjections and communicators (e.g., "uh," "yes," "no"; on average 19% of the total number of a child's utterances), onomatopoeia (2%), unintelligible utterances (6%), as well as incomplete sentences due to trailing off and interruption (2%). Furthermore, the outcome parameters are corrected for length of the included language sample, as this differed per child. That is, all outcome parameters are calculated as proportions, taking into account the total number of T-units (or, in some specific cases, the total number of clauses).
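The transformation of quotient scores (M = 100, SD = 15) to the CELF subtest scale (M = 10, SD = 3) is not spelled out in the text. A minimal sketch, assuming a plain linear rescaling via z-scores with no rounding, would be the following; the function name is hypothetical.

```python
def quotient_to_scaled(quotient, q_mean=100.0, q_sd=15.0, s_mean=10.0, s_sd=3.0):
    """Linearly rescale a quotient score to the subtest scale.

    Assumption: the conversion preserves the z-score, i.e.
    scaled = s_mean + s_sd * (quotient - q_mean) / q_sd.
    """
    z = (quotient - q_mean) / q_sd
    return s_mean + s_sd * z


# A quotient of 85 lies 1 SD below the mean, which corresponds to 7 on the subtest scale.
print(quotient_to_scaled(85))
```

Putting both tests on the same scale is what makes the paired comparison between expressive grammar and receptive vocabulary interpretable in SD units.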
Sample length, calculated as the total number of T-units after exclusions, did not significantly differ between the three groups of children ( Table 2, we also report on a number of specific verb-related errors (part of the main parameter "% verb-related errors"), as these errors are known to occur frequently in the spontaneous language of Dutch children with DLD. These specific verb-related errors include (1) the number of subject-verb agreement errors relative to the total number of subject-verb agreement attempts, (2) the number of past tense errors relative to the total number of T-units requiring a past tense, (3) the number of root infinitives relative to the number of T-units containing a verb, and (4) the number of argument omissions (subject, object or other) relative to the number of T-units containing a verb. Comparable to the analyses with the standardized language measures, univariate AN(C)OVAs were done to compare the three groups on all main outcome parameters for grammatical accuracy and grammatical complexity. The inclusion of SES as covariate did not change the results. For the specific verb-related errors and for the main outcome parameter "% complex utterances," we conducted non-parametric tests (Kruskal-Wallis H test and, for post-hoc comparisons, Mann-Whitney U test), as inspection of the data showed violations of the assumptions of normality and equality of error variances. Effect sizes were interpreted following Cohen (1988).
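The four specific verb-related error measures each use their own denominator, which is easy to get wrong when scoring by hand. The following is a minimal sketch of that bookkeeping, not the authors' CLAN/SPSS procedure. The class `SampleCounts`, its field names, and the example counts are hypothetical and serve only to illustrate the definitions given above.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class SampleCounts:
    """Hypothetical per-child counts taken from an annotated transcript."""
    t_units_with_verb: int
    agreement_attempts: int
    agreement_errors: int
    past_tense_contexts: int   # T-units requiring a past tense
    past_tense_errors: int
    root_infinitives: int
    argument_omissions: int


def pct(numerator: int, denominator: int) -> Optional[float]:
    """Return a percentage, or None when there were no relevant contexts."""
    if denominator == 0:
        return None
    return 100.0 * numerator / denominator


def verb_error_rates(c: SampleCounts) -> Dict[str, Optional[float]]:
    """Each error type is divided by the denominator named in the text."""
    return {
        "agreement": pct(c.agreement_errors, c.agreement_attempts),
        "past_tense": pct(c.past_tense_errors, c.past_tense_contexts),
        "root_infinitive": pct(c.root_infinitives, c.t_units_with_verb),
        "argument_omission": pct(c.argument_omissions, c.t_units_with_verb),
    }


# Invented example: a child who produced no obligatory past tense contexts.
child = SampleCounts(
    t_units_with_verb=50,
    agreement_attempts=40,
    agreement_errors=6,
    past_tense_contexts=0,
    past_tense_errors=0,
    root_infinitives=5,
    argument_omissions=10,
)
rates = verb_error_rates(child)
```

Returning `None` for an empty denominator, rather than 0%, is a design choice in this sketch: a child with no past tense contexts has no measurable past tense error rate, and such cases would need to be treated as missing in any group comparison.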

. . . Standardized language measures
The performance of the three groups of children (full sample) on the standardized tests of grammar and vocabulary is presented in Table 3. Comparing the average scores per group across language domains, we see low performance of children with 22q11DS on all measures. For both the children with 22q11DS and the children with DLD, the lowest mean scores are on the two subtests of expressive grammar (close to 2 SD below the mean). For the children with DLD, a larger discrepancy between expressive grammar and the other domains is observed than for the children with 22q11DS. Paired samples t-tests between the two expressive grammar subtests on the one hand and the other standardized measures on the other hand showed significant differences across the board in the DLD group (all p < 0.001), with effect sizes ranging from 0.79 to 1.73. In the 22q11DS group, significant differences were also observed (p < 0.05), with the exception of "recalling sentences" in comparison with "active vocabulary" (p = 0.20) and "recalling sentences" in comparison with "sentence comprehension" (p = 0.053). Effect sizes ranged from 0.22 to 0.98.

. . . Spontaneous language samples
For each of the three groups, the means and standard deviations on all outcome measures for grammatical accuracy and grammatical complexity are presented in Table 4.

. . . . Grammatical accuracy
Grammatical accuracy was subdivided into three main outcome parameters and four specific verb-related errors. The relative number of error-free T-units is a broad measure of grammatical accuracy, for which a significant effect of Group was observed [F (2,78) = 18.0, p < 0.001, partial η² = 0.32]. TD children produced relatively more error-free T-units than children with 22q11DS and children with DLD (both p < 0.001). No significant differences emerged between the latter two groups (p = 1.00). The same pattern was found for the other two main outcome parameters. That is, there were significant effects of Group on both verb-related errors [F (2,78) = 19.4, p < 0.001, partial η² = 0.33] and non-verb-related errors [F (2,78) = 12.9, p < 0.001, partial η² = 0.25]. In comparison with the other two groups, TD children produced relatively fewer verb-related (both p < 0.001) and non-verb-related (22q11DS: p = 0.007; DLD: p < 0.001) errors. The groups of children with 22q11DS and children with DLD did not differ significantly from each other on either parameter (verb-related: p = 1.00; non-verb-related: p = 0.20).
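For a one-way design, partial eta squared can be recovered from the F statistic and its degrees of freedom as F × df_effect / (F × df_effect + df_error). The sketch below applies this standard identity to the three F values reported above as a consistency check. The function name and dictionary labels are illustrative, and the original values were produced in SPSS rather than with this code.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared recovered from an F statistic and its df."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F values as reported in the text, all with df = (2, 78).
reported_f = {
    "error-free T-units": 18.0,
    "verb-related errors": 19.4,
    "non-verb-related errors": 12.9,
}

effect_sizes = {
    name: round(partial_eta_squared(f, 2, 78), 2)
    for name, f in reported_f.items()
}
```

The recomputed values agree with the reported effect sizes to two decimals.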
Results from the specific verb-related errors showed one very extreme outlier in the 22q11DS group on the proportion of subject-verb agreement errors (scoring 100%). This child was very young (3;1 years old) and produced a limited number of utterances. We excluded this outlier from the analyses, although results with and without the outlier remained the same.  whereas the latter two groups did not differ in their MLU and MLU 5 (all p = 1.00). Another index of grammatical complexity was the proportion of utterances containing a verb. There was one very extreme outlier in the 22q11DS group from a young child (3;4 years old; scoring 1.8%) which was excluded from the analyses; results with and without the outlier remained the same. A significant effect of Group emerged on the proportion of utterances containing a verb [F (2,77) = 7.7, p < 0.001, partial η² = 0.17], with TD children producing relatively more utterances with a verb than the groups of children with 22q11DS and children with DLD (both p < 0.001), who did not differ (p = 1.00). Finally, the same pattern appeared from the proportion of complex sentences [H (2) = 18.2, p = 0.002, η² = 0.21]. There were no significant differences between the children with 22q11DS and the children with DLD (p = 0.25), who produced fewer complex sentences than their TD peers (22q11DS: U = 147.5, z = −3.8, p < 0.001, r = 0.52; DLD: U = 174.5, z = −3.3, p < 0.001, r = 0.45).
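The effect sizes for the non-parametric tests can be approximated from the reported test statistics. The sketch below uses two commonly cited conversions: r = |z| / √N for the Mann-Whitney U test and η² = (H − k + 1) / (N − k) for the Kruskal-Wallis H test. Whether the authors used exactly these formulas is an assumption, as is the sample size of 27 children per group (54 per pairwise comparison, 81 overall) for this outcome.

```python
import math


def mann_whitney_r(z: float, n_total: int) -> float:
    """Effect size r from the z approximation of a Mann-Whitney U test."""
    return abs(z) / math.sqrt(n_total)


def kruskal_wallis_eta_squared(h: float, k_groups: int, n_total: int) -> float:
    """Eta squared estimated from the Kruskal-Wallis H statistic."""
    return (h - k_groups + 1) / (n_total - k_groups)


# Assumption: 27 children per group, no exclusions for this outcome.
N_PER_GROUP = 27

r_22q_vs_td = mann_whitney_r(-3.8, 2 * N_PER_GROUP)
r_dld_vs_td = mann_whitney_r(-3.3, 2 * N_PER_GROUP)
eta_sq_complex = kruskal_wallis_eta_squared(18.2, 3, 3 * N_PER_GROUP)
```

Under these assumptions the recomputed values round to the reported r = 0.52, r = 0.45 and η² = 0.21.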

. . . Participants
The children in the school-age study participated in a project on language processing and activation in the brain (see Selten et al., 2021; Van Steensel et al., 2021). Participants were recruited between November 2017 and July 2018. The 6-10-year-old participants included 14 children with a genetically confirmed diagnosis of 22q11DS and 15 children with an official diagnosis of DLD (for a description of the DLD criteria and protocol used in the Netherlands, see 2.1.1.). All children had either a verbal or nonverbal intellectual functioning level of 70 or above. Moreover, they did not have hearing loss of more than 35 decibels, as determined by pure tone audiometry, nor a diagnosis of autism spectrum disorder. Finally, due to an fMRI scan which was also part of the research protocol (Van Steensel et al., 2021), children were excluded if they had metal objects in their bodies or if they experienced severe anxiety in the scanner. Recruitment procedures were similar to the study with preschool children. Demographic characteristics of the two groups of children are presented in Table 5. The two groups did not differ on age in months [t (27)

. Standardized language measures
We included one standardized measure of expressive grammar and, as a reference, one standardized measure of receptive vocabulary, which were both administered in line with the official manuals. Results from these measures have been reported as background measures in the study of Selten et al. (2021). Similar to the study with preschool children, expressive grammar was tested with a sentence repetition task. The Recalling Sentences subtest of the school-aged version of the CELF, the CELF-IV-NL (Kort et al., 2008), required children to repeat sentences of increasing length and complexity. The normed scores have a mean of 10 (SD = 3). Receptive vocabulary was assessed with the PPVT-III-NL (see 2.1.2.1.).

. . . . Spontaneous language samples
Spontaneous language of children was collected with a narrative task which was preceded by a conversation between the researcher and the participating child. We used the Multilingual Assessment Instrument for Narratives (MAIN) (Gagarina et al., 2012; for the Dutch version, see Blom et al., 2020) to elicit semi-spontaneous language. The MAIN targets narrative abilities of 3- to 10-year-old children and consists of four comparable stories, all matched to six full-color picture sequences. In the current research, the stories "Cat" and "Baby Birds" were used. The children first saw the picture sequence belonging to "Cat." The researcher told the story and asked the child ten comprehension questions. Subsequently, children saw the picture sequence belonging to "Baby Birds" and were asked to generate their own story, which was, again, followed by ten comprehension questions. The MAIN can be used to analyze children's understanding and production of story structure (i.e., narrative abilities at the macrolevel; see Selten et al., 2021), but can also be used to examine microstructural narrative skills, including grammatical accuracy and complexity. For the current study, we used the narrative generated by the children, thus excluding children's answers to the comprehension questions, and complemented this with spontaneous language from a preceding conversation. This allowed us to elicit more utterances and to more reliably investigate grammatical skills. The conversation between the researcher and child was about day-to-day topics, such as birthdays, vacations and hobbies.

. . . Procedure
Ethical approval was obtained from the Medical Research Ethics review board of the University Medical Center Utrecht (CCMO registry nr. NL62366.041.17). Parents of participants gave written informed consent. The researchers who worked with the children were the same as those who worked with the preschool children. The individual test session of ∼1 h took place in a quiet room at the University Medical Center Utrecht. Language tests were administered in a fixed order. Spontaneous language as well as the standardized test for expressive grammar were recorded with a Samson Go Mic portable USB microphone. With respect to the transcriptions and annotations of the spontaneous language samples, procedures were similar to what has been previously described for the preschool children (see 2.1.3.). A total of 10% of the annotations, randomly selected from three participants with 22q11DS and three participants with DLD, were compared with annotations from a second researcher. Annotation agreement was reached in 91.5% of T-units.

. . . Data analysis
Similar to the preschool study, the analyses were performed in Computerized Language Analysis Software (CLAN; MacWhinney, 2000) and SPSS version 28 (IBM Corp, 2021). Independent samples t-tests were done to compare the children with 22q11DS and the children with DLD on the two standardized language measures. Moreover, a paired samples t-test was done to investigate whether there was a discrepancy between expressive grammar (measured with the subtest "recalling sentences") and other language domains (in this case, receptive vocabulary). The data-analysis approach of the spontaneous language of the school-age children corresponded to the approach of the study with preschoolers (see 2.1.4.). The mean percentage of excluded utterances was 17% for interjections/communicators, 1% for onomatopoeia, 4% for unintelligible utterances, and 3% for incomplete sentences. Sample length, calculated as the total number of T-units after exclusions, did not significantly differ between the two groups of children (22q11DS: M = 69, SD = 28; DLD: M = 80, SD = 26; t (27) = 1.09, p = 0.29, d = 0.41). Independent samples t-tests compared scores of the two groups on the main outcome parameters for grammatical accuracy and complexity (Table 2), as well as on the four specific verb-related error categories. As the groups in the school-age study were small, we provided the full statistics for both significant and non-significant results. Effect sizes were interpreted following Cohen (1988).
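The sample-length comparison can be checked from the summary statistics alone. The sketch below computes Cohen's d with a pooled standard deviation and then derives the independent samples t statistic from it. It assumes group sizes of 14 (22q11DS) and 15 (DLD) and equal variances, and the function names are illustrative rather than taken from the authors' SPSS workflow.

```python
import math


def pooled_sd(sd1: float, n1: int, sd2: float, n2: int) -> float:
    """Pooled standard deviation of two independent groups."""
    return math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))


def cohens_d(m1: float, sd1: float, n1: int,
             m2: float, sd2: float, n2: int) -> float:
    """Absolute standardized mean difference between two groups."""
    return abs(m1 - m2) / pooled_sd(sd1, n1, sd2, n2)


def t_from_d(d: float, n1: int, n2: int) -> float:
    """Independent samples t statistic implied by d and the group sizes."""
    return d / math.sqrt(1 / n1 + 1 / n2)


# Reported sample lengths in T-units: 22q11DS M = 69, SD = 28; DLD M = 80, SD = 26.
d = cohens_d(69, 28, 14, 80, 26, 15)
t = t_from_d(d, 14, 15)
df = 14 + 15 - 2
```

With these inputs d comes out at about 0.41 and t at about 1.10 on 27 degrees of freedom, in line with the reported values; the small difference from the reported t of 1.09 presumably reflects rounding of the published means and standard deviations.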

. . . Standardized language measures
The mean scores of the children with 22q11DS and the children with DLD on the expressive grammar test were 5.1 (SD = 2.2, range = 1-8) and 3.9 (SD = 2.0, range = 1-7), respectively. These scores were not significantly different from each other [t (27) = 1.6, p = 0.13, d = 0.58]. On the receptive vocabulary test, the children with 22q11DS scored, on average, 83.1 (SD = 13.7, range = 66-110). The children with DLD had a mean score of 93.2 (SD = 13.6, range = 72-117), which fell just short of significance relative to the children with 22q11DS [t (26) = 2.0, p = 0.06, d = 0.74]. Comparable to the results from the preschool children, the weakest mean scores for both groups were found on expressive grammar. The discrepancy between the expressive grammar and receptive vocabulary scores was larger for the children with DLD than for the children with 22q11DS, as shown by the results of the paired samples t-tests. A significant difference emerged between expressive grammar and receptive vocabulary in the DLD group [t (14) = 7.0, p < 0.001, d = 1.81], whereas this difference did not reach significance in the 22q11DS group [t (12) = 1.0, p = 0.08, d = 0.52].
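Because the two tests use different norm scales, the discrepancy is easier to see after placing both on a common metric. The sketch below converts the reported group means to z-scores. The CELF norm (mean 10, SD 3) is stated in the text; the receptive vocabulary quotient norm (mean 100, SD 15) is an assumption here. Working from group means only approximates the paired analysis, which was run on individual scores.

```python
def z_score(score: float, norm_mean: float, norm_sd: float) -> float:
    """Distance from the norm mean in standard deviation units."""
    return (score - norm_mean) / norm_sd


CELF_NORM = (10.0, 3.0)     # stated in the text
PPVT_NORM = (100.0, 15.0)   # assumed quotient scale

# Reported group means: (expressive grammar, receptive vocabulary).
group_means = {
    "22q11DS": (5.1, 83.1),
    "DLD": (3.9, 93.2),
}

profile = {}
for group, (grammar, vocabulary) in group_means.items():
    z_grammar = z_score(grammar, *CELF_NORM)
    z_vocabulary = z_score(vocabulary, *PPVT_NORM)
    profile[group] = {
        "z_grammar": round(z_grammar, 2),
        "z_vocabulary": round(z_vocabulary, 2),
        # Positive gap: vocabulary is stronger than grammar.
        "gap": round(z_vocabulary - z_grammar, 2),
    }
```

On these assumptions the gap between vocabulary and grammar is roughly 0.5 SD in the 22q11DS group and roughly 1.6 SD in the DLD group, which matches the direction of the paired comparisons reported above.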

. . . Spontaneous language samples
For each of the two groups, the means and standard deviations on all outcome measures for grammatical accuracy and grammatical complexity are presented in Table 6.

. Discussion
Language impairment is characteristic of children with 22q11.2 Deletion Syndrome (22q11DS; Solot et al., 2019), next to other physical and psychological symptoms such as congenital heart defect and low intellectual functioning (McDonald-McGinn et al., 2015). However, the language difficulties of children with 22q11DS have almost exclusively been described with standardized language tests, while the analysis of spontaneous language is more ecologically valid and the preferred method for setting therapy goals in the domain of grammar (Klatte et al., 2022). We aimed to contribute to a more complete overview of the language profile of preschool and school-age children with 22q11DS, conducting two studies in which we complemented standardized language testing with the analysis of spontaneous language. In both studies, we compared children with 22q11DS to age-matched children with Developmental Language Disorder (DLD), who also experience severe language difficulties but for whom the cause is unknown. We focused on children's grammatical skills, as these are typically weak in children with DLD (Leonard, 2014) while relatively unexplored in children with 22q11DS.

. . The language profile of children with 22q11DS
The standardized test results from both the study with preschool children and school-age children confirm that language impairment is common in children with 22q11DS (e.g., Van den Heuvel et al., 2018; Solot et al., 2019; Everaert et al., 2022). Although there was substantial variation within our 22q11DS samples, the mean scores on the standardized subtests were all more than 1 Standard Deviation (SD) below what is expected based on chronological age. In both the preschool and school-age study, the lowest scores were found on the subtests for expressive grammar, with mean scores between 1.7 and nearly 2 SD below the mean. Although this contrasts with previous research on school-age children with 22q11DS (Glaser et al., 2002; Van den Heuvel et al., 2018), which reported a relative weakness in receptive grammar and semantics, differences between the mean subtest scores were small and strong conclusions about relative strengths and weaknesses in the language profile of children with 22q11DS can therefore not be drawn (see also Everaert et al., 2022). In addition, the results from the two studies that we conducted with different age groups do not give reason to assume unique developmental trends for different language domains or modalities in 22q11DS, as was previously suggested (for a discussion, see Van den Heuvel et al., 2018). Although direct comparisons between the age groups should be interpreted with caution, mean norm scores on the two standardized tests that were included in both studies were comparable between the preschool and school-age children with 22q11DS and thus do not point to a developmental shift in the language profile.
The spontaneous language analysis in the preschool study, which included a typically developing (TD) control group, confirmed the findings from the standardized assessments. Hence, the current study shows that language impairment in 22q11DS is also characterized by weak language performance in real-life situations. During play, our 3-6-year-old participants with 22q11DS produced shorter and less complex utterances than their age-matched TD peers. They also made more grammatical errors in both verb- and non-verb-related categories. The low complexity of the spontaneous language that we observed in the children with 22q11DS corresponds to previous results from a narrative and a perspective-taking task (Persson et al., 2006; Van den Heuvel et al., 2017). However, the results from the current study diverge from Persson et al. (2006) with respect to grammatical accuracy. Their 5-8-year-old participants with 22q11DS produced substantially fewer utterances with grammatical errors than both the preschool and school-age participants with 22q11DS of the current study. This could possibly be explained by a relatively short utterance length of the participants of Persson et al. (2006), which, in turn, could result in fewer grammatical errors. However, Persson et al. (2006) used a narrative task to elicit spontaneous language, which is associated with longer utterances and more errors than elicitation methods such as play or conversation that were used in the current study (e.g., Wetherell et al., 2007). A reverse pattern of findings would therefore have been easier to understand. Note that if we compare our findings to Zwitserlood et al. (2015), a Dutch study which also elicited spontaneous language with a narrative task, we do see differences in the expected direction. The participants of Zwitserlood et al. (2015) produced relatively longer/more complex utterances and made relatively more errors than the participants of the current study, in line with results from research comparing different elicitation methods (e.g., Wetherell et al., 2007).

. . Comparing children with 22q11DS to children with DLD
The comparisons of the children with 22q11DS to children with DLD pointed toward differences in their respective receptive language skills and similarities in their expressive language abilities. The preschool children with 22q11DS were outperformed by the children with DLD on the standardized receptive language tests of grammar and vocabulary. A trend in the same direction was observed in the school-age study, which only included one receptive language measure (i.e., receptive vocabulary). We did not find significant differences between the children with 22q11DS and children with DLD on the expressive language tests, in either age group. Like the children with 22q11DS, the children with DLD also scored lowest on the subtests measuring expressive grammar, which was to be expected based on what is known about DLD (e.g., Leonard, 2014). A clear discrepancy between the expressive grammar subtest scores and the scores on the other tested domains was only found in the children with DLD.
The analysis of spontaneous language also revealed that expressive grammar is vulnerable in both 22q11DS and DLD. We did not find evidence for a difference on any of the main outcome parameters gauging grammatical accuracy and complexity between children with 22q11DS and peers with DLD, irrespective of age group. Moreover, the frequency of specific verb-related errors which are known to characterize the spontaneous language of Dutch children with DLD (e.g., De Jong, 1999; Zwitserlood et al., 2015) also did not differ between the groups. In fact, mean scores of the two groups were remarkably close together on many of the outcome variables. This largely confirms the findings from the three previous studies that directly compared children with 22q11DS to children with DLD and also reported substantial overlap between the groups (Kambanaros and Grohmann, 2017; Selten et al., 2021; Van Steensel et al., 2021). Of note, although we were not able to include a TD control group in the school-age study, the overlap in expressive language performance between 22q11DS and DLD suggests that school-aged children with 22q11DS are likely to struggle with language production in naturalistic settings. This confirms the findings in the preschool study.

. . Implications, limitations and future directions
Our findings highlight the necessity to regularly assess and monitor the language development of children with 22q11DS as part of routine clinical care, as recommended by Solot et al. (2019). Given the broad linguistic weaknesses of children with 22q11DS, but also the large individual differences in the severity of these weaknesses, routine assessments from a young age onward are necessary to support early interventions, and, in turn, mitigate the ramifications of language impairment and improve outcomes. Research can contribute to these goals by providing more knowledge on these individual differences and the factors that are associated with those differences (e.g., intellectual functioning, SES, physical symptoms, etc.), which was beyond the scope of the current research. In addition, future research can provide more insight into the developmental trajectory of the language skills of children with 22q11DS. Although our results suggest comparably severe weaknesses in both preschool and school-age groups, a limitation of the current research is the lack of a TD control group in the school-age study as well as the small sample size in this age group. Moreover, the cross-sectional nature of our research does not allow us to draw conclusions about children's developmental trajectories. There is a strong need for longitudinal research on the language impairment of children with 22q11DS in comparison to TD peers, particularly as previous work suggested an increasing severity of receptive language impairment with age (Van den Heuvel et al., 2018) and in light of the observation that intellectual functioning declines during childhood and adolescence in 22q11DS (e.g., Fiksinski et al., 2022).
The current study showed substantial overlap between children with 22q11DS and children with DLD in terms of expressive grammatical skills, as evidenced by both standardized language assessment and spontaneous language analysis. Given inherent differences between children with 22q11DS and children with DLD, this overlap has important theoretical implications. Neither the large differences in intellectual functioning and co-occurring physical symptoms, nor the presence or absence of a known genetic condition, seems to result in differences in the expressive grammatical skills of these two groups of children. Our findings thereby correspond to other studies that showed more commonalities than differences in the grammatical skills of etiologically diverse groups of children (e.g., Bloom and Lahey, 1978; Bol and Kuiken, 1990; Laws and Bishop, 2004; Bol and Kasparian, 2009), and support the consensus among professionals on this topic (Bishop et al., 2016). It appears that there are multiple routes toward impaired grammar development with similar, or even virtually identical, phenotypic characteristics. The shared phenotypic characteristics of children's expressive grammar could be hypothesized to reflect, at least in part, simplification processes that are typical for earlier stages of development. In other words, if acquiring or using grammatical rules is, for whatever reason, difficult, there are common ways to make it easier. The current study was, however, not set up to test this hypothesis and was limited by the use of standardized tests and spontaneous language samples. Comparative research on language impairment in etiologically diverse groups, preferably with experimental designs (see e.g., Perovic et al., 2013), is needed to understand the observed commonalities and differences in children's language profiles.
As mentioned, the current study did not only find similarities in the language profiles of children with 22q11DS and children with DLD. Receptive language difficulties were more severe in children with 22q11DS, showing that, despite overlap, different disorders have their own profile of relative strengths and weaknesses (e.g., Rice et al., 2005; Fidler et al., 2007). Given the poor prognosis of children with receptive language problems (e.g., Snowling et al., 2006; Zambrana et al., 2014) and the uncertainty about the effectiveness of therapy in this group (Law et al., 2003), special attention to these problems in children with 22q11DS is warranted in both research and clinical care. A possible avenue for future research would be to compare children with 22q11DS to a subgroup of children with DLD who have both expressive and receptive language problems. This can provide further insight into the mechanisms underlying (impaired) language development, for example enhancing our knowledge on the relation between low intellectual functioning and receptive language problems. It is also of clinical relevance, as children with 22q11DS and children with DLD often get language support in similar services, such as speech-language therapy and special education (see Boerma et al., 2022). The overlap in expressive grammar of the two groups of children may offer professionals working with children with 22q11DS a starting point for setting therapy goals in the domain of grammar. Moreover, it may even suggest that expressive grammar interventions targeting children with DLD also benefit children with 22q11DS. Although studies directly investigating the effectiveness of interventions in 22q11DS are a crucial next step, a subgroup comparison with children with DLD who have both expressive and receptive language problems could furthermore inform professionals about the usefulness of receptive language interventions with children with 22q11DS.

. . Conclusion
The current study is the first to investigate grammatical accuracy and complexity in the spontaneous language of children with 22q11DS. Complementing spontaneous language analysis with standardized testing in preschool and school-aged children, we showed weak expressive grammar in both naturalistic and standardized test settings, thereby contributing to a more complete description of the language profile of children with 22q11DS. The expressive grammatical skills of the children with 22q11DS did not differ from those of children with DLD, despite clear differences between the two groups in the presence or absence of known etiology and accompanying cognitive and physical challenges. This overlap indicates that expressive grammar may be a shared and significant vulnerability across different populations that can further our knowledge of the mechanisms underlying language acquisition and that can improve clinical care for children such as those with 22q11DS. The observed weaker receptive language skills of the children with 22q11DS compared to the children with DLD show that different disorders are associated with a unique language profile of strengths and weaknesses. It is an open question whether the differences in receptive language are related to factors which inherently differentiate the 22q11DS and DLD groups.

Data availability statement
The datasets generated and/or analyzed for the current study are not publicly available due to GDPR compliance as well as legal and ethical limitations. A limited amount of data can be shared by the corresponding author upon reasonable request. Requests to access the datasets should be directed to t.d.boerma@uu.nl.

Ethics statement
The studies involving human participants were reviewed and approved by the Medical Research Ethics Review Board of the University Medical Center Utrecht. Written informed consent to participate in this study was provided by the participants' legal guardian/next of kin.