SYSTEMATIC REVIEW article

Front. Pediatr., 23 February 2022

Sec. Children and Health

Volume 10 - 2022 | https://doi.org/10.3389/fped.2022.801220

Systematic Review and Meta-Analysis of Screening Tools for Language Disorder

  • Academic Unit of Human Communication, Development, and Information Sciences, Faculty of Education, The University of Hong Kong, Hong Kong, Hong Kong SAR, China


Abstract

Language disorder is one of the most prevalent developmental disorders and is associated with long-term sequelae. However, routine screening is still controversial and is not universally part of early childhood health surveillance. Evidence concerning the detection accuracy, benefits, and harms of screening for language disorders remains inadequate, as shown in a previous review. In October 2020, a systematic review was conducted to investigate the accuracy of available screening tools and the potential sources of variability. A literature search was conducted using CINAHL Plus, ComDisDome, PsycINFO, PsycArticles, ERIC, PubMed, Web of Science, and Scopus. Studies describing, developing, or validating screening tools for language disorder under the age of 6 were included. QUADAS-2 was used to evaluate risk of bias in individual studies. Meta-analyses were performed on the reported accuracy of the screening tools examined. The performance of the screening tools was explored by plotting hierarchical summary receiver operating characteristic (HSROC) curves. The effects of the proxy used in defining language disorders, the test administrators, the screening-diagnosis interval, and the age of screening on screening accuracy were investigated by meta-regression. Of the 2,366 articles located, 47 studies involving 67 screening tools were included. About one-third of the tests (35.4%) achieved at least fair accuracy, while only a small proportion (13.8%) achieved good accuracy. HSROC curves revealed a remarkable variation in sensitivity and specificity for the three major types of screening, which used the child's actual language ability, clinical markers, and both as the proxy, respectively. None of these three types of screening tools achieved good accuracy. Meta-regression showed that tools using the child's actual language as the proxy demonstrated better sensitivity than tools using clinical markers. Tools with long screening-diagnosis intervals had lower sensitivity than those with short intervals. Parent report showed a level of accuracy comparable to that of tools administered by trained examiners. Screening tools used below and above the age of 4 appeared to have similar sensitivity and specificity. In conclusion, there are still gaps between the available screening tools for language disorders and the adoption of these tools in population screening. Future tool development can focus on maximizing accuracy and identifying metrics that are sensitive to the dynamic nature of language development.

Systematic Review Registration:

https://www.crd.york.ac.uk/prospero/display_record.php?RecordID=210505, PROSPERO: CRD42020210505.

Introduction

Language disorder refers to persistent language problems that can negatively affect social and educational aspects of an individual's life (1). It is prevalent, estimated to affect around 7.6% of the population (2). Children with language disorder may experience difficulties in comprehension and/or in the use of expressive language (3). Persistent developmental language disorder not only has a negative impact on communication but is also associated with difficulties in various areas such as behavioral problems (4), socio-emotional problems (5), and academic underachievement (6).

Early identification of persistent language disorder is challenging. There is substantial variability in the trajectories of early language development (7, 8). Some children display consistently low language, some appear to resolve their language difficulties as they grow older, and some demonstrate apparently typical early development but develop late-emerging language disorder. This dynamic nature of early language development introduces difficulties into the identification process in practice (9). Therefore, rather than a one-off assessment, late talkers under 2 years old are recommended to be reassessed later. Referral for evaluation may not be based on positive results in universal screening, but mainly on concerns from caregivers, the presence of extreme deviation in development, or the manifestation of behavioral or psychiatric disturbances under 5 years old (9). Children who have language problems in the absence of the above conditions are likely to be referred for evaluation only after the age of 5, and only then will they usually receive a diagnostic assessment.

Ideally, screening should identify at-risk children early enough to provide intervention and avoid or minimize adverse consequences for them, their families, and society, improving the well-being of the children and the health outcomes of the population at a reasonable cost. Despite the high prevalence and substantial impact of language disorder, universal screening for language disorder is not part of every child health surveillance program. Screening in the early developmental stages is controversial (10). While early identification has been advocated to support early intervention, there are concerns about the net costs and benefits of these early screening exercises. For example, the US Preventive Services Task Force reviewed evidence concerning screening for speech and language delay and concluded that there was inadequate evidence regarding the accuracy, benefits, and harms of screening. The Task Force therefore did not support routine screening in asymptomatic children (11). This has raised concerns among members of the professional community who believe in the benefits of routine screening (12). However, another contributing factor to the Task Force's recommendation was that screening tools for language disorder vary greatly in design and construct, resulting in variable identification accuracy.

Previous reviews of screening tools for early language disorders have shown that these tools make use of different proxies for defining language issues, including a child's actual language ability, clinical markers such as non-word repetition, or both (13). Screening tools have been developed for children at different ages [e.g., toddlers (14) and preschoolers (15)] given the higher stability of language status at a later time point (16, 17). Screening tools also differ in the format of administration. For example, some tools are in the form of a parent-report questionnaire while some have to be administered by trained examiners via direct assessment or observations. Besides the test design, methodological variations have also been noted in primary validation studies, such as the validation sample, the reference standards (i.e., the gold standard for language disorder), and the screening-diagnosis interval. These variations might eventually lead to different levels of screening accuracy, which has been pointed out in previous systematic reviews (10, 13).

These variations have been examined in terms of the screening accuracy (13). Parent-report instruments and trained-examiner screeners have been found to be comparable in screening accuracy. In longitudinal studies in which language disorder status has been validated at various time points, accuracy appears to be lower for longer-term prediction than for concurrent prediction. Although the reviews have provided a comprehensive overview regarding the variations in different language screening tools, the analyses have mainly been based on qualitative and descriptive data. In the current study we performed a systematic review of all currently available screening tools for early language disorders that have been validated against a reference standard. We report on the variations noted in terms of (1) the type of proxy used in defining language disorders, (2) the type of test administrators, (3) the screening-diagnosis intervals and (4) age of screening. Second, we conducted a meta-analysis of the diagnostic accuracy of the screening tools and examined the contributions of the above four factors to accuracy.

Methods

The protocol for the current systematic review was registered at PROSPERO, an international prospective register of systematic reviews (Registration ID: CRD42020210505; record available at https://www.crd.york.ac.uk/prospero/display_record.php?RecordID=210505). Due to COVID-19, the registration was published with only basic automatic eligibility checks by the PROSPERO team. The Preferred Reporting Items for Systematic Reviews and Meta-Analyses for Diagnostic Test Accuracy (PRISMA-DTA) (18) checklist was used as a guide for the reporting of this review.

Search Strategy

A systematic search of the literature was conducted in October 2020 in the following databases: CINAHL Plus, ComDisDome, PsycINFO, PsycArticles, ERIC, PubMed, Web of Science, and Scopus. The major search terms were as follows: Child* OR Preschool* AND “language disorder*” OR “language impairment*” OR “language delay” AND screening OR identif*. To be as exhaustive as possible, records from the earliest available in each database up to October 2020 were retrieved and screened. Appendix A Table A1 shows the detailed search strategy for each database. Articles from previous reviews were also retrieved.

Inclusion and Exclusion Criteria

Titles, abstracts, and then full texts were assessed for eligibility. Cross-sectional or prospective studies validating screening tools, or comparing different screening tools, for language disorders were included in the review. The focus was on screening tools validated with children aged 6 or under from the general population or from referred samples, regardless of the administration format of the tools or how language disorder was defined in the studies. Studies that did not report adequate data on the screening results, and from which accuracy data could not be deduced, were excluded from the review (see Appendix A Table A2 for details).

Data Extraction

Data were extracted by the first author using a standard data extraction form. The principal diagnostic accuracy measures extracted were test sensitivity and specificity. The numbers of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN) were also extracted. In the event of a discrepancy between the text description and the data reported, sensitivity and specificity were recalculated from the 2 × 2 contingency tables. The data extraction process was repeated after the first extraction to improve accuracy. Screening tools with both sensitivity and specificity exceeding 0.90 were regarded as good, and those with both measures exceeding 0.80 but below 0.90 were regarded as fair (19).
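As an illustrative sketch (not the authors' code; the function names are ours), the accuracy measures and the Plante and Vance (19) classification can be computed from a 2 × 2 contingency table as follows:

```python
# Sketch: deriving sensitivity and specificity from a 2x2 contingency table
# and applying the Plante and Vance (19) thresholds. Function names are ours.

def sensitivity(tp: int, fn: int) -> float:
    """Proportion of children with language disorder flagged by the screen."""
    return tp / (tp + fn)

def specificity(tn: int, fp: int) -> float:
    """Proportion of typically developing children passing the screen."""
    return tn / (tn + fp)

def accuracy_band(sn: float, sp: float) -> str:
    """'Good' if both measures exceed 0.90, 'Fair' if both exceed 0.80."""
    if sn > 0.90 and sp > 0.90:
        return "Good"
    if sn > 0.80 and sp > 0.80:
        return "Fair"
    return "Below fair"

# Hypothetical 2x2 table: 27 TP, 3 FN, 9 FP, 61 TN
sn = sensitivity(tp=27, fn=3)   # 0.9
sp = specificity(tn=61, fp=9)   # ~0.87
print(sn, sp, accuracy_band(sn, sp))
```

Note that with these thresholds, a tool with a sensitivity of exactly 0.90 but a specificity of 0.87 is classified as fair, not good, since both measures must exceed 0.90.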

Quality Assessment

Quality assessment of the included articles was conducted by the first author using QUADAS-2 by Whiting et al. (20). QUADAS-2 assists in assessing risk of bias (ROB) in diagnostic test accuracy studies through signaling questions concerning four major domains: patient selection, the index tests (here, the screening tools), the reference standard tests, and patient flow and timing. Ratings of ROB for individual studies were illustrated using a traffic light plot. A summary ROB figure weighted by sample size was generated using the R package “robvis” (21). Due to the large discrepancy in sample size across studies, an unweighted summary plot was also generated to show the ROB of the included studies.

Data Analysis

The overall accuracy of the tools was compared using descriptive statistics. Because sensitivity and specificity are correlated, increasing either one by varying the cut-off for test positivity usually decreases the other. Therefore, a bivariate approach was used to jointly model sensitivity and specificity (22) when generating hierarchical summary receiver operating characteristic (HSROC) curves to assess the overall accuracy of screening by proxy and by screening-diagnosis interval. HSROC is a more robust method that accounts for both within- and between-study variability (23).
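The analyses themselves were run in R with the mada package (see Data Analysis below); as a minimal language-neutral sketch of the idea, the bivariate model of Reitsma et al. (22) operates on logit-transformed (sensitivity, specificity) pairs from each study and fits their joint distribution, rather than pooling each measure separately. The study values below are hypothetical:

```python
# Sketch of the input to a bivariate meta-analysis: per-study (sensitivity,
# specificity) pairs are logit-transformed, and the model fits a joint
# random-effects distribution over these pairs. Values are hypothetical.
import math

def logit(p: float) -> float:
    """Log-odds transform used by the bivariate model."""
    return math.log(p / (1 - p))

studies = [(0.92, 0.48), (0.46, 0.94), (0.87, 0.45), (0.91, 0.95)]

pairs = [(logit(sn), logit(sp)) for sn, sp in studies]
for (lsn, lsp), (sn, sp) in zip(pairs, studies):
    print(f"SN={sn:.2f} -> logit {lsn:+.2f};  SP={sp:.2f} -> logit {lsp:+.2f}")
```

Modeling the transformed pairs jointly preserves the trade-off between sensitivity and specificity across cut-offs, which separate univariate pooling would ignore.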

Three factors that could be associated with screening accuracy, chosen a priori, were included in the meta-analysis: proxy used, test administrators, and screening-diagnosis interval. The effect of screening age on accuracy was also evaluated. The effect of each variable was evaluated using a separate regression model. Proxy used was a categorical variable with the categories “child's actual language,” “performance in clinical markers,” and “both actual language and performance in clinical markers.” Test administrator was also a categorical variable with the categories “parent” and “trained examiner.” The screening-diagnosis interval was dichotomized: intervals within 6 months were categorized as evaluating concurrent validity, whereas intervals of more than 6 months were categorized as evaluating predictive validity. Screening age was also dichotomized with age 4 as the cut-off, contrasting tools for children under the age of 4 with tools for children aged 4 or above. This categorization was primarily based on the age range of the sample or the target screening age reported by the authors. Studies whose age range spanned age 4 were excluded from this analysis. Considering the different thresholds used across studies and the correlated nature of sensitivity and specificity, meta-regression was conducted using a bivariate random-effects model based on Reitsma et al. (22).

For studies examining multiple index tests and/or multiple cut-offs using the same population, only one screening test per category per study was included in the HSROC and meta-regression models. The test or cut-off with the highest Youden's index was included in the meta-analytical models. Youden's index, J, was defined as J = sensitivity + specificity - 1.
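To illustrate this selection rule (a sketch in Python, not the authors' R code; the cut-off labels and accuracy figures are hypothetical), the cut-off maximizing Youden's J is retained per study:

```python
# Sketch of the per-study selection rule: among multiple cut-offs (or index
# tests) validated on the same sample, keep the one maximizing Youden's J.
# The cut-off labels and (sensitivity, specificity) values are hypothetical.

def youden_j(sn: float, sp: float) -> float:
    """Youden's index: J = sensitivity + specificity - 1."""
    return sn + sp - 1

cutoffs = {
    "cut-off A": (0.86, 0.88),
    "cut-off B": (0.46, 0.93),
    "cut-off C": (0.91, 0.74),
}

best = max(cutoffs, key=lambda c: youden_j(*cutoffs[c]))
print(best, round(youden_j(*cutoffs[best]), 2))  # cut-off A 0.74
```

J balances sensitivity against specificity on an equal footing, so selecting by J avoids favoring a cut-off that trades almost all of one measure for the other.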

All data analyses were conducted in RStudio Version 1.4.1106 using the package mada (24). A sensitivity analysis excluding studies with a very high ROB (high-risk ratings in two or more domains) was carried out to assess their influence on the results.

Results

A total of 2,351 articles, including 815 duplicates, were located using the search strategies, and an additional 15 articles were identified from previous review articles. After the inclusion and exclusion criteria were applied, a final sample of 47 studies was identified for inclusion in the review. Figure 1 shows the number of articles included and excluded at each stage of the literature search.

Figure 1

Risk of Bias

The weighted overall ROB assessment for the 47 studies is shown in Figure 2A, and the individual rating for each study is shown in Appendix B. Overall, half of the data was exposed to a high ROB in the administration and interpretation of the reference standard test, while almost two-thirds of the data had a high ROB in the flow and timing of the study. As indicated by the unweighted overall ROB summary plot in Figure 2B, half of the 47 studies were unclear about whether the administration and interpretation of the reference standard test would introduce bias. This was mainly attributable to a lack of reporting of the reference standard test performance. About half of the studies had a high ROB in the flow and timing of the study. This usually arose from a highly variable or lengthy follow-up period.

Figure 2

Types and Characteristics of Current Screening Tools for Language Disorder

A total of 67 different index tests (or indices) were evaluated in the 47 included articles. The tests were either standalone tests or part of a larger developmental test. The majority (50/67, 74.6%) of the screening tools examined children's actual language. Thirty of these index tests involved parents or caregivers as the main informants. Some of these screening tools were questionnaires with yes-no questions on children's prelinguistic skills, receptive language, or expressive language based on parents' observations. Some used a vocabulary checklist (e.g., CDI, LDS) on which parents checked off the vocabulary their child was able to comprehend and/or produce. Some tools also asked parents to report their child's longest utterances based on their observations, from which indices were generated. The other 20 index tests targeting language areas were administered by trained examiners such as nurses, pediatricians, health visitors, or speech-language pathologists (SLPs). These screening tools were constructed as checklists, observational evaluations, or direct assessments, tapping into children's developmental milestones, their word combinations, and/or their comprehension, expression, and/or articulation. Some of these direct assessments used objects or pictures as testing stimuli.

A small proportion (3/67, 4.48%) of the tests evaluated performance on clinical markers, including non-word repetition and sentence repetition, rather than children's actual structural language skills or communication skills. About nine percent (6/67, 8.96%) screened for both language abilities and clinical markers. Both types of tests required trained examiners to administer them. The tests usually made use of a sentence repetition task, and one test also included non-word repetition. Another nine percent (6/67, 8.96%) utilized indices from language sampling, such as the percentage of grammatical utterances (PGU), mean length of utterance in words (MLU3-W), and number of different words (NDW), as proxies. These indices represented a child's syntactic, semantic, or morphological performance. The smallest proportion (2/67, 2.99%) of the tests elicited parental concerns about the children being screened for language disorder. One asked parents to rate their concern using a visual analog scale, while the other involved interviews with the parents by a trained examiner.

Sixty-five of the 67 screening tools reported concurrent validity. Tables 1–5 summarize the characteristics of these tools by the proxy used. Nine studies investigated the predictive validity of screening tools; Table 6 summarizes these studies. All of them used the child's actual language ability as the proxy.

Table 1

References | Agent | Index test | Reference standard test(s) | Sc. age (months)a | N | SN | SP | Accuracyb | Included in meta-analysis
Allen and Bliss (25) | Trained personnel | The Northwestern Syntax Screening Test (26) | Sequenced Inventory of Communication Development (27) | 36–47 | 182 | 0.92 | 0.48 | Below fair |
Blaxley et al. (28) | Trained personnel | Bankson Language Screening Test (29) | Developmental Sentence Scoring (30) | 48–72 | 90 | 0.46 | 0.94 | Below fair |
Burden et al. (31) | Parents/caregivers | The Parent Language Checklist and The Developmental Profile II (32) | Action Picture Test (33), Bus Story Test (34), self-developed test on receptive and phonological ability | 36–39 | 425 | 0.87 | 0.45 | Below fair |
Carscadden et al. (35) | Parents/caregivers | Speech and Language Pathology Early Screening Instrument (35) | Receptive Expressive Emergent Language Test – 3rd Edition (36) | 17–23 | 53 | 0.91 | 0.95 | Good |
Chaffee et al. (37) | Parents/caregivers | Minnesota Child Development Inventory – Comprehension Conceptual Language | Reynell Developmental Language Scales – revised (38) | 24–87, M = 49 | 152 | 0.76 | 0.63 | Below fair |
 | | Minnesota Child Development Inventory – Expressive Language (39) | | | | 0.89 | 0.45 | Below fair | ×
Dias et al. (40) | Parents/caregivers | Screening Tool by ASHA (41) | ABFW test (42) | 0–60 | 962 | 0.83 | 0.99 | Fair |
Dixon et al. (43) | Trained personnel | The Hackney Early Language Screening Test (43) | Reynell Developmental Language Scales (44), Lowe and Costello Symbolic Play Test (45) | 30 | 40 | 0.94 | 0.95 | Good |
Gray et al. (46) | Trained personnel | Expressive One-Word Picture Vocabulary Test (47) | Referred by speech-language pathologist | 48–60 | 62 | 0.71 | 0.71 | Below fair | ×
 | | Peabody Picture Vocabulary Test – III (48) | | | | 0.74 | 0.71 | Below fair | ×
 | | Receptive One-Word Picture Vocabulary Test (49) | | | | 0.77 | 0.77 | Below fair |
 | | Expressive Vocabulary Test (50) | | | | 0.71 | 0.68 | Below fair | ×
Guiberson (14) | Parents/caregivers | Parent-reported vocabulary | Bilingual early childhood assessment team identification, parent report of concern, Spanish Preschool Language Scale – 4th Edition (51) | 24–35 | 62 | 0.86 | 0.88 | Fair |
 | | Parent report of mean length of child's three longest utterances | | | | 0.46 | 0.93 | Below fair | ×
Guiberson and Rodriguez (52) | Parents/caregivers | Pilot Inventories III, translated version of MacArthur-Bates Communicative Development Inventory-III (53) | Spanish Preschool Language Scale – 4th Edition (51) | 36–62, M = 45.5 | 48 | 0.82 | 0.81 | Fair |
 | | Ages and Stages Questionnaire – communication subscales (54) | | | | 0.59 | 0.92 | Below fair | ×
Guiberson et al. (55) | Parents/caregivers | Reported children's three longest utterances | Parent concern, enrollment in speech-language intervention services, Spanish Preschool Language Scale – 4th Edition (51) | 24–35, M = 29.4 | 45 | 0.91 | 0.86 | Fair |
 | | Ages and Stages Questionnaire – communication subscales (56) | | | | 0.56 | 0.95 | Below fair | ×
 | | The Inventarios del Desarrollo de Habilidades Communicatives Palabras u Enunciado (57) | | | | 0.87 | 0.86 | Fair | ×
Guiberson et al. (58) | Parents/caregivers | Vocabulary score | SLP assessment, parental concern, Spanish Preschool Language Scale – 4th Edition (51) | 37–69, M = 53.7 | 82 | 0.79 | 0.77 | Below fair |
 | | Language questions | | | | 0.74 | 0.69 | Below fair | ×
Heilmann et al. (59) | Parents/caregivers | MacArthur-Bates Communicative Development Inventory – Words and Sentences (60) | Preschool Language Scale – 3rd Edition (61), language sampling | 24, M = 23.8 | 100 | 0.68 | 0.98 | Below fair |
Klee et al. (62) | Parents/caregivers | The Language Development Survey (63) | Mullen Scales of Early Learning (64), language sampling, parent interview, direct observation | 24–26, M = 24.7 | 64 | 0.91 | 0.87 | Fair | ×c
Klee et al. (65) | Parents/caregivers | The Language Development Survey (63) | Mullen Scales of Early Learning (64), language sampling, parent interview, direct observation | 24–26, M = 24.7 | 64 | 0.91 | 0.96 | Good |
Laing et al. (66) | Trained personnel | Structured Screening Test | Reynell Developmental Language Scales – III (67) | 30–36, M = 32 | 282 | 0.66 | 0.89 | Below fair |
Law (68) | Trained personnel | Structured Screening Test | Reynell Developmental Language Scales (2nd revision) (44) | 30 | 189 | 0.86 | 0.76 | Below fair |
Levett and Muir (69) | Trained personnel | Levett-Muir Language Screening Test (69) | Reynell Developmental Language Scales (revised) (70), Goldman-Fristoe Test of Articulation (71), Language Assessment and Remediation Procedure (72) | 34.9–39.6 | 42 | 1 | 1 | Good |
Visser-Bochane et al. (73) | Parents/caregivers | Early Language Screen (73) | LLC (74), SLC (75), LLP (76), SWP, SSP (77), LS-CCS (78), CCC-PCS (79) | 12–72 | 124 | 0.79 | 0.86 | Below fair |
Visser-Bochane et al. (80) | Trained personnel | The Dutch well-child language screening protocol (80) | SLC (75), SWP, SSP (77) | 26 | 265 | 0.62 | 0.93 | Below fair |
Mattsson et al. (81) | Parents/caregivers and trained personnel | Questionnaire and direct observation by nurse | Clinical examination by SLP | 28–32, M = 30 | 105 | 0.81 | 0.87 | Fair |
McGinty (82) | Parents/caregivers and trained personnel | The Mayo Early Language Screening Test (83) | Reynell Developmental Language Scales (44), Edinburgh Articulation Test (84) | 18–60 | 200 | 0.84 | 0.7 | Below fair |
Nair et al. (85) | Trained personnel | The Language Evaluation Scale Trivandrum for 0–3 Years (85) | Receptive-Expressive Emergent Language Scale (86) | 0–36 | 643 | 0.96 | 0.78 | Below fair |
Nayeb et al. (87) | Trained personnel | Nurse screening | Clinical examination by SLP | 29–31 | 100 | 1 | 0.85 | Fair |
Puglisi et al. (15) | Trained personnel | Screening for Identification of Oral Language Difficulties by Preschool Teachers (15) | Expressive Vocabulary Test (88), Test for Reception of Grammar Version 2 (89), The Brazilian Children's Test of Pseudoword Repetition (90) | 51–65, M = 57 | 100 | 0.86 | 0.95 | Fair |
Rescorla (63) | Parents/caregivers | The Language Development Survey (63) | Reynell Developmental Language Scales (38) | 23.7–34.4, M = 25.9 | 81 | 0.76 | 0.89 | Below fair |
Rescorla and Alley (91) | Parents/caregivers | The Language Development Survey (63) | Reynell Developmental Language Scales (44) | 23.7–34.4, M = 25.9 | 66 | 0.89 | 0.77 | Below fair |
Sachse and Von Suchodoletz (92) | Parents/caregivers | German version of the CDI, Toddler Form-2 (93) | Language Test for 2-Year-Old Children (94) | 24–26 | 117 | 0.93 | 0.87 | Fair |
Stokes (95) | Trained personnel | Nurse screen | Language sampling, Reynell Developmental Language Scales (70) | 34–40 | 366 | 0.77 | 0.97 | Below fair |
 | Parents/caregivers | Parent questionnaire | | | | 0.75 | 0.95 | Below fair | ×
van Agt et al. (96) | Parents/caregivers | Van Wiechen (96) | Specialists' judgement | 26–58, M = 39 | 8,877 | 0.71 | 0.89 | Below fair |
 | | General Language Screen (97) | | | | 0.81 | 0.78 | Below fair | ×
 | | Language Screening Instrument – Parent Form (98) | | | | 0.86 | 0.73 | Below fair | ×
 | Trained personnel | Language Screening Instrument – Child Test (98) | | | | 0.54 | 0.88 | Below fair | ×
Walker et al. (99) | Parents/caregivers | Early Language Milestone Scale (100) | Sequenced Inventory of Communication Development (27) | 0–36 | 77 | 0.77 | 0.85 | Below fair |
Wetherby et al. (101) | Parents/caregivers | Communication and Symbolic Behavior Scales – Developmental Profile, Infant-Toddler Checklist (102) | Behavior sample | 12–24, M = 14.5 | 151 | 0.89 | 0.74 | Below fair |

Studies involving tools based on a child's actual language ability.

For tests that were validated against multiple cut-offs, only the one with the highest Youden's index is shown; Sc. age, screening age; MA, meta-analysis; ASHA, American Speech-Language-Hearing Association; ABFW, Andrade CRF, Befi-Lopes DM, Fernandes FDM, Wertzner HF. Teste de Linguagem Infantil nas Áreas de Fonologia, Vocabulário, Fluência e Pragmática. 2nd ed. Barueri: Pró-Fono, 2011; LLC, Lexilist Comprehension; SLC, Schlichting test for Language Comprehension; LLP, Lexilist Production; SWP, Schlichting test for Word Production; SSP, Schlichting test for Sentence Production; LS-CCS, Schlichting test for Language Composite Score; CCC-PCS, CCC-2-NL Pragmatic Composite Score.

a

Age of screening is reported as a range or mean in the form X1–X2 and M = X3; where neither is reported, the intended screening age of the tool is reported as X4.

b

Based on Plante and Vance (19), Fair = over 0.8 in both sensitivity and specificity; Good = over 0.9 in both sensitivity and specificity.

c

Not included because the sample was identical to Klee et al. (65).

Table 2

References | Agent | Index test | Reference standard test(s) | Sc. age (months)a | N | SN | SP | Accuracyb | Included in meta-analysis
Guiberson et al. (58) | Trained personnel | Non-word repetition | SLP assessment, parental concern, Spanish Preschool Language Scale – 4th Edition (51) | 37–69, M = 53.7 | 82 | 0.74 | 0.75 | Below fair |
Kapalkova et al. (103) | Trained personnel | Non-word repetition | Clinical judgment and qualitative assessment | 51–66 | 32 | 0.94 | 1 | Good |
Nash et al. (104) | Trained personnel | The Grammar and Phonology Screening (GAPS) Test (105) | Clinical Evaluation of Language Fundamentals – Preschool, 2nd Edition (106) | 36–72, M = 62.3 | 106 | 0.3 | 0.91 | Below fair |
Sturner et al. (107) | Trained personnel | The Sentence Repetition Screening Task (108) | Illinois Test of Psycholinguistic Abilities (109), Bankson Language Screening Test (29) | 54–66, Med = 60 | 323 | 0.62 | 0.91 | Below fair |
van der Lely et al. (110) | Trained personnel | The Grammar and Phonology Screening (GAPS) Test (105) | Assessment by SLP and educational psychologist | 43–80 | 41 | 1 | 1 | Good |

Studies involving tools based on clinical marker.

For tests that were validated against multiple cut-offs, only the one with the highest Youden's index is shown; Sc. age, screening age.

a

Age of screening is reported as a range, mean, or median in the form X1–X2, M = X3, or Med = X4, respectively.

b

Based on Plante and Vance (19), Fair = over 0.8 in both sensitivity and specificity; Good = over 0.9 in both sensitivity and specificity.

Table 3

Study | Agent | Index test | Reference standard test(s) | Sc. age (months)a | N | SN | SP | Accuracyb | Included in meta-analysis
Allen and Bliss (25) | Trained personnel | The Fluharty Preschool Screening Test (111) | Sequenced Inventory of Communication Development (112) | 36–47 | 182 | 0.6 | 0.81 | Below fair |
Benavides et al. (113) | Trained personnel | Tamiz de Problemas de Lenguaje (113) | Clinical Evaluation of Language Fundamentals – 5th Edition, Spanish Version (114) | 48–72 | 200 | 0.94 | 0.92 | Good |
Blaxley et al. (28) | Trained personnel | The Fluharty Preschool Screening Test (115) | Developmental Sentence Scoring (116) | 48–72 | 90 | 0.36 | 0.96 | Below fair |
Bliss and Allen (117) | Trained personnel | The Screening Kit of Language Development (118) | Sequenced Inventory of Communication Development (112), clinical judgment by SLP | 30–48 | 100 | 1 | 0.93 | Good |
Lavesson et al. (119) | Trained personnel | Language tasks and non-word repetition (119) | SLP judgment based on test results | 46–53, M = 48.5 | 328 | 0.84 | 0.96 | Fair |
Matov et al. (120) | Trained personnel | Short Language Measures (121) | Clinical Evaluation of Language Fundamentals-4 (122) | 63.6 | 126 | 0.94 | 0.93 | Good |
Wright and Levin (123) | Trained personnel | Preschool Articulation and Language Screening (123) | SLP judgement based on test results | 26–81 | 152 | 0.71 | 0.94 | Below fair |

Studies involving tools based on both language ability and clinical marker.

For tests that were validated against multiple cut-offs, only the one with the highest Youden's index is shown; Sc. age, screening age.

a

Age of screening is reported as a range or mean in the form X1–X2 and M = X3; where neither is reported, the intended screening age of the tool is reported as X4.

b

Based on Plante and Vance (19), Fair = over 0.8 in both sensitivity and specificity; Good = over 0.9 in both sensitivity and specificity.

Table 4

References | Agent | Index test | Reference standard test(s) | Sc. age (months)a | N | SN | SP | Accuracyb | Included in meta-analysis
Eisenberg and Guo (124) | Trained personnel | Percentage Grammatical Utterances | LI2: previously diagnosed; LI3: parent rating, Structured Photographic Expressive Language Test – Preschool 2nd Edition (125) | 36–47 | 34 | 1 | 0.88 | Fair |
 | | Percentage Sentence Point | | | 34 | 1 | 0.82 | Fair | ×
 | | Percentage Verb Tense Usage (126) | | | 34 | 1 | 0.82 | Fair | ×
Guiberson et al. (58) | Trained personnel | Ungrammaticality Index | SLP assessment, parental concern, Spanish Preschool Language Scale – 4th Edition (51) | 37–69, M = 53.7 | 82 | 0.59 | 0.67 | Below fair | ×
 | Trained personnel | Mean Length of Utterances in Words | | | | 0.65 | 0.92 | Below fair |
Guiberson (14) | Parents/caregivers | Number of Different Words | Bilingual early childhood assessment team identification, parent report of concern, Spanish Preschool Language Scale – 4th Edition (51) | 24–35 | 62 | 0.73 | 0.83 | Below fair |

Studies involving tools based on language sampling.

For tests that were validated against multiple cut-offs, only the one with the highest Youden's index is shown; Sc. age, screening age; LI2, language impairment at age 2; LI3, language impairment at age 3.

a

Age of screening is reported as a range or mean in the form X1–X2 and M = X3; where neither is reported, the intended screening age of the tool is reported as X4.

b

Based on Plante and Vance (19), Fair = over 0.8 in both sensitivity and specificity; Good = over 0.9 in both sensitivity and specificity.

Table 5

References | Agent | Index test | Reference standard test(s) | Sc. age (months)a | N | SN | SP | Accuracyb | Included in meta-analysis
Laing et al. (66) | Parents/caregivers | Parent-led method | Reynell Developmental Language Scales – III (67) | 30–36, M = 32 | 176 | 0.79 | 0.74 | Below fair |
van Agt et al. (96) | Parents/caregivers | Visual analog scale to evaluate child's language development | Specialists' judgement | 26–58, M = 39 | 8,877 | 0.76 | 0.81 | Below fair |

Studies involving tools based on parental concern.

For tests that were validated against multiple cut-offs, only the one with the highest Youden's index is shown; Sc. age, screening age.

a

Age of screening is reported as a range or mean in the form X1–X2 and M = X3; where neither is reported, the intended screening age of the tool is reported as X4.

b

Based on Plante and Vance (19), Fair = over 0.8 in both sensitivity and specificity; Good = over 0.9 in both sensitivity and specificity.

Table 6

| References | Agent | Index test | Sc. age (months) | Sc-V int. (months) | F/U age (months) | Reference standard test(s) | N | SN | SP | Accuracyᵃ | MA included |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Bruce et al. (127) | Parents/caregivers and trained personnel | Direct assessment through play and parent questionnaire | 18–22 | NA | 54 | NELLI (128)ᵇ, The Test for Reception of Grammar (129) | 43 | 0.6 | 0.85 | Below fair |  |
| Frisk et al. (130) | Trained personnel | Early Screening Profiles (131) | 54 | NA | 60 | Preschool Language Scale – 4th Edition (132) | 110 | 0.86 | 0.81 | Fair |  |
|  | Parents/caregivers | Ages and Stages Questionnaire (54) |  |  |  | Bracken Basic Concepts Scale | 110 | 0.84 | 0.66 | Below fair | × |
|  | Trained personnel | Battelle Developmental Inventory Screening Test (133) |  |  |  | Preschool (134) Language Scale – 4th Edition (132) | 110 | 0.68 | 0.86 | Below fair | × |
|  | Trained personnel | Brigance Preschool Screen (135) |  |  |  | Preschool Language Scale – 4th Edition (132) | 110 | 0.91 | 0.78 | Below fair | × |
| Jessup et al. (136) | Trained personnel | Kindergarten Development Check (137) | 48–54 | 8–12 | NA | Clinical Evaluation of Language Fundamentals – 4 (122) | 286 | 0.5 | 0.93 | Below fair |  |
| Klee et al. (62) | Parents/caregivers | The Language Development Survey (63) | 24 | NA | 36–40 | Mullen Scales of Early Learning (64), language sampling, parent interview, direct observation | 36 | 0.67 | 0.9 | Below fair |  |
| Pesco and O'Neill (138) | Parents/caregivers | Language Use Inventory (139) | 24–47 | 14.54–54.76 | NA | DELV-NR (140), CELF-2 (141), Children's Communication Checklist – 2nd Edition (142) | 236 | 0.81 | 0.93 | Fair |  |
| Sachse and Von Suchodoletz (92) | Parents/caregivers | German version of the CDI, Toddler Form-2 (93) | 24–26 | 12 | NA | Language Test for 3–5-Year-Old Children (94) | 102 | 0.94 | 0.61 | Below fair | × |
|  | Trained personnel | Language Test for 2-Year-Old Children (94) | 24–26 | 12 | NA | Language Test for 3–5-Year-Old Children (94) | 102 | 0.94 | 0.64 | Below fair |  |
| Visser-Bochane et al. (80) | Trained personnel | The Dutch well-child language screening protocol (80) | M = 26 | 12 | NA | SLC (75), SWP, SSP (77) | 123 | 0.82 | 0.74 | Below fair |  |
| Westerlund et al. (143) | Parents/caregivers | The Swedish Communication Screening at 18 Months of Age (144, 145) | 18 | NA | 36 | LO-3 (146, 147) | 891 | 0.5 | 0.9 | Below fair |  |
|  | Trained personnel | Traditional methods | 18 | NA | 36 | LO-3 (146, 147) | 1,189 | 0.32 | 0.91 | Below fair | × |
| Wetherby et al. (101) | Parents/caregivers | Communication and Symbolic Behavior Scales – Developmental Profile Infant-Toddler Checklist (102) | 12–24 | M = 14.5 | NA | Mullen Scales of Early Learning (148), Preschool Language Scale – 3rd Edition (61) | 246 | 0.81 | 0.79 | Below fair | × |
|  | Trained personnel | Behavioral sample | 12–24 | M = 18.2 | NA | Mullen Scales of Early Learning (148), Preschool Language Scale – 3rd Edition (61) | 90 | 0.84 | 0.85 | Fair |  |

Studies assessing predictive validity of screening tools.

For tests that were validated against multiple cut-offs, only the one with the highest Youden's index is shown; Sc. Age, screening age; Sc-V int., screening-validation interval; F/U age, age at follow-up; DELV-NR, Diagnostic Evaluation of Language Variation – Norm Referenced; CELF-2, Clinical Evaluation of Language Fundamentals – Preschool, 2nd Edition; LO-3, Language Observation at 3 years of age.

a Based on Plante and Vance (19): Fair = over 0.8 in both sensitivity and specificity; Good = over 0.9 in both sensitivity and specificity.

b Språklig snabbscreening av förskolebarn 3–6 år – underlag för diagnostisering av art och grad av språkstörning; Stora Fonemtestet. Pedagogisk; Grammatiktest. Pedagogisk.

c Based on Table 5 in the paper; the description in the discussion differed from the figures in the table.

Screening Accuracy

Two of the 67 screening tools reported only predictive validity. Of the 65 screening tools that reported concurrent validity, about one-third (23/65, 35.4%) achieved at least fair accuracy and a smaller proportion (9/65, 13.8%) achieved good accuracy. The nine tools that achieved good accuracy were: (i) Non-word Repetition, (ii) Speech and Language Pathology Early Screening Instrument (35), (iii) The Hackney Early Language Screening Test (43), (iv) The Language Development Survey (63), (v) Levett-Muir Language Screening Test (69), (vi) The Grammar and Phonology Screening (GAPS) Test (105), (vii) Tamiz de Problemas de Lenguaje (113), (viii) The Screening Kit of Language Development (117), and (ix) Short Language Measures (120).

Screening Performance by Proxy and Screening-Diagnosis Interval

Screening tools based on children's actual language ability had a sensitivity ranging from 0.46 to 1 (median = 0.81) and a specificity ranging from 0.45 to 1 (median = 0.86). About 30% of the studies showed that their tools achieved at least fair accuracy, while 8.89% achieved good accuracy. Screening tools using clinical markers had a sensitivity ranging from 0.3 to 1 (median = 0.71) and a specificity ranging from 0.45 to 1 (median = 0.91). Two of the five studies1 (40%) evaluating screening tools based on clinical markers showed that their tools had good sensitivity and good specificity, but the other three studies showed sensitivity and specificity below fair. For screening tools based on both actual language ability and clinical markers, sensitivity ranged from 0.36 to 1 (median = 0.84) and specificity ranged from 0.81 to 0.96 (median = 0.93); more than half of these studies2 (4/7, 57.1%) achieved at least fair performance in both sensitivity and specificity, and three of the seven achieved good performance. Screening tools based on indices from language sampling had a sensitivity ranging from 0.59 to 1 (median = 0.865) and a specificity ranging from 0.67 to 0.92 (median = 0.825). Half of these six screening tools achieved fair accuracy, but none achieved good accuracy. Neither of the two screening tools based on parental concern achieved at least fair screening accuracy.
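The sensitivity, specificity, and accuracy labels reported above all derive from an ordinary 2×2 screening table, and the per-test cut-off shown in the tables is the one maximizing Youden's index. A minimal sketch of those computations, using entirely hypothetical counts and cut-off names rather than data from any included study:

```python
def screen_metrics(tp, fn, fp, tn):
    """Sensitivity, specificity and Youden's index from a 2x2 screening table."""
    sensitivity = tp / (tp + fn)   # proportion of affected children flagged
    specificity = tn / (tn + fp)   # proportion of unaffected children passed
    youden = sensitivity + specificity - 1
    return sensitivity, specificity, youden

# Hypothetical 2x2 counts (tp, fn, fp, tn) for the same test at two cut-offs.
candidates = {
    "cutoff_a": (18, 2, 30, 70),
    "cutoff_b": (16, 4, 10, 90),
}

# The review retained, per test, the cut-off with the highest Youden's index.
best = max(candidates, key=lambda c: screen_metrics(*candidates[c])[2])
```

Retaining only the Youden-optimal cut-off is also what makes pooled accuracy optimistic, a point returned to in the limitations.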

Fifteen of the 65 studies also reported predictive validity, with a sensitivity ranging from 0.32 to 0.94 (median = 0.81) and a specificity ranging from 0.61 to 0.93 (median = 0.85). Three of the tools (20%) achieved at least fair accuracy in both sensitivity and specificity, but none of them were considered to have good accuracy.

Test Performance Based on HSROC

Three HSROC curves were generated for screening tools assessing concurrent validity, based on the child's language ability, on clinical markers, and on both, respectively. Two HSROC curves were generated for screening tools administered by trained examiners and by parents/caregivers, respectively. Two HSROC curves were generated for screening under and over the age of 4, respectively. A separate HSROC curve was generated for screening tools assessing predictive validity. Screening based on indices from language sampling (n = 3) or parental concern (n = 2) was excluded from the HSROC analysis due to the small number of primary studies.

Figure 3 shows the overall performance of screening tools based on language ability, clinical markers, and both. Visual inspection of the plotted points and confidence regions revealed considerable variation in accuracy across all three major types of screening tools. The summary estimates and confidence regions indicated that screening tools based on language ability achieved fair specificity (false positive rate < 0.2) but fair-to-poor sensitivity. Screening tools based on clinical markers showed considerable variation in both sensitivity and specificity, with both measures ranging from good to poor. Screening tools based on both language ability and clinical markers achieved good-to-fair specificity but fair-to-poor sensitivity. Figure 4 shows the overall performance of screening tools administered by parents/caregivers or by trained examiners. Visual inspection revealed that both types achieved fair-to-poor sensitivity and good-to-fair specificity. Figure 5 shows the overall performance of screening for children under and over the age of 4, respectively. Visual inspection revealed that screening under age 4 achieved good-to-poor sensitivity and specificity, while screening over age 4 achieved good-to-poor sensitivity and good-to-fair specificity. Figure 6 shows the performance of the screening tools evaluating predictive validity. These tools achieved fair-to-poor sensitivity and specificity.
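The summary points on these plots come from modelling study-level sensitivity and false positive rate jointly on the logit scale, following the bivariate approach of Reitsma et al. (22) as implemented in the mada package. The sketch below shows only the core transformation, with made-up study data: it averages naively on the logit scale and ignores the random-effects and correlation structure of the full model.

```python
import math

def logit(p):
    return math.log(p / (1 - p))

def inv_logit(x):
    return 1 / (1 + math.exp(-x))

# Made-up (sensitivity, specificity) pairs for a handful of studies.
studies = [(0.81, 0.86), (0.70, 0.91), (0.92, 0.78), (0.64, 0.95)]

# Simplified summary: average sensitivity and false positive rate on the
# logit scale, then back-transform to obtain a point on the ROC plane.
mean_logit_sens = sum(logit(se) for se, _ in studies) / len(studies)
mean_logit_fpr = sum(logit(1 - sp) for _, sp in studies) / len(studies)

summary_sens = inv_logit(mean_logit_sens)  # summary sensitivity
summary_fpr = inv_logit(mean_logit_fpr)    # summary false positive rate
```

Working on the logit scale keeps the pooled values inside (0, 1) and is why the regression tables report "transformed sensitivity" and "transformed false positive rate".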

Figure 3

Figure 4

Figure 5

Figure 6

Meta-Regression Investigating Effects of Screening Proxy, Test Administrator, Screening-Diagnosis Interval, and Age of Screening

The effects of screening proxy, test administrator, screening-diagnosis interval, and age of screening on screening accuracy were investigated using bivariate meta-regression. Table 7 summarizes the results. Screening tools with a screening-diagnosis interval of <6 months (i.e., concurrent validity) were associated with higher sensitivity than those with an interval longer than 6 months (i.e., predictive validity). Tools using language ability as the proxy showed marginally significantly higher sensitivity than those based on clinical markers. Screening tools based on language ability and those based on both language ability and clinical markers appeared to show a similar degree of sensitivity. For tools assessing concurrent validity, screening under the age of 4 had higher sensitivity with marginal statistical significance but similar specificity compared with screening over the age of 4. For tools assessing predictive validity, screening under and over the age of 4 appeared to show similar sensitivity and specificity. Likewise, screening tools relying on parent report and those conducted by trained examiners appeared to show similar sensitivity. Despite the large variability in specificity, none of the factors in the meta-regression model explained this variability.

Table 7

| Factor | Coeff. (sens.) | 95% CI (sens.) | p-value (sens.) | Coeff. (FPR) | 95% CI (FPR) | p-value (FPR) |
| --- | --- | --- | --- | --- | --- | --- |
| Types (L vs. Cm) | 0.657 | (−0.055, 1.370) | 0.070# | 0.325 | (−0.774, 1.423) | 0.562 |
| Types (L vs. Mx) | −0.300 | (−0.855, 0.255) | 0.290 | 0.435 | (−0.330, 1.201) | 0.265 |
| Types (Mx vs. Cm) | 0.885 | (−0.244, 2.015) | 0.124 | −0.094 | (−0.958, 0.770) | 0.832 |
| Time (P vs. C) | −0.528 | (−1.018, −0.037) | 0.035* | −0.016 | (−0.726, 0.695) | 0.965 |
| Sc. AgeC (<4yo vs. ≥4yo) | 1.676 | (−0.115, 1.467) | 0.094# | 0.560 | (−0.292, 1.412) | 0.198 |
| Sc. AgeP (<4yo vs. ≥4yo) | 1.061 | (−1.115, 3.238) | 0.339 | 0.663 | (−0.737, 2.064) | 0.353 |
| Informant (TP∧ vs. Pa) | −0.003 | (−0.525, 0.519) | 0.992 | −0.031 | (−0.836, 0.773) | 0.939 |

Bivariate meta-regression on studies-related factors on sensitivity and false-positive rate.

First group in the bracket is the reference; L, language only; Cm, clinical markers; Mx, both language and clinical markers; P, predictive validity; C, concurrent validity; Pa, parent; TP, trained personnel; Sc. AgeC, screening age (for studies evaluating concurrent validity); Sc. AgeP, screening age (for studies evaluating predictive validity).

# p < 0.1; * p < 0.05.
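Because the coefficients in Table 7 are on the logit (log-odds) scale, their practical size is easiest to see after back-transformation. A hedged illustration: applying the L vs. Cm sensitivity coefficient (0.657) to the median sensitivity of clinical-marker tools (0.71) gives the model-implied sensitivity of language-based tools. The pairing of these two numbers is our illustration, not a computation reported in the paper.

```python
import math

def logit(p):
    return math.log(p / (1 - p))

def inv_logit(x):
    return 1 / (1 + math.exp(-x))

# Reference sensitivity: median for clinical-marker tools reported above.
ref_sens = 0.71
# L vs. Cm coefficient on transformed (logit) sensitivity from Table 7.
coef = 0.657

# Add the coefficient on the logit scale, then map back to a probability.
shifted = inv_logit(logit(ref_sens) + coef)
# 'shifted' is the model-implied sensitivity for language-based tools.
```

A coefficient of 0.657 thus corresponds to a sensitivity gain of roughly ten percentage points at this reference level, which is why the logit scale, not the raw coefficient, should guide interpretation.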

Results of sensitivity analysis after excluding studies with high ROB are illustrated in Table 8. The observed higher sensitivity for screening tools using actual language as proxy compared with those using clinical markers became statistically significant. The difference in sensitivity between screening tools assessing concurrent validity and those assessing predictive validity appeared to be larger than before the removal of the high ROB studies. However, the observed marginal difference between screening under and above 4yo became non-significant after the exclusion of high-risk studies. Similar to the results without excluding studies with high ROB, none of the included factors in sensitivity analysis explained variation in specificity.

Table 8

| Factor | Coeff. (sens.) | 95% CI (sens.) | p-value (sens.) | Coeff. (FPR) | 95% CI (FPR) | p-value (FPR) |
| --- | --- | --- | --- | --- | --- | --- |
| Types (L∧ vs. Cm) | 0.960 | (0.291, 1.629) | 0.005** | −0.020 | (−1.295, 1.256) | 0.976 |
| Types (L∧ vs. Mx) | −0.173 | (−0.784, 0.439) | 0.580 | 0.157 | (−0.753, 1.067) | 0.735 |
| Types (Mx∧ vs. Cm)ᵃ | – | – | – | – | – | – |
| Time (P∧ vs. C) | −0.819 | (−1.377, −0.262) | 0.004* | −0.104 | (−1.009, 0.801) | 0.822 |
| Sc. AgeC (<4yo vs. ≥4yo) | 0.234 | (−0.926, 1.394) | 0.692 | 0.520 | (−0.388, 1.428) | 0.262 |
| Sc. AgeP (<4yo vs. ≥4yo)ᵃ | – | – | – | – | – | – |
| Informant (TE∧ vs. Pa) | 0.149 | (−0.514, 0.812) | 0.660 | 0.160 | (−0.870, 1.189) | 0.761 |

Bivariate meta-regression of study-related factors on sensitivity and false-positive rate excluding high ROB studies.

First group in the bracket is the reference; L, language only; Cm, clinical markers; Mx, both language and clinical markers; P, predictive validity; C, concurrent validity; Pa, parent; TE, trained examiner; Sc. AgeC, screening age (for studies evaluating concurrent validity); Sc. AgeP, screening age (for studies evaluating predictive validity).

a Too few studies after exclusion for a valid analysis.

# p < 0.1; * p < 0.05; ** p < 0.01.

Discussion

The present review shows that currently available screening tools for language disorder in the preschool years vary widely in design and screening performance. Large variability in screening accuracy across different tools was a major issue in screening for language disorder. The review also revealed that this variation arose from the choice of proxy and the screening-diagnosis interval.

Screening tools based on children's actual language ability were shown to have higher sensitivity than tools based on clinical markers. The failure of clinical-marker tools to prove sensitive may be related to mixed findings in the primary studies. Notably, one primary study using non-word repetition and sentence repetition tasks reported perfect accuracy in classifying children with and without language disorder (110). This finding, however, could not be replicated in another study using exactly the same test, which identified only 3 of the 10 children with language disorder (104). The discrepancy highlights the large variability in non-word and sentence repetition performance even among children with language disorder, in addition to the inconsistent differences found between children with and without language disorder (149). Another plausible explanation for the relatively higher sensitivity of using the child's actual language skills lies in the resemblance between the items used in language-based screening and the diagnostic tests used as the reference standard. Differences in task design and test item selection across studies may have further increased the inconsistencies (149). Therefore, in future tool development or refinement, great care should be taken in the choice of screening proxy. More systematic studies directly comparing how different proxies and factors affect screening accuracy are warranted.

There was no evidence that other factors related to tool design, such as the test administrators of the screening tools, explained variability in accuracy. In line with a previous review (13), parent-report screening appeared to perform similarly to screening administered by trained examiners. This seemingly comparable accuracy supports parent-report instruments as a viable tool for screening, in addition to their apparent advantage of lower cost of administration. Primary studies directly comparing both types of screening in the same population may provide stronger evidence concerning the choice of administrators.

As predicted, long-term prediction was harder to achieve than estimating concurrent status. The meta-analysis revealed that screening tools reporting predictive validity showed significantly lower sensitivity than tools reporting concurrent validity, as was also speculated in the previous review (13). One possible explanation lies in the diverse trajectories of language development in the preschool years. Some children who perform poorly in early screening may recover spontaneously at a later time point, while some who appear to be on track at the outset may develop language difficulties later on (7). Current screening tools might not capture this dynamic change in language development in the preschool years, resulting in lower predictive validity than expected. Hence, language disorder screening should concentrate on identifying or introducing new proxies or metrics that are sensitive to the dynamic nature of language development. Vocabulary growth estimates, for example, might be more sensitive to long-term outcomes than a single-point estimate (150). Although the current review has shown that different proxies have been used in screening for language disorder, few studies have examined how proxies other than children's actual language ability perform in terms of predictive validity. It would be useful to investigate the interaction between the proxy used and the screening-diagnosis interval in future studies.

Age of screening was expected to be affected by these varying developmental trajectories: screening at an earlier age might have lower accuracy than screening at a later age, when language development becomes more stable. This expected difference was not found in the current meta-analysis. However, it is worth noting that screening tools used at different ages differed not only in the age of screening but also in other respects. In the meta-analysis, over half (16/29, 55%) of the screenings under age 4 relied on parent reports and used tools such as vocabulary checklists and reported utterances, while none of the screenings over age 4 (0/8) were based on parent reports. Investigating the effect of screening age on screening accuracy is crucial, as it has direct implications for the optimal timing of screening. Future studies that compare screening accuracy at different ages while keeping the method of assessment constant (e.g., using the same screening tool) may reveal a clearer picture.

Overall, only a small proportion of the available screening tools achieved good accuracy in identifying both children with and without language disorder. There is still insufficient evidence to recommend any single screening tool, especially given the presence of ROB in some studies. The limited number of valid tools may also partly explain why screening for language disorder has not yet been adopted as a routine surveillance exercise in primary care: any one screening tool may produce a considerable number of over-identified and missed cases, which can lead to long-term social consequences (19). As shown in the current review, the screening proxy should be chosen carefully in future tool development in order to maximize test sensitivity. However, as tools with good accuracy are limited, there remains room for discussion on whether future test development should aim at maximizing sensitivity even at the expense of specificity. The cost of over-identifying a false-positive child, who then receives a more in-depth assessment, might be less than that of under-identifying a true-positive child and depriving the child of further follow-up (104). If so, the cut-off for test positivity can be adjusted: the more stringent the pass criterion used in screening, the higher the sensitivity the test yields, with the trade-off of a decrease in specificity. This decision should nonetheless be made with full acknowledgment of the harms and benefits, which were not addressed in the current review. While an increase in sensitivity achieved by adjusting the cut-off might lead to the benefit of better follow-up, the accompanying increase in the false positive rate might lead to the harms of stigmatization and unnecessary procedures.
Given the highly variable developmental trajectories in asymptomatic children, another direction for future studies could be to evaluate the viability of targeted screening in a higher-risk population and compare it with universal screening.
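The cut-off trade-off described above can be made concrete with a small simulation: hypothetical screening scores for affected and unaffected children drawn from overlapping distributions, with the pass criterion made more stringent by raising the cut-off. All distributions and cut-off values below are illustrative assumptions, not parameters from any included study.

```python
import random

random.seed(0)
# Hypothetical screening scores: affected children score lower on average,
# but the two distributions overlap, as with real screening instruments.
affected = [random.gauss(80, 12) for _ in range(500)]
unaffected = [random.gauss(100, 12) for _ in range(2000)]

def accuracy_at(cutoff):
    """A child screens positive (is flagged) when the score falls below cutoff."""
    sens = sum(score < cutoff for score in affected) / len(affected)
    spec = sum(score >= cutoff for score in unaffected) / len(unaffected)
    return sens, spec

lenient_sens, lenient_spec = accuracy_at(85)      # fewer children flagged
stringent_sens, stringent_spec = accuracy_at(95)  # more children flagged
# Raising the cut-off (a more stringent pass criterion) buys sensitivity
# at the cost of specificity; no threshold improves both at once.
```

Whether that exchange is worthwhile depends on the relative costs of a missed case versus an unnecessary referral, which is exactly the harms-and-benefits question left open above.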

This is the first study to use meta-analytical techniques specifically to evaluate the heterogeneity in screening accuracy of tools for identifying children with language disorder. Nonetheless, the study has several limitations. One limitation relates to the variability and validity of the gold standard: the reference standard tests differ across settings, as different countries or regions use different localized standardized or non-standardized tools and criteria to define language disorder, and there is no consensual or true gold standard. More importantly, the sensitivity and specificity of the procedures used to identify children with language disorders in those reference tests were not examined. Some reference tests employ arbitrary cut-offs (e.g., −1.25 SD) to define language disorder, while some researchers advocate children's well-being as the outcome, such that children are considered to have a language disorder when their lives are negatively affected by their language skills (151). This lack of consensus might further explain the diverse results and lack of agreement in replication studies.

Another limitation was that nearly all the included studies had at least some ROB, mainly due to unreported aspects of the studies. Future validation studies of screening tools should follow reporting guidelines such as STARD (152). A third limitation was that the rating of ROB involved only one rater; more raters would minimize potential bias. Lastly, not all included screening tools were analyzed in the meta-analysis. Some studies evaluated multiple screening tools at several cut-offs or times of assessment; only one data point per study was included in the meta-analysis, chosen on the basis of Youden's index. This selection would inevitably inflate the accuracy shown in the meta-analysis. With the emergence of new methods for meta-analysis of diagnostic studies, more sophisticated methods for handling this complexity of data structure may be employed in future reviews.

This review shows that current screening tools for developmental language disorder vary widely in accuracy, with only some achieving good accuracy. Meta-analytical data identified some sources of this heterogeneity. Future development of screening tools should aim at improving overall screening accuracy by carefully choosing the proxy and designing items for screening. More importantly, metrics that are more sensitive to persistent language disorder should be sought. To fully inform surveillance of early language development, future research can also consider broader aspects, such as the harms and benefits of screening, for which there is still a dearth of evidence.

Publisher's Note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Statements

Data availability statement

Publicly available datasets were analyzed in this study. These data can be found in the reference lists of the article.

Author contributions

KS and CT conceived and designed the study, wrote the paper, prepared the tables, and reviewed and edited the manuscript. KS performed the statistical analysis. All authors approved the final manuscript for submission.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Supplementary material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fped.2022.801220/full#supplementary-material

Footnotes

1.^ The total number here refers to the number of studies: there were five studies evaluating tools based on clinical markers but only three different tests in total; hence the number differs from that in Types and Characteristics of Current Screening Tools for Language Disorder.

2.^ The total number here refers to the number of studies: there were seven studies evaluating tools based on both actual language and clinical markers but only six different tests in total; hence the number differs from that in Types and Characteristics of Current Screening Tools for Language Disorder.

References

  • 1.

    BishopDVSnowlingMJThompsonPAGreenhalghTConsortiumCAdamsCet al. Phase 2 of CATALISE: A multinational and multidisciplinary Delphi consensus study of problems with language development: Terminology. J Child Psychol Psychiatry. (2017) 58:106880. 10.1111/jcpp.12721

  • 2.

    NorburyCFGoochDWrayCBairdGCharmanTSimonoffEet al. The impact of nonverbal ability on prevalence and clinical presentation of language disorder: evidence from a population study. J Child Psychol Psychiatry. (2016) 57:124757. 10.1111/jcpp.12573

  • 3.

    American Speech-Language-Hearing Association. Preschool Language Disorders. Available online at: https://www.asha.org/public/speech/disorders/preschool-languagedisorders/

  • 4.

    BeitchmanJHWilsonBBrownlieEWaltersHInglisALanceeW. Long-term consistency in speech/language profiles: II. Behavioral, emotional, and social outcomes. J Am Acad Child Adolesc Psychiatry. (1996) 35:81525. 10.1097/00004583-199606000-00022

  • 5.

    BrownlieEBBaoLBeitchmanJ. Childhood language disorder and social anxiety in early adulthood. J Abnormal Child Psychol. (2016) 44:106170. 10.1007/s10802-015-0097-5

  • 6.

    BeitchmanJHWilsonBBrownlieEBWaltersHLanceeW. Long-term consistency in speech/language profiles: I. Developmental and academic outcomes. J Am Acad Child Adolesc Psychiatry. (1996) 35:80414. 10.1097/00004583-199606000-00021

  • 7.

    ZambranaIMPonsFEadiePYstromE. Trajectories of language delay from age 3 to 5: Persistence, recovery and late onset. Int J Lang Commun Disord. (2014) 49:30416. 10.1111/1460-6984.12073

  • 8.

    ArmstrongRScottJGWhitehouseAJCoplandDAMcmahonKLArnottW. Late talkers and later language outcomes: Predicting the different language trajectories. Int J Speech-Lang Pathol. (2017) 19:23750. 10.1080/17549507.2017.1296191

  • 9.

    BishopDVSnowlingMJThompsonPAGreenhalghTConsortiumC. CATALISE: A multinational and multidisciplinary Delphi consensus study. Identifying language impairments in children. PLoS ONE. (2016) 11:e0158753. 10.1371/journal.pone.0158753

  • 10.

    SimFThompsonLMarryatLRamparsadNWilsonP. Predictive validity of preschool screening tools for language and behavioural difficulties: A PRISMA systematic review. PLoS ONE. (2019) 14:e0211409. 10.1371/journal.pone.0211409

  • 11.

    SiuAL. Screening for speech and language delay and disorders in children aged 5 years or younger: US Preventive Services Task Force recommendation statement. Pediatrics. (2015) 136:e47481. 10.1542/peds.2015-1711

  • 12.

    WallaceIF. Universal Screening of Young Children for Developmental Disorders: Unpacking the Controversies. Occasional Paper. RTI Press Publication OP-0048-1802. RTI International (2018). 10.3768/rtipress.2018.op.0048.1802

  • 13.

    WallaceIFBerkmanNDWatsonLRCoyne-BeasleyTWoodCTCullenKet al. Screening for speech and language delay in children 5 years old and younger: a systematic review. Pediatrics. (2015) 136:e44862. 10.1542/peds.2014-3889

  • 14.

    GuibersonM. Telehealth measures screening for developmental language disorders in Spanish-speaking toddlers. Telemed E-Health. (2016) 22:73945. 10.1089/tmj.2015.0247

  • 15.

    PuglisiMLBlasiHFSnowlingMJ. Screening for the identification of oral language difficulties in Brazilian preschoolers: a validation study. Lang Speech Hear Serv Schools. (2020) 51:85265. 10.1044/2020_LSHSS-19-00083

  • 16.

    BornsteinMHHahnCSPutnickDLSuwalskyJT. Stability of core language skill from early childhood to adolescence: A latent variable approach. Child Dev. (2014) 85:134656. 10.1111/cdev.12192

  • 17.

    TomblinJBZhangXBuckwalterPO'BrienM. The stability of primary language disorder. J Speech Lang Hear Res. (2003) 46:128396. 10.1044/1092-4388(2003/100)

  • 18.

    SalamehJ-PBossuytPMMcGrathTAThombsBDHydeCJMacaskillPet al. Preferred reporting items for systematic review and meta-analysis of diagnostic test accuracy studies (PRISMA-DTA): explanation, elaboration, and checklist. BMJ. (2020) 370:m2632. 10.1136/bmj.m2632

  • 19.

    PlanteEVanceR. Selection of preschool language tests: A data-based approach. Lang Speech Hear Serv Schools. (1994) 25:1524. 10.1044/0161-1461.2501.15

  • 20.

    WhitingPFRutjesAWWestwoodMEMallettSDeeksJJReitsmaJBet al. QUADAS-2: a revised tool for the quality assessment of diagnostic accuracy studies. Ann Internal Med. (2011) 155:52936. 10.7326/0003-4819-155-8-201110180-00009

  • 21.

    McGuinnessLAHigginsJP. Risk-of-bias VISualization (robvis): An R package and Shiny web app for visualizing risk-of-bias assessments. Res Synthesis Methods. (2021) 12:5561. 10.1002/jrsm.1411

  • 22.

    ReitsmaJBGlasASRutjesAWScholtenRJBossuytPMZwindermanAH. Bivariate analysis of sensitivity and specificity produces informative summary measures in diagnostic reviews. J Clin Epidemiol. (2005) 58:98290. 10.1016/j.jclinepi.2005.02.022

  • 23.

    MacaskillPGatsonisCDeeksJHarbordRTakwoingiY. Cochrane Handbook for Systematic Reviews of Diagnostic Test Accuracy. London: The Cochrane Collaboration (2010).

  • 24.

    DoeblerPHollingH. mada: Meta-Analysis of Diagnostic Accuracy. R package version 0.5.10. (2020). Available online at: https://CRAN.R-project.org/package=mada

  • 25.

    AllenDVBlissLS. Concurrent validity of two language screening tests. J Commun Disord. (1987) 20:30517. 10.1016/0021-9924(87)90012-8

  • 26.

    LeeLL. Northwestern Syntax Screening Test (NSST). Press Evanston, IL: Northwestern University Press Evanston (1971).

  • 27.

    HedrickDLPratherEMTobinAR. Sequenced Inventory of Communication Development.Washington, DC: University of Washington Press Seattle (1984).

  • 28.

    BlaxleyLet al. Two language screening tests compared with developmental sentence scoring. Lang Speech Hear Serv Schools. (1983) 14:3846. 10.1044/0161-1461.1401.38

  • 29.

    BanksonNW. Bankson Language Screening Test. Baltimore, MD: University Park Press (1977).

  • 30.

    LeeLL. Developmental Sentence Analysis: A Grammatical Assessment Procedure for Speech and Language Clinicians. Evanston, IL: Northwestern University Press (1974).

  • 31.

    BurdenVStottCMForgeJGoodyerIBurdenVStottCMet al. The Cambridge Language and Speech Project (CLASP). I. Detection of language difficulties at 36 to 39 months. Dev Med Child Neurol. (1996) 38:61331. 10.1111/j.1469-8749.1996.tb12126.x

  • 32.

    AlpernGBollTShearerM. Developmental profile II. J Read. (1980) 18:28791.

  • 33.

    RenfrewCE. Persistence of the open syllable in defective articulation. J Speech Hear Disord. (1966) 31:3703. 10.1044/jshd.3104.370

  • 34.

    RenfrewC. The Bus Story: A Test of Continuous Speech. Oxford: England (1969).

  • 35.

    CarscaddenJCorsiattoPEricsonLIllchukREsopenkoCSternerEet al. A pilot study to evaluate a new early screening instrument for speech and language delays. Canad J Speech-Lang Pathol Audiol. (2010) 34:8795.

  • 36.

    BzochKRLeagueRBrownV. Receptive-Expressive Emergent Language Test. Austin, TX: Pro-Ed (2003).

  • 37.

    ChaffeeCACunninghamCESecord-GilbertMElbardHRichardsJ. Screening effectiveness of the Minnesota child development inventory expressive and receptive language scales: sensitivity, specificity, and predictive value. Psychol Assessment. (1990) 2:805. 10.1037/1040-3590.2.1.80

  • 38.

    ReynellJ. Reynell Developmental Language Scales, Revised (Windsor: NFER). Windsor: NFER Publishing Company (1977).

  • 39.

    IretonHRThwingEJ. Minnesota Child Development Inventory. Minneapolis, MN: Behavior Science Systems (1972).

  • 40.

    DiasDCRondon-MeloSMolini-AvejonasDR. Sensitivity and specificity of a low-cost screening protocol for identifying children at risk for language disorders. Clinics. (2020) 75:e1426. 10.6061/clinics/2020/e1426

  • 41.

    American Speech-Language and Hearing Association. How does your child hear and talk? (2006). Available online at: http://www.asha.org/public/speech/development/chart/

  • 42.

    AndradeCRFBefi-LopesDMFernandesFDMWertznerHF. Teste de Language Infantil nas Áreas de Fonologia, Vocabulário, Fluência e Pragmática. 2nd ed. Barueri: Pró-Fono (2011).

  • 43.

    DixonJKotALawJ. Early language screening in City and Hackney: work in progress. Child Care Health Dev. (1988) 14:21329. 10.1111/j.1365-2214.1988.tb00576.x

  • 44.

    Reynell J, Huntley M. Reynell Developmental Language Scales, 2nd revision. Windsor: NFER-Nelson (1985).

  • 45.

    Lowe M, Costello A. The Symbolic Play Test. Windsor: NFER-Nelson (1976).

  • 46.

    Gray S, Plante E, Vance R, Henrichsen M. The diagnostic accuracy of four vocabulary tests administered to preschool-age children. Lang Speech Hear Serv Schools. (1999) 30:196–206. 10.1044/0161-1461.3002.196

  • 47.

    Gardner MF. EOWPVT-R: Expressive One-Word Picture Vocabulary Test, Revised. Novato, CA: Academic Therapy Publications (1990).

  • 48.

    Dunn LM, Dunn LM. Peabody Picture Vocabulary Test: PPVT-IIIB. Circle Pines, MN: American Guidance Service (1997). 10.1037/t15145-000

  • 49.

    Gardner MF. Receptive One-Word Picture Vocabulary Test. Novato, CA: Academic Therapy Publications (1985).

  • 50.

    Williams K. Expressive Vocabulary Test (EVT). Circle Pines, MN: American Guidance Service Inc. (1997).

  • 51.

    Zimmerman IL, Steiner VG. Preschool Language Scale, 4th edition, Spanish. San Antonio, TX: The Psychological Corporation (2004).

  • 52.

    Guiberson M, Rodriguez BL. Measurement properties and classification accuracy of two Spanish parent surveys of language development for preschool-age children. Am J Speech-Lang Pathol. (2010) 19:225–37. 10.1044/1058-0360(2010/09-0058)

  • 53.

    Guiberson M. Concurrent validity of a parent survey measuring communication skills of Spanish speaking preschoolers with and without delayed language. Perspect Commun Disord Sci. (2008) 15:73–81. 10.1044/cds15.3.73

  • 54.

    Squires J, Potter L, Bricker D. The ASQ User's Guide, 2nd ed. Baltimore, MD: Paul H. Brookes (1999).

  • 55.

    Guiberson M, Rodríguez BL, Dale PS. Classification accuracy of brief parent report measures of language development in Spanish-speaking toddlers. Lang Speech Hear Serv Sch. (2011) 42:536–49. 10.1044/0161-1461(2011/10-0076)

  • 56.

    Squires J, Bricker D, Mounts L, Potter L, Nickel R, Twombly E, et al. Ages and Stages Questionnaire. Baltimore, MD: Paul H. Brookes (1999).

  • 57.

    Jackson-Maldonado D, Thal DJ, Fenson L. MacArthur Inventarios del Desarrollo de Habilidades Comunicativas: User's Guide and Technical Manual. Baltimore, MD: Brookes Pub (2003).

  • 58.

    Guiberson M, Rodriguez BL, Zajacova A. Accuracy of telehealth-administered measures to screen language in Spanish-speaking preschoolers. Telemed E-Health. (2015) 21:714–20. 10.1089/tmj.2014.0190

  • 59.

    Heilmann J, Weismer SE, Evans J, Hollar C. Utility of the MacArthur-Bates Communicative Development Inventory in identifying language abilities of late-talking and typically developing toddlers. Am J Speech-Lang Pathol. (2005) 14:40–51. 10.1044/1058-0360(2005/006)

  • 60.

    Board CA. The MacArthur-Bates Communicative Development Inventory: Words and Sentences. Baltimore, MD: Paul H Brookes (1992).

  • 61.

    Zimmerman I, Steiner V, Pond R. The Preschool Language Scale-3. Columbus: Merrill (1992).

  • 62.

    Klee T, Carson DK, Gavin WJ, Hall L, Kent A, Reece S. Concurrent and predictive validity of an early language screening program. J Speech Lang Hear Res. (1998) 41:627–41. 10.1044/jslhr.4103.627

  • 63.

    Rescorla L. The Language Development Survey: a screening tool for delayed language in toddlers. J Speech Hear Disord. (1989) 54:587–99. 10.1044/jshd.5404.587

  • 64.

    Mullen EM. Infant MSEL Manual: Infant Mullen Scales of Early Learning. Cranston, RI: TOTAL Child (1989).

  • 65.

    Klee T, Pearce K, Carson DK. Improving the positive predictive value of screening for developmental language disorder. J Speech Lang Hear Res. (2000) 43:821–33. 10.1044/jslhr.4304.821

  • 66.

    Laing GJ, Law J, Levin A, Logan S. Evaluation of a structured test and a parent led method for screening for speech and language problems: prospective population based study. BMJ. (2002) 325:1152. 10.1136/bmj.325.7373.1152

  • 67.

    Edwards S, Fletcher P, Garman M, Hughes A, Letts C, Sinka I. The Reynell Developmental Language Scales III. NFER-Nelson Publishing, The University of Reading Edition (1997).

  • 68.

    Law J. Early language screening in City and Hackney: the concurrent validity of a measure designed for use with 2½-year-olds. Child Care Health Dev. (1994) 20:295–308. 10.1111/j.1365-2214.1994.tb00392.x

  • 69.

    Levett L, Muir J. Which three year olds need speech therapy? Uses of the Levett-Muir language screening test. Health Visitor. (1983) 56:454–6.

  • 70.

    Reynell J, Huntley M. Reynell Developmental Language Scales (Revised). Windsor: NFER Publishing Company Ltd. (1977).

  • 71.

    Goldman R. Goldman-Fristoe Test of Articulation. Circle Pines, MN: American Guidance Service, Inc. (1969).

  • 72.

    Crystal D, Fletcher P, Garman M. The Grammatical Analysis of Language Disability: A Procedure for Assessment and Remediation. Hodder Education (1976).

  • 73.

    Visser-Bochane MI, van der Schans CP, Krijnen WP, Reijneveld SA, Luinge MR. Validation of the Early Language Scale. Eur J Pediatr. (2020) 180:63–71. 10.1007/s00431-020-03702-8

  • 74.

    Schlichting JEPT, Spelberg L. Lexilijst Begrip [Lexilist Comprehension]. Amsterdam: Harcourt Test Publishers (2007).

  • 75.

    Schlichting L, Spelberg HL. Schlichting test voor taalbegrip [Schlichting test for language comprehension]. Houten: Bohn Stafleu van Loghum (2010).

  • 76.

    Schlichting JEPT. Lexilijst Nederlands [Lexilist Dutch]. (2002). 10.1240/sav_gbm_2002_h_000262

  • 77.

    Schlichting JEPT, Spelberg HC. Schlichting Test voor Taalproductie-II: voor Nederland en Vlaanderen [Schlichting Test for Language Production-II: for the Netherlands and Flanders]. Houten: Bohn Stafleu van Loghum (2012). 10.1007/978-90-313-9842-3

  • 78.

    Slofstra-Bremer C, van der Meulen S, Lutje Spelberg H. De Taalstandaard [The Language Standard]. Amsterdam: Pearson (2006).

  • 79.

    Geurts H. Handleiding CCC-2-NL [CCC-2-NL Manual]. Amsterdam: Harcourt Test Publishers (2004).

  • 80.

    Visser-Bochane M, Luinge M, Dieleman L, van der Schans C, Reijneveld S. The Dutch well child language screening protocol for 2-year-old children was valid for detecting current and later language problems. Acta Paediatr. (2020) 110:2825–32. 10.1111/apa.15447

  • 81.

    Mattsson CM, Mårild S, Pehrsson NG. Evaluation of a language-screening programme for 2.5-year-olds at Child Health Centres in Sweden. Acta Paediatr. (2001) 90:339–44. 10.1080/080352501300067776

  • 82.

    McGinty C. An investigation into aspects of the Mayo Early Language Screening Test. Child Care Health Dev. (2000) 26:111–28. 10.1046/j.1365-2214.2000.00176.x

  • 83.

    Garvey-Cecchetti B, Heslin C, Laundon O, McGinty C, O'Malley L, Dowd P, et al. The Mayo Early Language Screening Test. Western Health Board: Mayo Speech and Language Therapy Department (1993).

  • 84.

    Anthony A, Bogle D, Ingram T, McIsaac M. Edinburgh Articulation Test. Edinburgh: Churchill Livingstone (1971).

  • 85.

    Nair MKC, Harikumaran Nair GS, Mini AO, Indulekha S, Letha S, Russell PS. Development and validation of Language Evaluation Scale Trivandrum for children aged 0-3 years - LEST (0-3). Indian Pediatr. (2013) 50:463–7. 10.1007/s13312-013-0154-5

  • 86.

    Bzoch K, League R. Receptive-Expressive Emergent Language Scale. Gainesville, FL: Tree of Life Press, Inc. (1971).

  • 87.

    Nayeb L, Lagerberg D, Westerlund M, Sarkadi A, Lucas S, Eriksson M. Modifying a language screening tool for three-year-old children identified severe language disorders six months earlier. Acta Paediatr. (2019) 108:1642–8. 10.1111/apa.14790

  • 88.

    Andrade Cd, Befi-Lopes DM, Fernandes FDM, Wertzner HF. ABFW: teste de linguagem infantil nas áreas de fonologia, vocabulário, fluência e pragmática [ABFW: child language test in the areas of phonology, vocabulary, fluency, and pragmatics]. São Paulo: Pró-Fono (2004).

  • 89.

    Bishop D. The Test for Reception of Grammar, Version 2 (TROG-2). Oxford: Pearson (2009).

  • 90.

    Santos FH, Bueno OFA. Validation of the Brazilian Children's Test of Pseudoword Repetition in Portuguese speakers aged 4 to 10 years. Braz J Med Biol Res. (2003) 36:1533–47. 10.1590/S0100-879X2003001100012

  • 91.

    Rescorla L, Alley A. Validation of the Language Development Survey (LDS): a parent report tool for identifying language delay in toddlers. J Speech Lang Hear Res. (2001) 44:434–45. 10.1044/1092-4388(2001/035)

  • 92.

    Sachse S, von Suchodoletz W. Early identification of language delay by direct language assessment or parent report? J Dev Behav Pediatr. (2008) 29:34–41. 10.1097/DBP.0b013e318146902a

  • 93.

    Grimm H, Doil H. Elternfragebögen für die Früherkennung von Risikokindern (ELFRA) [Parent questionnaires for the early identification of at-risk children]. Göttingen: Hogrefe, Verlag für Psychologie (2000).

  • 94.

    Grimm H. Sprachentwicklungstest für zweijährige Kinder (SETK-2) und für drei- bis fünfjährige Kinder (SETK 3-5) [Language development test for two-year-old and for three- to five-year-old children]. Göttingen: Hogrefe (2000).

  • 95.

    Stokes SF. Secondary prevention of paediatric language disability: a comparison of parents and nurses as screening agents. Eur J Disord Commun. (1997) 32:139–58. 10.1111/j.1460-6984.1997.tb01628.x

  • 96.

    van Agt HM, van der Stege HA, de Ridder-Sluiter JG, de Koning HJ. Detecting language problems: accuracy of five language screening instruments in preschool children. Dev Med Child Neurol. (2007) 49:117–22. 10.1111/j.1469-8749.2007.00117.x

  • 97.

    Stott CM, Merricks MJ, Bolton PF, Goodyer IM. Screening for speech and language disorders: the reliability, validity and accuracy of the General Language Screen. Int J Lang Commun Disord. (2002) 37:133–51. 10.1080/13682820110116785

  • 98.

    Brouwers-de Jong E, Burgmeijer R, Laurent de Angulo M. Ontwikkelingsonderzoek op het consultatiebureau: handboek bij het vernieuwde Van Wiechenonderzoek [Developmental screening at the well-child clinic: handbook for the revised Van Wiechen examination]. (1996).

  • 99.

    Walker D, Gugenheim S, Downs MP, Northern JL. Early Language Milestone Scale and language screening of young children. Pediatrics. (1989) 83:284–8. 10.1542/peds.83.2.284

  • 100.

    Coplan J. ELM Scale: The Early Language Milestone Scale. Pro-Ed (1983).

  • 101.

    Wetherby AM, Goldstein H, Cleary J, Allen L, Kublin K. Early identification of children with communication disorders: concurrent and predictive validity of the CSBS Developmental Profile. Infants Young Child. (2003) 16:161–74. 10.1097/00001163-200304000-00008

  • 102.

    Wetherby AM, Prizant BM. Communication and Symbolic Behavior Scales: Developmental Profile. Baltimore, MD: Paul H Brookes Publishing Co. (2002). 10.1037/t11529-000

  • 103.

    Kapalkova S, Polisenska K, Vicenova Z. Non-word repetition performance in Slovak-speaking children with and without SLI: novel scoring methods. Int J Lang Commun Disord. (2013) 48:78–89. 10.1111/j.1460-6984.2012.00189.x

  • 104.

    Nash H, Leavett R, Childs H. Evaluating the GAPS test as a screener for language impairment in young children. Int J Lang Commun Disord. (2011) 46:675–85. 10.1111/j.1460-6984.2011.00038.x

  • 105.

    Van der Lely H, Gardner H, McClelland AGR, Froud KE. Grammar and Phonology Screening Test (GAPS). London: DLDCN (2007).

  • 106.

    Wiig E, Secord W, Semel E. Clinical Evaluation of Language Fundamentals-Preschool, 2nd UK ed. London: Harcourt Assessment (2006).

  • 107.

    Sturner RA, Funk SG, Green JA. Preschool speech and language screening: further validation of the sentence repetition screening test. J Dev Behav Pediatr. (1996) 17:405–13. 10.1097/00004703-199612000-00006

  • 108.

    Sturner R, Kunze L, Funk S, Green J. Elicited imitation: its effectiveness for speech and language screening. Dev Med Child Neurol. (1993) 35:715–26. 10.1111/j.1469-8749.1993.tb11717.x

  • 109.

    Kirk SA, Kirk WD, McCarthy JJ. Illinois Test of Psycholinguistic Abilities. Champaign, IL: University of Illinois Press (1968).

  • 110.

    van der Lely HKJ, Payne E, McClelland A. An investigation to validate the grammar and phonology screening (GAPS) test to identify children with specific language impairment. PLoS ONE. (2011) 6:e22432. 10.1371/journal.pone.0022432

  • 111.

    Fluharty NB. The design and standardization of a speech and language screening test for use with preschool children. J Speech Hear Disord. (1974) 39:75–88. 10.1044/jshd.3901.75

  • 112.

    Hedrick DL, Prather EM, Tobin AR. Sequenced Inventory of Communication Development. Seattle, WA: University of Washington Press (1975).

  • 113.

    Benavides AA, Kapantzoglou M, Murata C. Two grammatical tasks for screening language abilities in Spanish-speaking children. Am J Speech-Lang Pathol. (2018) 27:690–705. 10.1044/2017_AJSLP-17-0052

  • 114.

    Zimmerman IL, Steiner VG, Pond RE. Preschool Language Scale-Fifth Edition Spanish Screening Test (PLS-5 Spanish Screening Test) [Measurement assessment]. San Antonio, TX: Pearson (2011). 10.1037/t15141-000

  • 115.

    Fluharty NB. Fluharty Preschool Speech and Language Screening Test: Teaching Resources. Austin, TX: Pro-Ed (1978).

  • 116.

    Lee LL, Koenigsknecht RA, Mulhern ST. Developmental Sentence Scoring. Evanston, IL: Northwestern University (1974).

  • 117.

    Bliss LS, Allen DV. Screening Kit of Language Development. Baltimore, MD: University Park Press (1983).

  • 118.

    Bliss LS, Allen DV. Screening kit of language development: a preschool language screening instrument. J Commun Disord. (1984) 17:133–41. 10.1016/0021-9924(84)90019-4

  • 119.

    Lavesson A, Lovden M, Hansson K. Development of a language screening instrument for Swedish 4-year-olds. Int J Lang Commun Disord. (2018) 53:605–14. 10.1111/1460-6984.12374

  • 120.

    Matov J, Mensah F, Cook F, Reilly S. Investigation of the language tasks to include in a short-language measure for children in the early school years. Int J Lang Commun Disord. (2018) 53:735–47. 10.1111/1460-6984.12378

  • 121.

    Matov J, Mensah F, Cook F, Reilly S, Dowell R. The development and validation of the Short Language Measure (SLaM): a brief measure of general language ability for children in their first year at school. Int J Lang Commun Disord. (2020) 55:345–58. 10.1111/1460-6984.12522

  • 122.

    Semel E, Wiig EH, Secord W. Clinical Evaluation of Language Fundamentals-Fourth Edition, Australian Standardised Edition. Sydney, NSW: PsychCorp (2006).

  • 123.

    Wright R, Levin B. A preschool articulation and language screening for the identification of speech disorders. Final Rep. (1971).

  • 124.

    Eisenberg SL, Guo L-Y. Differentiating children with and without language impairment based on grammaticality. Lang Speech Hear Serv Schools. (2013) 44:20–31. 10.1044/0161-1461(2012/11-0089)

  • 125.

    Dawson JI, Stout C, Eyer J, Tattersall PJ, Fonkalsrud J, Croley K. SPELT-P 2: Structured Photographic Expressive Language Test-Preschool. Janelle Publications (2005).

  • 126.

    Eisenberg SL, Guo L, Germezi M. How grammatical are three-year-olds? Lang Speech Hear Serv Schools. (2012) 43:36–52. 10.1044/0161-1461(2011/10-0093)

  • 127.

    Bruce B, Kornfalt R, Radeborg K, Hansson K, Nettelbladt U. Identifying children at risk for language impairment: screening of communication at 18 months. Acta Paediatr. (2003) 92:1090–5. 10.1080/08035250310004414

  • 128.

    Holmberg E, Nelli SB. Neurolingvistisk undersökningsmodell för språkstörda barn [Neurolinguistic assessment model for children with language disorder]. Utbildningsproduktion AB (soon to be reissued in a new, slightly revised edition by Pedagogisk Design) (1986).

  • 129.

    Bishop D. TROG Swedish manual [Swedish translation and adaptation: Eva Holmberg and Eva Lundälv]. Göteborg: SIH Läromedel (original work published 1983) (1998).

  • 130.

    Frisk V, Montgomery L, Boychyn E, Young R, Vanryn E, McLachlan D, et al. Why screening Canadian preschoolers for language delays is more difficult than it should be. Infants Young Child. (2009) 22:290–308. 10.1097/IYC.0b013e3181bc4db6

  • 131.

    Harrison PL. AGS Early Screening Profiles. Circle Pines, MN: American Guidance Service (1990).

  • 132.

    Zimmerman I, Steiner V, Pond R. Preschool Language Scale-Fourth Edition (PLS-4). San Antonio, TX: The Psychological Corporation (2002). 10.1037/t15140-000

  • 133.

    Newborg J, Stock J, Wnek L, Guidabaldi J, Svinicki J. Battelle Developmental Inventory. Itasca: Riverside (1988).

  • 134.

    Bracken BA. Bracken Basic Concept Scale-Revised. San Antonio, TX: Psychological Corporation (1998).

  • 135.

    Glascoe FP. Technical Report for Brigance Screens. Cheltenham, VIC: Hawker Brownlow Education (1998).

  • 136.

    Jessup B, Ward E, Cahill L, Keating D. Teacher identification of speech and language impairment in kindergarten students using the Kindergarten Development Check. Int J Speech-Lang Pathol. (2008) 10:449–59. 10.1080/17549500802056151

  • 137.

    Office for Educational Review. Revised Kindergarten Development Check. Hobart: Office for Educational Review (2003).

  • 138.

    Pesco D, O'Neill DK. Predicting later language outcomes from the Language Use Inventory. J Speech Lang Hear Res. (2012) 55:421–34. 10.1044/1092-4388(2011/10-0273)

  • 139.

    O'Neill D. Language Use Inventory: An Assessment of Young Children's Pragmatic Language Development for 18- to 47-Month-Old Children. Waterloo, ON: Knowledge in Development (2009).

  • 140.

    Seymour HN, Roeper T, De Villiers JG, De Villiers PA. Diagnostic Evaluation of Language Variation (DELV), Norm Referenced. Pearson (2005).

  • 141.

    Wiig E, Secord W, Semel E. Clinical Evaluation of Language Fundamentals-Preschool, 2nd ed. San Antonio, TX: The Psychological Corporation/Harcourt Assessment Inc. (2004).

  • 142.

    Bishop DV. CCC-2: Children's Communication Checklist-2. Pearson (2006).

  • 143.

    Westerlund M, Berglund E, Eriksson M. Can severely language delayed 3-year-olds be identified at 18 months? Evaluation of a screening version of the MacArthur-Bates Communicative Development Inventories. J Speech Lang Hear Res. (2006) 49:237–47. 10.1044/1092-4388(2006/020)

  • 144.

    Berglund E, Eriksson M. Communicative development in Swedish children 16-28 months old: the Swedish Early Communicative Development Inventory-words and sentences. Scand J Psychol. (2000) 41:133–44. 10.1111/1467-9450.00181

  • 145.

    Eriksson M, Berglund E. Swedish early communicative development inventories: words and gestures. First Lang. (1999) 19:55–90. 10.1177/014272379901905503

  • 146.

    Westerlund M, Sundelin C. Can severe language disability be identified in three-year-olds? Evaluation of a routine screening procedure. Acta Paediatr. (2000) 89:94–100. 10.1111/j.1651-2227.2000.tb01195.x

  • 147.

    Westerlund M, Sundelin C. Screening for developmental language disability in 3-year-old children. Experiences from a field study in a Swedish municipality. Child Care Health Dev. (2000) 26:91–110. 10.1046/j.1365-2214.2000.00171.x

  • 148.

    Mullen EM. Mullen Scales of Early Learning. Circle Pines, MN: AGS (1995).

  • 149.

    Ahufinger N, Berglund-Barraza A, Cruz-Santos A, Ferinu L, Andreu L, Sanz-Torrent M, et al. Consistency of a nonword repetition task to discriminate children with and without developmental language disorder in Catalan-Spanish and European Portuguese speaking children. Children. (2021) 8:85. 10.3390/children8020085

  • 150.

    Rowe ML, Raudenbush SW, Goldin-Meadow S. The pace of vocabulary growth helps predict later vocabulary skill. Child Dev. (2012) 83:508–25. 10.1111/j.1467-8624.2011.01710.x

  • 151.

    Nippold MA, Tomblin JB. Understanding Individual Differences in Language Development Across the School Years. New York, NY: Psychology Press (2014). 10.4324/9781315796987

  • 152.

    Bossuyt PM, Reitsma JB, Bruns DE, Gatsonis CA, Glasziou PP, Irwig L, et al. STARD 2015: an updated list of essential items for reporting diagnostic accuracy studies. Clin Chem. (2015) 61:1446–52. 10.1373/clinchem.2015.246280

Keywords

surveillance, screening, language disorder, PRISMA review, meta-analysis, summary receiver-operating characteristics, meta-regression

Citation

So KKH and To CKS (2022) Systematic Review and Meta-Analysis of Screening Tools for Language Disorder. Front. Pediatr. 10:801220. doi: 10.3389/fped.2022.801220

Received

25 October 2021

Accepted

13 January 2022

Published

23 February 2022

Volume

10 - 2022

Edited by

Daniel Holzinger, Hospitaller Brothers of Saint John of God Linz, Austria

Reviewed by

Karin Wiefferink, Dutch Foundation for the Deaf and Hearing Impaired Child (NSDSK), Netherlands; Steffi Sachse, Heidelberg University of Education, Germany

Copyright

*Correspondence: Carol K. S. To

This article was submitted to Children and Health, a section of the journal Frontiers in Pediatrics

Disclaimer

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.
