Statistical Learning in Specific Language Impairment and Autism Spectrum Disorder: A Meta-Analysis

Impairments in statistical learning might be a common deficit among individuals with Specific Language Impairment (SLI) and Autism Spectrum Disorder (ASD). Using meta-analysis, we examined statistical learning in SLI (14 studies, 15 comparisons) and ASD (13 studies, 20 comparisons) to evaluate this hypothesis. Effect sizes were examined as a function of diagnosis across multiple statistical learning tasks (Serial Reaction Time, Contextual Cueing, Artificial Grammar Learning, Speech Stream, Observational Learning, and Probabilistic Classification). Individuals with SLI showed deficits in statistical learning relative to age-matched controls. In contrast, statistical learning was intact in individuals with ASD relative to controls. Effect sizes did not vary as a function of task modality or participant age. Our findings inform debates about overlapping social-communicative difficulties in children with SLI and ASD by suggesting distinct underlying mechanisms. In line with the procedural deficit hypothesis (Ullman and Pierpont, 2005), impaired statistical learning may account for phonological and syntactic difficulties associated with SLI. In contrast, impaired statistical learning fails to account for the social-pragmatic difficulties associated with ASD.


INTRODUCTION
Statistical learning of complex rules or patterns is thought to play a crucial role in the development of language, social-cognitive, and motor skills (Perruchet and Pacton, 2006; Frith and Frith, 2008; Romberg and Saffran, 2010; Ruffman et al., 2012). Deficits in statistical learning have been implicated in a range of developmental disorders, such as Specific Language Impairment (SLI) and Autism Spectrum Disorder (ASD) (Ullman, 2004; Nicolson and Fawcett, 2007). Ullman (2004), Ullman and Pierpont (2005), and Walenski et al. (2006) proposed the procedural deficit hypothesis wherein challenges with rule-based aspects of language observed across a range of developmental disorders (including SLI and ASD) can largely be explained by neurological abnormalities affecting the frontal/basal ganglia and cerebellar circuits that underpin the procedural memory system. This system supports the acquisition of long-term knowledge that is inherently sequential or statistical in structure. Ullman (2004, p. 251) asserted that "SLI may best be viewed as an impairment in procedural memory" because phonological, morphological, and syntactic rule learning is commonly impaired in SLI while lexical knowledge is often spared. Researchers have also hypothesized that implicit learning impairments may contribute to the social-communicative and behavioral atypicalities associated with ASD by making it more difficult for individuals with ASD to extract patterns from the environment in order to understand the unspoken rules governing language and social mores (Frith, 1970a,b; Klinger et al., 2007). In this paper, we report a series of meta-analyses conducted on studies of statistical learning in SLI and ASD in order to evaluate whether the procedural deficit hypothesis provides an adequate account of impairments in SLI and ASD.

DEFINING CHARACTERISTICS OF SLI AND ASD
Specific language impairment is a neurodevelopmental disorder characterized by below age-appropriate language functioning with respect to the production and/or comprehension of language. The language problems associated with this disorder occur in the absence of general developmental delay, autism diagnosis, neurological deficit, or hearing impairment (Schwartz, 2009). ASD is characterized by difficulties in social communication, as well as restricted and repetitive patterns of behavior, including sensory atypicalities (American Psychiatric Association [APA], 2013). Although language impairments are not part of the current diagnostic criteria for ASD, impairments in pragmatics, semantics, morphology, phonology, and syntax are observed among many individuals with ASD, as well as those with SLI (Tager-Flusberg, 2006; Boucher, 2012). Research suggests a similar neurological basis for language impairments in SLI and ASD (De Fossé et al., 2004; Lindgren et al., 2009). Nevertheless, pragmatics and semantics are typically more impaired than syntax and phonology across the lifespan in ASD relative to SLI (Boucher, 2012).
Despite these apparent differences, some individuals with ASD exhibit pronounced difficulties with phonological processing, grammatical morphology, and semantics akin to the difficulties exhibited by individuals with SLI (Kjelgaard and Tager-Flusberg, 2001; Tager-Flusberg, 2006). Researchers have suggested that these severely language-impaired individuals with ASD are evidence that a subset of individuals with ASD exhibits the structural language impairments characteristic of SLI. Additionally, a subset of individuals with SLI exhibits social and pragmatic difficulties similar to those exhibited by individuals with ASD (Leyfer et al., 2008; Durkin et al., 2012); as adults, these individuals may have difficulties in social functioning similar to those experienced by adults with ASD (Whitehouse et al., 2009).
Children with ASD and SLI experience a number of challenges beyond the core social-communicative impairments. For example, individuals with SLI typically perform worse than controls on tasks assessing working memory, attention, executive functioning, and motor skills (Im-Bolter et al., 2006; Marton, 2008). Similarly, individuals with ASD also perform worse than controls on certain tasks assessing attention, executive functioning, and motor functioning (Dawson et al., 2002; Hill, 2004; Landry and Bryson, 2004; Provost et al., 2007; Robinson et al., 2009). In contrast to SLI, working memory may be intact in ASD (Russell et al., 1996). In addition, evidence that specific aspects of executive functioning are similarly impacted by ASD and SLI remains limited, and the similarities that have been observed may be attributable to shared linguistic challenges (Taylor et al., 2012). Nevertheless, the aforementioned commonalities in terms of challenges experienced by individuals with ASD and SLI, in addition to high rates of comorbidity between the two disorders and evidence of shared genetic risk factors, have led researchers to postulate that both disorders may arise from shared underlying mechanisms (Ullman, 2004; Conti-Ramsden et al., 2006; Nicolson and Fawcett, 2007; Bishop, 2010; Tomblin, 2011; Bartlett et al., 2012). Williams et al. (2008) noted that in infancy, SLI and ASD show similar patterns of development; however, as children grow older, the developmental trajectories of language impairments in each disorder follow different paths. Williams et al. (2008) noted impairments in phonology, word retrieval, and grammar (morphology and syntax) to be more persistent across the lifespan in individuals with SLI, which contrasts with the pragmatic difficulties more consistently evident in individuals with ASD (see also Demouy et al., 2011).
The linguistic challenges associated with each disorder may be heritable, as evidenced by distinctive patterns of impairments among the first-degree relatives of people with autism or SLI. For example, Lindgren et al. (2009) found that first-degree relatives of children with SLI showed poorer performance on measures of receptive and expressive language, phonological processing, reading ability, and IQ than relatives of children with ASD. Similarly, Whitehouse et al. (2007) found that parents of children with SLI exhibited better pragmatic language skills but performed more poorly on structural language measures relative to parents of children with ASD.

NEUROLOGICAL AND BEHAVIORAL EVIDENCE FOR THE PROCEDURAL DEFICIT HYPOTHESIS
To evaluate the procedural deficit hypothesis as a unifying account of impairments in SLI and ASD, one must consider both neurological and behavioral data as potentially informative. With regard to neurological evidence, structural atypicalities of the cerebellum, frontal lobe, and basal ganglia have been documented in both SLI (reviewed by Ullman, 2004) and ASD (e.g., Sears et al., 1999; Carper and Courchesne, 2000); however, evidence that these brain structures are atypical in ASD and SLI remains mixed (e.g., Brambilla et al., 2003; Mayes et al., 2015). Moreover, atypicalities in brain activity have not been well linked to behavioral evidence of impairments in statistical learning. For instance, a recent study reported that "high-functioning" youth with ASD exhibited less activity in the basal ganglia during an implicit language-learning task relative to youth without ASD, yet found no group differences in behavioral learning outcomes, with both groups performing at chance (Scott-Van Zeeland et al., 2010). Another recent study, which examined electrophysiological responses to visual statistical learning among young children with and without ASD, reported that young children with ASD as a group exhibited less neural evidence of statistical learning, yet did not include any behavioral assessments of learning outcomes (Jeste et al., 2015). In a study that only assessed behaviors and did not assess brain functioning, faster than normal use of grammatical rules (with no decrements in accuracy) by youth with ASD relative to youth without ASD was interpreted as evidence that atypicalities in the basal ganglia can lead to either speeding up or slowing down of performance on language tasks that depend on the procedural memory system (Walenski et al., 2014).
The behavioral evidence that statistical learning is actually impaired in ASD appears to be much weaker than the behavioral evidence that statistical learning is impaired in SLI (e.g., Nemeth et al., 2010; Lum et al., 2014). In fact, researchers have theorized that enhanced implicit learning skills might explain savant abilities in ASD (Mottron et al., 2006). Contradictory assertions concerning whether statistical learning is impaired in ASD or SLI, in conjunction with evidence that a subset of individuals with ASD exhibits a structural language profile that strongly resembles SLI (Kjelgaard and Tager-Flusberg, 2001; Tager-Flusberg, 2006), suggest that research is needed to evaluate whether statistical learning is an underlying impairment in both SLI and ASD. The meta-analyses described in this report were designed to address this question in order to help inform future interventions. Before describing the current study, we review findings from two recent meta-analyses of statistical learning in SLI and ASD.

Statistical Learning in SLI
The literature on statistical learning in SLI provides considerable support for Ullman and Pierpont's (2005) procedural deficit hypothesis (see Kemeny and Lukacs, 2010; Hedenius et al., 2011). Lum et al. (2014) used meta-analysis to evaluate whether impairments in statistical learning, as assessed using the Serial Reaction Time (SRT) task, constitute a core deficit in SLI. In a typical SRT task, stimuli appear at one of four positions on a computer screen with blocks of trials following either a fixed or random sequence. Participants are required to press buttons corresponding to the positions of stimuli as they appear. If learning of the fixed sequence of stimuli occurs, reaction times (RTs) will be significantly faster for trials in sequenced blocks than for trials in random blocks. Basing their methodology on a prior meta-analysis of learning deficits in the SRT task in patients with schizophrenia (Siegert et al., 2008), Lum et al. (2014) calculated effect sizes by assessing the difference between the mean RTs in the final sequenced block vs. the first random block. Lum et al. (2014) showed that 7 out of the 8 studies comparing SRT task performance of children with SLI with age-matched controls reported effects in the predicted direction, corresponding to impaired statistical learning in SLI, although only two reported statistically significant differences between groups, due to the small sample sizes of the individual studies contributing to low statistical power. Given the consistent direction of the effect across studies, the weighted average effect size (g = 0.33) indicated a statistically significant impairment in statistical learning among children with SLI relative to age-matched peers, in support of Ullman and Pierpont's (2005) procedural deficit hypothesis. Lum et al. (2014) limited their meta-analysis to consider performance on only a single statistical learning task.
However, the results of a handful of studies employing multiple measures of statistical learning suggest that performance across tasks is only weakly interrelated, and may not reflect a unified underlying capacity (Gebauer and Mackintosh, 2007; Misyak et al., 2010; Siegelman and Frost, 2015). These discrepancies may partially reflect the influence of task modality on statistical learning performance. Typically developing people exhibit a statistical learning advantage in the auditory domain relative to tactile and visual modalities (Conway and Christiansen, 2005). Although not universal, advantages in visual relative to auditory learning have been reported by people with ASD (Grandin, 1995). Thus, the current meta-analysis considered performance across a range of statistical learning tasks to determine the robustness of possible impairments in statistical learning in SLI and ASD across task modalities (visual vs. auditory).

Statistical Learning in ASD
Research examining statistical learning in ASD has reported mixed findings (e.g., Mostofsky et al., 2000; Smith, 2003; Gordon and Stark, 2007; Barnes et al., 2008; Brown et al., 2010; Nemeth et al., 2010). Foti et al. (2015) recently conducted three meta-analyses of implicit learning in ASD, with the first comparing effects across seven studies using Serial Reaction Time (SRT) or Alternating Serial Reaction Time (ASRT) tasks, the second comparing effects across four studies using the Contextual Cueing (CC) task, and the third comparing effects across two studies using the Pursuit Rotor (PR) task. Note that the SRT, ASRT, and CC tasks are considered to be measures of statistical learning, whereas the PR task is a measure of motor skill learning. In each of the meta-analyses, Foti et al. (2015) failed to find evidence that learning was impaired in individuals with ASD.
A limitation of the approach used by Foti et al. (2015) was their assessment of performance on the SRT and ASRT tasks. The authors examined reductions in RTs across sequenced blocks, as opposed to measuring RT differences for sequenced vs. random blocks, as is conventional (Nissen and Bullemer, 1987). The method adopted by Foti et al. (2015) is problematic because changes in RT over sequence blocks confound statistical learning with gains in perceptual and biomechanical efficiency at responding to visual stimuli. That is, RTs may become faster on the SRT and ASRT tasks because participants become faster at pressing a response box button following stimulus onset as opposed to acquiring information about the repeating sequence. For this reason, sequence-specific effects are examined by comparing changes in RT from sequenced vs. random blocks (Gordon and Stark, 2007; Lum et al., 2010, 2012; Travers et al., 2010; Gabriel et al., 2011, 2013; Lum and Bleses, 2012; Hsu and Bishop, 2014; Mayor-Dubois et al., 2015). Furthermore, results from several meta-analyses show that this latter approach is associated with basal ganglia functioning (Hardwick et al., 2013; Clark et al., 2014). Thus, the approach used by Foti et al. (2015) makes it impossible to compare their findings for ASD with existing meta-analyses demonstrating statistical learning impairments in other clinical populations (SLI: Lum et al., 2014; Dyslexia: Lum et al., 2013; Parkinson's Disease: Siegert et al., 2006; Clark et al., 2014; Schizophrenia: Siegert et al., 2008).
To draw conclusions about putative similarities or differences in statistical learning across disorders, researchers must use the same indices of learning across populations. Therefore, the current meta-analysis expanded upon the meta-analyses by Lum et al. (2014) and Foti et al. (2015) by employing the standard procedures for assessing learning and by including a broader range of statistical learning tasks. Given the limited number of studies, to increase statistical power to detect group differences and precision of effect size estimates, we entered effect sizes for multiple statistical learning tasks into the same meta-analysis while examining task modality as a moderator of effects.

THE CURRENT STUDY
We hypothesized that based on common linguistic and nonlinguistic challenges associated with SLI and ASD both disorders might share a common underlying deficit in statistical learning. The aims of the current meta-analysis were (1) to examine whether impairments in statistical learning are a shared challenge for individuals with SLI and ASD; and (2) to examine whether task modality and age moderated effect sizes. In the prior meta-analysis of performance on the SRT task in children with SLI, Lum et al. (2014) found age to be a significant moderator of effect sizes, with larger effects apparent in studies with younger participants. Hence, we used meta-regression to examine whether age moderated effects across statistical learning tasks in SLI and ASD. In an effort to replicate and extend Lum et al.'s (2014) findings of impaired statistical learning in SLI, we incorporated multiple commonly used measures of statistical learning, including Serial Reaction Time (SRT), Alternating Serial Reaction Time (ASRT), Contextual Cueing (CC), Artificial Grammar Learning (AGL), Observational Learning (OL) and Probabilistic Classification Learning (PCL). We also sought to re-evaluate Foti et al.'s (2015) claim that individuals with ASD do not show impaired statistical learning by using the standard procedure for measuring learning in SRT and ASRT tasks (Nissen and Bullemer, 1987).
Understanding whether SLI and ASD share an underlying processing deficit is essential for identifying potential common neural circuits that may contribute to a range of different developmental disorders; as such, the current research is well aligned with a recent shift toward identifying common pathways that may be implicated in a range of disorders (Geschwind, 2011; Ullman and Pullman, 2015). A better understanding of shared and unique mechanisms underlying different disorders may support the development of more effective interventions by indicating if interventions developed for one disorder are likely to be helpful for the other and by identifying specific treatment targets that may be shared across disorders or unique to each disorder. If SLI and ASD show varying patterns of statistical learning, this finding would suggest that common symptoms in both conditions likely arise from different underlying mechanisms.

Statistical Learning Tasks
Statistical learning tasks are typically designed so that co-occurrence patterns and ordering of stimuli are based on complex sets of rules. The next section describes, in detail, the tasks represented in this meta-analysis. To be included, studies needed to have a testing phase with learning assessed by comparing performance across sequenced vs. random/control trials, using either RT or accuracy as the dependent variable. Thus, the Pursuit Rotor task, for example, was excluded from the meta-analysis because it measures time-on-target across blocks and does not have a control condition.

Serial Reaction Time (SRT)
The SRT task, introduced by Nissen and Bullemer (1987), is widely used with clinical populations. In a standard SRT task, sequences of visual stimuli appear at one of four positions on a computer screen. Each position corresponds to a button on a pad; as each stimulus appears, the participant is required to press the corresponding button as quickly as possible. Across blocks of trials, stimuli may follow a fixed sequence that, through learning, leads participants to anticipate the location of each successive stimulus in the series. Learning is measured through reductions in RTs for blocks of trials following the fixed sequence, as compared to blocks of trials following a random sequence (Nissen and Bullemer, 1987; Robertson, 2007). The Alternating Serial Reaction Time Task (ASRT; Howard and Howard, 1997) is similar to the SRT task in many respects except that it inserts random items within the sequence of trials that follow a fixed order to reduce explicit knowledge or awareness of the recurring sequences (Brown et al., 2010; Nemeth et al., 2010).
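As a minimal sketch of the sequence-specific learning score described above, the following Python snippet subtracts the mean RT in a sequenced block from the mean RT in a random block. The function name and the RT values are hypothetical and serve only as illustration; they are not data from any study reviewed here.

```python
from statistics import mean


def sequence_learning_score(sequenced_rts, random_rts):
    """Return mean RT (ms) in the random block minus mean RT in the sequenced block.

    Positive values indicate faster responding to the fixed sequence,
    which is taken as evidence of sequence-specific learning.
    """
    return mean(random_rts) - mean(sequenced_rts)


# Hypothetical RTs in milliseconds (illustrative only).
final_sequenced_block = [410, 395, 402, 388, 405]
random_block = [455, 470, 448, 462, 465]

score = sequence_learning_score(final_sequenced_block, random_block)
print(f"Learning score: {score:.1f} ms")
```

Because the score contrasts two block types at a similar point in practice, general speeding from familiarity with the response pad affects both terms and largely cancels out.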

Contextual Cueing (CC)
The CC task is a visual search task where participants are required to locate a visual target (e.g., a rotated T shape) in a field of distractors (e.g., ∼10-12 rotated L shapes) (Chun and Jiang, 1998). Across blocks of trials, a fixed sequence of displays is used, with the location of the target (the rotated T shape) determined by the configuration of the distractors. Participants are required to press a key corresponding to the rotation of the target as quickly as possible. Similar to the SRT task, if learning is achieved, participants become faster in responding to targets in familiar configurations, where the target's location is fully predictable based on contextual cues, in comparison to random (baseline) configurations.

Artificial Grammar Learning (AGL)
The AGL task (Miller, 1958; Reber, 1967) involves presenting participants with meaningless auditory or visual sequences of stimuli (non-sense syllables, letters) that are generated by a complex set of rules (e.g., a finite-state grammar). Participants are instructed to memorize the sequences presented. After a period of exposure to a representative set of sequences generated by the grammar, learners are tested on their implicit learning of the underlying rules by means of a grammaticality judgment task in which familiar and unfamiliar sequences generated by the grammar are contrasted with ungrammatical foils (random sequences of the same stimuli), with accuracy used as the dependent measure.
Frontiers in Psychology | www.frontiersin.org

Speech-Stream (SS)
The SS task examines whether participants can use transitional probabilities between non-sense syllables (i.e., the conditional probability of one syllable following another) to detect word boundaries in continuous speech (Saffran et al., 1996). Participants briefly listen to a speech stream comprising three-syllable non-sense words concatenated into a random sequence, with each non-sense word occurring multiple times. The speech stream is synthesized to eliminate any cues to the word boundaries besides the recurring three-syllable sequences. To measure statistical learning, participants are tested on their ability to distinguish the recurring three-syllable non-sense words from other three-syllable sequences that occur less often in the speech stream (i.e., "part-word" sequences that span word boundaries), with accuracy as the dependent measure. Note that in some variants of the SS task, tones are used in place of syllables to evaluate statistical learning across different types of stimuli.
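To illustrate why transitional probabilities can signal word boundaries, the sketch below estimates them from a toy syllable stream. The non-sense words, their ordering, and the function name are hypothetical choices made for this example; they do not reproduce the stimuli of Saffran et al. (1996) or of any study in the meta-analysis.

```python
from collections import Counter


def transitional_probabilities(syllables):
    """Estimate P(next syllable | current syllable) for each adjacent pair."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    # Count each syllable only where it has a successor in the stream.
    first_counts = Counter(syllables[:-1])
    return {pair: n / first_counts[pair[0]] for pair, n in pair_counts.items()}


# Hypothetical three-syllable non-sense words (illustrative only).
words = {"A": ["bi", "da", "ku"], "B": ["pa", "do", "ti"], "C": ["go", "la", "bu"]}
order = ["A", "B", "C", "A", "C", "B", "A", "B", "C", "B", "A", "C"]
stream = [syllable for word in order for syllable in words[word]]

tp = transitional_probabilities(stream)
within_word = tp[("bi", "da")]    # transition inside a word
across_words = tp[("ku", "pa")]   # transition spanning a word boundary
print(within_word, across_words)
```

In this toy stream the within-word transition is perfectly predictable, whereas the boundary-spanning transition is not, which is the contrast that distinguishes words from part-words at test.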

Observational Learning (OL)
The OL task (Fiser and Aslin, 2001) examines statistical learning of shape co-occurrences in complex visual scenes. Stimuli are organized into "base pairs" comprising two arbitrary shapes in a particular spatial arrangement (vertical, horizontal, or diagonal). During the familiarization phase, participants are briefly exposed to a series of 3 × 3 arrays, with each array displaying several of the base pairs in various locations. Note that across arrays, the location of one shape within each base pair is fully determined by the location of the other shape within the pair. Participants are instructed to pay attention to the continuous sequence of arrays for a later test. During the two-alternative forced-choice test, base pairs from the familiarization trials are presented along with novel (random) pairs. Participants are asked to decide which of the two pairs seems more familiar, with accuracy in selecting the base pairs as the dependent measure.

Probabilistic Classification Learning (PCL)
In the PCL task, participants learn which of two outcomes is predicted by combinations of four different cues (Estes et al., 1957). The Weather Prediction task is a commonly employed PCL task where trials utilize four different geometrical shapes presented in various combinations on a computer screen. For each combination, comprising one to three of the geometrical shapes, participants are asked to predict whether the combination is associated with rain or sunshine. Participants respond by pushing one of two corresponding buttons and are shown the correct response as feedback to facilitate learning (Knowlton et al., 1994); note that this contrasts with other statistical learning tasks, where participants are given no feedback on their performance. Learning is typically measured by calculating the number of correct responses (learning the association between the cue and the outcome) across trials (Mayor-Dubois et al., 2015).

Criteria for Study Inclusion
Figures 1 and 2 provide flowcharts depicting, for each meta-analysis, the main steps of the literature search and selection of studies in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines (Moher et al., 2009). As a first step, we searched for published articles or dissertations on statistical learning in SLI and ASD using PsycINFO, Academic Search Complete, and Google Scholar with searches conducted periodically from June 2013 to March 2016. Computerized searches were conducted using the terms implicit learning, sequence learning, statistical learning, or procedural learning coupled with the terms: SLI, Language Impairment, ASD, or Autism. Having identified a large number of potential articles, we screened the abstracts to determine whether they were empirical studies on statistical learning. All studies that met this screening criterion were examined to determine whether they met eligibility criteria. Eligibility required the study to use a statistical learning task (see description of tasks) and to include one diagnostic group of individuals identified as language impaired or ASD, in addition to a control group. As a final step, we excluded SRT studies that did not include a random block of trials. We also excluded two studies with a general language/learning disabled group that was not explicitly identified as SLI (Fletcher et al., 2000; Plante et al., 2002).
For the SLI sample, a total of 14 studies (15 comparisons) were included. For the ASD sample, a total of 13 studies (20 comparisons) were included. All lead authors were contacted and asked to provide further statistical information and unpublished data, if available. No studies were lost due to missing data, and no additional unpublished data were identified. Tables 1 and 2 provide summaries of study participants and the tasks employed for SLI and ASD, respectively.

Meta-Analytic Procedures
Statistical analyses were conducted using the Comprehensive Meta-Analysis (CMA) program 2.0 (Borenstein et al., 2005). To examine overall effect size differences between the diagnostic groups (SLI and ASD) and controls, a random-effects model was used, which pools effect sizes from individual studies to create a weighted average effect size (Hedges, 1983). The I² statistic was used to estimate how much of the variability in effect sizes is due to heterogeneity between studies rather than sampling error (Huedo-Medina et al., 2006). Mixed-effects subgroup analyses were used to examine whether task modality moderated effect sizes.
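The pooling and heterogeneity steps described above can be sketched as follows. This is a generic illustration that assumes the DerSimonian-Laird estimator of between-study variance; the text does not state which estimator CMA applied, and the function name and input values are hypothetical.

```python
import math


def random_effects_pool(effects, variances):
    """Pool effect sizes under a random-effects model (DerSimonian-Laird, assumed)."""
    k = len(effects)
    w = [1.0 / v for v in variances]  # fixed-effect (inverse-variance) weights
    fixed_mean = sum(wi * g for wi, g in zip(w, effects)) / sum(w)
    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(wi * (g - fixed_mean) ** 2 for wi, g in zip(w, effects))
    df = k - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)  # between-study variance, truncated at zero
    w_star = [1.0 / (v + tau2) for v in variances]  # random-effects weights
    pooled = sum(wi * g for wi, g in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    # I2: percentage of total variability attributed to heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return {"pooled": pooled, "se": se, "Q": q, "tau2": tau2, "I2": i2}


# Hypothetical effect sizes and variances (illustrative only).
result = random_effects_pool([0.0, 1.0], [0.1, 0.1])
print(result)
```

When the estimated between-study variance is zero, the random-effects weights reduce to the fixed-effect weights and the two models give the same pooled estimate.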

Effect Size Extraction
As mentioned, the most widely accepted method for assessing learning in the SRT task involves examining whether a difference exists between the final sequenced block and the first random block (Nissen and Bullemer, 1987). Thus, when RT was the outcome measure, we were interested in whether there existed a significant Group (i.e., SLI or ASD vs. Control) × Condition (Sequenced vs. Random/Baseline) interaction. When accuracy was the outcome measure, we were interested in whether the groups differed in distinguishing grammatical from ungrammatical sequences (AGL task), three-syllable words from part-words (SS task), base pairs from random pairs (OL task), or using probabilistic cues to predict outcomes (PCL task).
Note that when multiple tasks were included in the same study, we computed an effect size for each task. Hence, studies with multiple tasks yielded multiple comparisons; these multiple comparisons were averaged together when conducting a meta-analysis at the level of studies. We used Hedges' g as the computed effect size measure. Note that positive g values indicate that the control group in the study showed higher statistical learning compared to the diagnostic group. This approach has been used in previous meta-analyses of SRT tasks (e.g., Siegert et al., 2006; Lum et al., 2014).
Data were extracted from each study so that an effect size and its variance could be computed. The effect size used for this meta-analysis was Hedges' g, which expresses the difference between two groups in standard deviation units. For each study the value was computed so that positive values indicated that the control group evidenced better statistical learning than a clinically defined group (ASD or SLI). The data used to compute Hedges' g were results from statistical tests, summary data presented in either tables or figures in the primary studies, or information obtained by contacting authors. Conversion of these data to Hedges' g was undertaken using CMA 2.0 (Borenstein et al., 2005).
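For readers unfamiliar with the measure, the sketch below computes Hedges' g and its variance from group means, standard deviations, and sample sizes, using the standard small-sample correction. It covers only the summary-data case; conversions from test statistics, which were handled by CMA, are not shown. The function name and input values are hypothetical.

```python
import math


def hedges_g(mean_control, sd_control, n_control, mean_clinical, sd_clinical, n_clinical):
    """Return (g, variance of g); positive g means the control group scored higher."""
    df = n_control + n_clinical - 2
    pooled_sd = math.sqrt(
        ((n_control - 1) * sd_control ** 2 + (n_clinical - 1) * sd_clinical ** 2) / df
    )
    d = (mean_control - mean_clinical) / pooled_sd  # standardized mean difference
    j = 1.0 - 3.0 / (4.0 * df - 1.0)                # small-sample correction factor
    var_d = (n_control + n_clinical) / (n_control * n_clinical) + d ** 2 / (
        2.0 * (n_control + n_clinical)
    )
    return j * d, j ** 2 * var_d


# Hypothetical learning scores in ms (illustrative only).
g, var_g = hedges_g(60.0, 20.0, 20, 50.0, 20.0, 20)
print(f"g = {g:.3f}, variance = {var_g:.4f}")
```

The correction factor is always slightly below 1, so g is slightly smaller in magnitude than the uncorrected standardized mean difference, with the gap shrinking as sample sizes grow.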
Prior to conducting the meta-analysis, we correlated the effect sizes that we extracted from our studies with those extracted from the same studies by Foti et al. (2015). A total of 6 studies were included in the correlational analysis of SRT and ASRT task performance. We found a marginally significant negative correlation between our effect size estimates and those calculated by Foti et al. (2015), r = −0.80, p = 0.06. This negative correlation suggests that the standard measure of statistical learning in SRT/ASRT tasks (reported here) is distinct from measuring changes in RT due to practice (reported in Foti et al., 2015).
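As a rough arithmetic check on the reported correlation, the snippet below converts r = −0.80 with 6 studies into a t statistic on n − 2 = 4 degrees of freedom, assuming a standard Pearson correlation test. The function name is hypothetical.

```python
import math


def correlation_t(r, n):
    """Return the t statistic and degrees of freedom for a Pearson correlation."""
    df = n - 2
    return r * math.sqrt(df) / math.sqrt(1.0 - r ** 2), df


t_stat, df = correlation_t(-0.80, 6)
print(f"t({df}) = {t_stat:.2f}")
```

The resulting value falls just short of the two-tailed critical value of 2.776 for 4 degrees of freedom at the 0.05 level, which is consistent with the marginal p value reported above.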

Moderators
The current meta-analysis incorporated multiple tasks as indices of statistical learning. Tasks varied in whether the stimuli were visual or auditory in nature. For this reason, we examined task modality (visual versus auditory) as a potential moderator in both the SLI and ASD data. To examine whether age moderated group differences in statistical learning, we conducted a meta-regression with age as a predictor variable. Age was entered in the analysis as the mean age for each group (Tables 1 and 2).
Figures 3 and 4 show funnel plots used for a preliminary assessment of possible publication bias in the SLI and ASD data, respectively. Funnel plots suggest publication bias when individual effect sizes are distributed asymmetrically around the weighted average effect size. Egger's test of asymmetry was not significant for either group [SLI: Intercept = 0.33, t(13) = 0.30, p = 0.77; ASD: Intercept = −0.20, t(18) = 0.14, p = 0.89], indicating no evidence of publication bias in the studies retrieved by our search.
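A minimal sketch of Egger's regression test is given below, assuming its usual form: each standardized effect (effect size divided by its standard error) is regressed on precision (the reciprocal of the standard error), and the intercept is tested against zero with k − 2 degrees of freedom. The function name and the input values are hypothetical, and this generic version is not claimed to match CMA's implementation in every detail.

```python
import math


def egger_test(effects, standard_errors):
    """Return (intercept, t statistic, degrees of freedom) for Egger's regression."""
    k = len(effects)
    x = [1.0 / se for se in standard_errors]                 # precision
    y = [g / se for g, se in zip(effects, standard_errors)]  # standardized effect
    x_bar = sum(x) / k
    y_bar = sum(y) / k
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    residual_ss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    df = k - 2
    # Standard error of the intercept in simple linear regression.
    se_intercept = math.sqrt((residual_ss / df) * (1.0 / k + x_bar ** 2 / sxx))
    return intercept, intercept / se_intercept, df


# Hypothetical effect sizes and standard errors (illustrative only).
effects = [0.2, 0.5, 0.3, 0.6, 0.4]
standard_errors = [0.1, 0.3, 0.15, 0.35, 0.2]
intercept, t_stat, df = egger_test(effects, standard_errors)
print(f"Intercept = {intercept:.2f}, t({df}) = {t_stat:.2f}")
```

With k − 2 degrees of freedom, the 15 SLI comparisons and 20 ASD comparisons give the t(13) and t(18) reported above.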

Meta-Analysis: Specific Language Impairment
A mixed-effects meta-analysis addressed the first aim of the study, to determine whether statistical learning is impaired in SLI, by extending Lum et al.'s (2014) findings using a larger dataset that was not restricted to the SRT task. Figure 5 shows a forest plot depicting effect sizes for each study and weighted averages for the SLI group. The results of the mixed-effects analysis examining statistical learning in SLI are reported in Table 3. Positive effect sizes indicate that the control group displayed higher learning compared to the SLI group. In line with Lum et al.'s (2014) meta-analysis, results showed a significant Hedges' g of 0.46 at the level of studies and 0.47 at the level of comparisons (p < 0.001), suggesting that participants with SLI show significant impairments in statistical learning compared to controls. A mixed-effects subgroup analysis was computed to examine whether impairments in statistical learning in SLI were moderated by task modality (auditory versus visual). This subgroup analysis was not significant, Q(1) = 1.36, p = 0.24, indicating that task modality did not moderate the effect sizes.

Meta-Analysis: Autism Spectrum Disorder
A mixed-effects analysis addressed the second aim of the study, to determine whether statistical learning is impaired in ASD. We extended Foti et al.'s (2015) meta-analysis by employing the standard measure of learning for the SRT/ASRT tasks and including our full set of statistical learning tasks. A forest plot depicting study effect sizes and weighted averages for the ASD group is presented in Figure 6. Table 4 presents the results of the mixed-effects analysis examining statistical learning in ASD, with positive effect sizes indicating higher learning in the control group relative to the ASD group. Hedges' g was not significant at the level of studies, g = −0.11, p = 0.30, or comparisons, g = −0.13, p = 0.22. This suggests that ASD participants did not differ significantly in learning when compared to controls, which is in line with the conclusion drawn by Foti et al. (2015). Mixed-effects subgroup analyses were computed to examine whether effect sizes varied significantly by task modality for the ASD group. This analysis indicated that task modality did not moderate the finding of intact statistical learning in ASD, Q(1) = 1.25, p = 0.26.

Between-Groups Meta-Analysis: SLI and ASD
To examine the final aim of this study, whether impairments in statistical learning are a common underlying deficit in individuals with SLI and ASD, we employed random effects models to examine overall effect size differences between groups (SLI versus ASD). Two meta-analyses were conducted to examine possible differences in statistical learning between SLI and ASD. The first analysis compared all retrieved SLI and ASD studies regardless of task, thus resulting in 14 studies (15 comparisons) for the SLI group and 13 studies (20 comparisons) for the ASD group. In the second analysis we matched the studies in both groups by task. For this analysis, we included only SRT, ASRT, AGL, SS, and PCL tasks, as those tasks were included in both SLI and ASD datasets. That is, CC and OL tasks for the ASD group were dropped from the analysis as none of our retrieved SLI studies used these tasks. This resulted in 14 studies (15 comparisons) for the SLI group and eight studies (12 comparisons) for the ASD group. Results of the first between-groups analysis showed that there was a significant difference in statistical learning between the SLI and ASD groups both at the level of studies, Q(1) = 15.54, p < 0.001, and at the level of comparisons, Q(1) = 17.84, p < 0.001. Such between-group differences remained robust even when matching on type of task (study level: Q(1) = 8.90, p = 0.003; comparison level: Q(1) = 10.90, p = 0.001). This finding suggests that while individuals with SLI show impairments in statistical learning, this ability appears to be intact in individuals with ASD.
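The between-groups comparison rests on a Q statistic that is referred to a chi-square distribution with one degree of freedom when two groups are compared. The sketch below shows one common way to compute such a statistic from subgroup summary effects and their standard errors, together with the corresponding p-value. The function names and the subgroup standard errors in the example are hypothetical, since subgroup standard errors are not reported in the text above.

```python
import math


def q_between(subgroup_means, subgroup_ses):
    """Between-subgroups Q statistic and its degrees of freedom."""
    w = [1.0 / se ** 2 for se in subgroup_ses]
    grand = sum(wi * m for wi, m in zip(w, subgroup_means)) / sum(w)
    q = sum(wi * (m - grand) ** 2 for wi, m in zip(w, subgroup_means))
    return q, len(subgroup_means) - 1


def chi2_sf_df1(q):
    """Upper-tail probability of a chi-square variable with 1 df."""
    return math.erfc(math.sqrt(q / 2.0))


# Hypothetical subgroup summaries (mean effect, standard error).
q, df = q_between([0.5, 0.1], [0.1, 0.1])
print(q, df, chi2_sf_df1(q))

# Converting reported Q(1) values to p-values.
for reported_q in (8.90, 10.90, 1.36, 1.25):
    print(reported_q, round(chi2_sf_df1(reported_q), 3))
```

Applied to the reported values, this conversion gives approximately 0.003 for Q(1) = 8.90 and 0.24 for Q(1) = 1.36, in agreement with the p-values stated in the text.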

Meta-Regression with Age as a Moderator
The final meta-analysis was a multivariate meta-regression to evaluate whether participants' ages predicted the effect sizes shown in Figures 5 and 6. The predictor variable in the analysis was participants' age while controlling for diagnostic group. Overall, the model was not significant; hence, age was not found to be a predictor of effect sizes, Q(1) = 0.65, R² = 0.26, p = 0.42. Table 5 shows a summary of the coefficients of the regression model.
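A simplified sketch of a meta-regression with a single continuous moderator is given below. It fits a weighted least-squares line with inverse-variance weights and tests the slope with a one-degree-of-freedom Q statistic. Unlike the analysis reported above, this sketch does not control for diagnostic group, and the between-study variance tau2 is supplied by the caller rather than estimated; the function name and example data are hypothetical.

```python
import math


def meta_regression(effects, variances, moderator, tau2=0.0):
    """Weighted regression of effect sizes on one moderator (e.g., mean age).

    Returns (slope, intercept, q_model, p_value), where q_model is the
    squared z statistic for the slope, referred to chi-square with 1 df.
    """
    w = [1.0 / (v + tau2) for v in variances]
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, moderator)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, moderator))
    sxy = sum(
        wi * (xi - x_bar) * (yi - y_bar)
        for wi, xi, yi in zip(w, moderator, effects)
    )
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    se_slope = math.sqrt(1.0 / sxx)  # sampling variances treated as known
    q_model = (slope / se_slope) ** 2
    p_value = math.erfc(math.sqrt(q_model / 2.0))
    return slope, intercept, q_model, p_value


# Hypothetical mean ages (years), effect sizes, and sampling variances.
ages = [6.0, 8.0, 10.0, 12.0]
effects = [0.40, 0.50, 0.60, 0.70]
variances = [0.04, 0.04, 0.04, 0.04]
print(meta_regression(effects, variances, ages))
```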

DISCUSSION
Deficits in statistical learning have been hypothesized to be present in SLI and ASD (Frith, 1970a,b;Ullman, 2004;Ullman and Pierpont, 2005;Walenski et al., 2006;Nicolson and Fawcett, 2007). The procedural deficit hypothesis, proposed by Ullman (2004) and Ullman and Pierpont (2005), claims that impairments in language development observed among individuals with SLI and those with ASD (specifically rule-based processes critical for phonological and grammatical development) may be explained by deficits in the neural networks that underpin procedural memory. However, the degree to which shared symptoms of ASD and SLI arise from common mechanisms remains disputed (Frith, 1970a,b;Mottron et al., 2006;Klinger et al., 2007;Nicolson and Fawcett, 2007;Williams et al., 2008;Lum et al., 2014;Foti et al., 2015). The current meta-analyses were designed to address conflicting speculations concerning whether statistical learning is a shared impairment among individuals with SLI and ASD. Findings suggest that statistical learning is commonly impaired in SLI, but not ASD. No evidence of moderation by task modality or age was observed. While our conclusion regarding intact learning in ASD was similar to Foti et al.'s (2015), the variables measured in the two meta-analyses were distinct and trended toward being negatively correlated. In Foti et al. (2015), nonsignificant group differences in implicit learning (assessed via decreased RTs over blocks of sequenced trials) favored the control group; in the current study, non-significant group differences in RTs for sequenced vs. random trials favored the ASD group.

Note: Nemeth et al. (2010) used three groups of participants; the results of the same participants in the ASD group were compared to separate age-matched and IQ-matched typically developing participants. For this reason, we divided the ASD sample by 2 when computing effect sizes for each group.
The prior meta-analysis examining performance on the SRT task in children with and without SLI (Lum et al., 2014) suggested that impairments in statistical learning may become weaker with age in participants ranging in age from 7 to 15 years. Given this prior finding and the different mean ages of participants in the ASD and SLI studies examined in this report, we conducted a meta-regression to assess whether age predicted variations in the computed effect sizes. We found no relationship between age and performance on tasks of statistical learning; thus, the age of the participants in the sample did not account for variations in the computed effect sizes. Therefore, we did not replicate the age effect reported by Lum et al. (2014), wherein implicit learning impairments in SLI became less apparent with age. Our inability to replicate may have arisen because the original effect was relatively weak (only significant with a one-tailed t-test). In addition, performance on different types of statistical learning tasks may develop differently with age. For instance, SRT performance tends to improve with age whereas ASRT performance tends to get worse with age (Janacsek et al., 2012).

Interpreting Group-Level Differences in Statistical Learning
The observed group-level difference in statistical learning in SLI and ASD suggests that the different manifestations of language impairments in each disorder stem from different underlying mechanisms. The current findings are consistent with the procedural deficit hypothesis of Ullman and Pierpont (2005) wherein impairments in statistical learning account for deficits in rule-based aspects of language, such as phonological, morphological, and syntactic processing, for SLI but not for ASD. Importantly, deficits in statistical learning may still be apparent in the subgroup of individuals with ASD who exhibit structural language difficulties reminiscent of SLI (Kjelgaard and Tager-Flusberg, 2001). We could not evaluate whether individuals with ASD with lower language abilities than comparison groups have impaired statistical learning due to the paucity of studies examining statistical learning among nonverbal "low-functioning" individuals with ASD and the lack of detailed information about language skills in most prior work on statistical learning in ASD. The one study in our meta-analysis that carefully documented language impairments among participants with ASD found evidence of impaired statistical learning in ASD relative to controls (Gordon and Stark, 2007). Findings suggest that a more focused alternative to Ullman's (2004) general procedural deficit hypothesis is needed wherein statistical learning impairments are not expected to be apparent across all developmental disorders but rather are expected to be apparent only among people who exhibit challenges in specific rule-based aspects of language. Deficits in each of the areas of language that would be expected to be impaired according to the procedural deficit hypothesis are hallmark characteristics of SLI (Schwartz, 2009).
In contrast, phonology, morphology, and syntax tend to be relatively intact in ASD, at least at later stages of language development (Williams et al., 2008;Boucher, 2012). Indeed, the two domains of language that are most commonly impaired in ASD, semantics and pragmatics, are either described by Ullman (2004) as primarily arising from declarative learning (e.g., semantics) or not discussed in either of his seminal papers about his procedural deficit hypothesis (e.g., pragmatics; Ullman, 2004;Ullman and Pierpont, 2005).
The current findings highlight the importance of examining associations between implicit learning, verbal and non-verbal pragmatic skills, and specific domains of language longitudinally using cross-lagged designs in order to understand the contributions of each to language development. Unfortunately, none of the studies included in our meta-analysis focused on the developmental stage when the linguistic profiles of individuals with ASD and SLI are presumed to be most similar; difficulties with syntax and articulation are apparent early in development in ASD but typically resolve by the school-age years (Boucher, 2012). Future longitudinal research should be conducted with individuals with ASD or SLI beginning in preschool in order to identify potential commonalities that are apparent at that developmental stage, but not later. Such research could examine the hypothesis that early commonalities in language profiles across ASD and SLI are attributable to different underlying mechanisms, which yield reduced opportunities to learn language among children with ASD (due to poor joint attention and coordinated social engagement) versus reduced retention of information from such opportunities among children with SLI (due to reduced capacities in statistical learning and/or verbal working memory).
Evaluating how statistical learning might contribute to language impairments is complicated by the variety of tasks used to assess statistical learning. We assumed for the purpose of group-level meta-analyses that the various tasks measured the same underlying construct. However, statistical learning is complex and may not represent a single construct (Erickson and Thiessen, 2015;Siegelman and Frost, 2015), and some tasks (or task variants) allow learners to rely on explicit strategies, such as chunking sequences of elements in memory, to achieve apparent success in statistical learning (cf. Nemeth et al., 2010). Although we found no moderating effect of task modality on effect sizes for either SLI or ASD groups, we cannot rule out the possibility that different tasks relate to language outcomes in fundamentally different ways.
To understand relationships between statistical learning and language impairments, one needs to look not only at group-level differences, but also at relationships between statistical learning and individual differences in different aspects of language development and processing. Such correlational designs should control for other variables, such as non-verbal (fluid) intelligence, that are likely to impact performance on a broad range of tasks. Perhaps due to claims that implicit forms of learning are robust across populations differing widely in age and intelligence (e.g., Reber, 1993;Stanovich et al., 2009), studies focusing on individual differences in statistical learning in relation to language outcomes are still relatively few in number. In the following subsections, we review this research in order to highlight its implications for distinguishing the impairments associated with SLI and ASD.

Statistical Learning in Relation to Grammar, Phonology, and Reading
Under the procedural deficit hypothesis, statistical learning is presumed to play a critical role in the mastery of rule-based aspects of language, such as grammar (morphology and syntax) and phonology. Several studies have explored the putative relationship between statistical learning and grammatical development with mixed results. In a study involving typically developing 4-6-year-olds, Kidd (2012) linked performance on the SRT task with syntactic priming, i.e., increased likelihood of producing complex passive-voice sentences (e.g., the guitar was played by the man) as descriptions of pictures after hearing another person use the passive-voice construction as a description of a different scene. Similarly, in a study of typically developing children of ages 6-8 years, Kidd and Arciuli (2016) linked performance on a visual statistical learning task with comprehension of complex sentence structures (passives and object relative clauses). In contrast, two other studies failed to find a relationship between SRT task performance and morphology acquisition, i.e., rule-based production of past-tense forms for regular and novel verbs, in Finnish children between 4 and 7 years of age (Kidd and Kirjavainen, 2011) and in English-speaking children at around 5 years of age (Lum and Kidd, 2012).
To date, only one study has linked individual differences in statistical learning directly to aspects of phonological processing. Using the SS task with a group of 8-12-year-olds (half with SLI, half age-matched controls), Mainela-Arnold and Evans (2014) found statistical learning to predict the extent to which children experienced intrusions from phonologically related words in a spoken word recognition task. This study used a gating procedure in which children heard progressively longer fragments of words (starting from the word onset) and attempted to identify the words based on partial information. Poor performance on the SS task was associated with greater lexical-phonological competition in the word recognition task. In contrast, performance on the SS task was unrelated to the richness of children's semantic representations, as indexed by a word definition task.
Other evidence suggesting a relationship between phonological processing and statistical learning comes from studies of individual differences in reading, a process that relies on phonological awareness to achieve fluency in decoding letter sequences into sound patterns. In support of the procedural deficit hypothesis, Arciuli and Simpson (2012) reported correlations between performance on a visual analog of the SS task and reading ability in a group of 6- to 12-year-old children (N = 38) as well as in an adult sample (N = 37). In a recent meta-analysis, Lum et al. (2013) synthesized results of 14 studies comparing SRT task performance in participants with dyslexia (N = 314) and age-matched controls (N = 317) and found robust evidence of a deficit in implicit learning associated with dyslexia, g = 0.45, 95% CI [0.20, 0.69], p < 0.001. These studies appear to contradict an earlier large-scale study of 422 children of ages 7-11 years (Waber et al., 2003), wherein SRT performance failed to distinguish good and poor readers. Given the complexity of learning to read, and its reliance on other aspects of language development such as vocabulary growth, additional research is required to elucidate how specific components of reading, such as the acquisition of grapheme-to-phoneme correspondence rules, might be linked to underlying statistical learning mechanisms.
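As a worked check on how an effect size, its confidence interval, and its p-value relate, the sketch below recovers an approximate standard error from the 95% confidence interval quoted above and converts the resulting z statistic to a two-sided p-value. It assumes a normal approximation with a symmetric interval; the function names are hypothetical.

```python
import math


def se_from_ci(lower, upper, z_crit=1.959964):
    """Approximate standard error implied by a symmetric 95% CI."""
    return (upper - lower) / (2.0 * z_crit)


def two_sided_p(estimate, se):
    """Return (z, two-sided p-value) under a normal approximation."""
    z = estimate / se
    return z, math.erfc(abs(z) / math.sqrt(2.0))


# Values quoted in the text: g = 0.45, 95% CI [0.20, 0.69].
se = se_from_ci(0.20, 0.69)
z, p = two_sided_p(0.45, se)
print(round(se, 3), round(z, 2), p < 0.001)
```

The implied standard error is about 0.125 and the z statistic about 3.6, which is consistent with the reported p < 0.001.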

Statistical Learning in Relation to Vocabulary Development
The procedural deficit hypothesis views vocabulary development as a relative strength in SLI due to its reliance on declarative as opposed to procedural memory (Ullman and Pullman, 2015). This position contrasts with the perspectives of infancy researchers focusing on the problem of word segmentation in relation to vocabulary acquisition (e.g., Romberg and Saffran, 2010;Erickson and Thiessen, 2015). The SS task (Saffran et al., 1996) originated as an experimental demonstration that infants could extract word forms from continuous speech solely on the basis of syllable co-occurrence statistics. Extracting word forms is considered to be a prerequisite to associating them with their referents, i.e., learning the meanings of the words. Indeed, several studies using the SS task have demonstrated links between the output of statistical learning, i.e., the identification of word forms, and subsequent mapping of the word forms onto referents by children (Estes et al., 2007) and adults (Mirman et al., 2008).
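The syllable co-occurrence statistics exploited in the SS task are commonly summarized as transitional probabilities, i.e., the probability that one syllable follows another. The sketch below computes them for a short continuous stream built from three made-up trisyllabic words; the words, their order, and the function name are illustrative assumptions and are not the stimuli of any study cited here. Transitions inside a word have a probability of 1.0, whereas transitions across word boundaries are less predictable, which is the cue a learner could use to locate word boundaries.

```python
from collections import Counter


def transitional_probabilities(syllables):
    """Map each adjacent pair (x, y) to P(y | x) within the stream."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    # Every syllable except the last one starts exactly one pair.
    first_counts = Counter(syllables[:-1])
    return {pair: n / first_counts[pair[0]] for pair, n in pair_counts.items()}


# Made-up nonsense words and a fixed presentation order (illustrative only).
words = [("ma", "fo", "ki"), ("tu", "se", "lo"), ("ri", "ga", "ne")]
order = [0, 1, 2, 1, 0, 2, 0, 1, 2, 0]
stream = [syllable for index in order for syllable in words[index]]

tps = transitional_probabilities(stream)
print(tps[("ma", "fo")])  # within-word transition
print(tps[("ki", "tu")])  # transition across a word boundary
```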
If language learners vary with respect to the efficiency of the underlying word segmentation process, this should impact the growth of their vocabularies. Using a visual sequence learning (VSL) task in which 8.5-month-old infants were exposed to three-element sequences of visual images appearing in predictable spatial-temporal sequences (e.g., left-center-right, left-center-right, left-center-right), Shafto et al. (2012) demonstrated links between infants' ability to predict the location of the next element in the sequence and their receptive vocabulary size, measured using the MacArthur-Bates Communicative Development Inventories (CDI; Fenson et al., 2006). Infants who were faster to look at images occurring in predictable, as opposed to random, locations had greater vocabulary comprehension (vocabulary production was not assessed) at the time of the test than infants who responded at chance on the VSL task. Another study linked performance on the SS task to individual differences in receptive and expressive vocabulary in children with typical language development (age 6-14 years). For children with SLI, individual differences in performance on the SS task predicted receptive (but not expressive) vocabulary; furthermore, this association was only apparent after prolonged exposure to the SS task, when performance was no longer at chance. Although more work is needed to identify the contribution of statistical learning to lexical-semantic development, these findings suggest that Ullman's procedural deficit hypothesis may need to be broadened to recognize greater contributions of statistical learning to lexical development than his theory initially accounted for.

Alternative Accounts of Language Impairments in ASD
Our finding that individuals with ASD have intact statistical learning suggests that social-communicative deficits associated with ASD cannot be explained by an underlying deficit in statistical learning. In contrast to evidence linking individual differences in statistical learning to language development in the domains of grammar, phonology, and vocabulary, we are not aware of any studies that have demonstrated associations between implicit learning and pragmatic language development, which is the area of language development most impacted by ASD. Although statistical learning might contribute to the development of semantics, the other language domain that is most commonly impacted in ASD (Boucher, 2012), via word learning (Shafto et al., 2012), the current findings suggest that semantic deficits associated with ASD are unlikely to arise from an impairment in statistical learning. Indeed, prior research suggests that semantic deficits in ASD are more likely to arise through their associations with core social-cognitive symptoms of ASD, such as reduced joint attention, as early words are often learned in contexts where children are able to coordinate their attention and interests with caregivers (Baron-Cohen et al., 1997;Adamson et al., 2009).
Although Frith (1970a,b) was one of the first researchers to document statistical learning impairments in ASD, she has stated in more recent work that pragmatic and semantic language impairments associated with ASD likely arise from social-cognitive difficulties in understanding other people's perspectives (Perner et al., 1989;Frith and Happé, 1994). Difficulties in sharing attention and perspective taking may in turn arise from nonsocial atypicalities, such as difficulties with motor coordination (Gernsbacher et al., 2008) and/or reductions in interactional synchrony arising from atypicalities of time perception among individuals with ASD (Wimpory et al., 2002;Szelag et al., 2004).
Atypical timing of responses may also contribute to variations in performance on statistical learning tasks among individuals with ASD. In a classical eye-blink conditioning paradigm, wherein a tone is paired with an air puff to the eye (Clark and Squire, 1998), Sears et al. (1994) documented rapid classical conditioning in participants with ASD, who required significantly fewer trials than controls to associate the tone with the air puff. However, participants with ASD showed abnormalities in the timing of their responses. They blinked more rapidly after hearing the tone than controls, and more often re-opened their eyes before the air puff, and then blinked again. This atypical response topography suggests that the ASD group had difficulties anticipating the exact timing of the air puff and were unable to modulate their responses accordingly. However, as other research suggests that timing may not be atypical among individuals with ASD (Wallace and Happé, 2008), additional studies are needed in order to draw firm conclusions.

CONCLUSION, FUTURE DIRECTIONS, AND IMPLICATIONS FOR TREATMENT
The main finding of this report, that SLI, but not ASD, is associated with deficits in statistical learning, suggests that the language and communicative difficulties associated with each disorder have distinct underlying mechanisms. These results support the procedural deficit hypothesis with implications for the diagnosis and treatment of SLI, but suggest an alternative account is needed to explain the social and pragmatic difficulties associated with ASD. Core social symptoms of ASD, such as reduced joint attention and difficulty understanding others' perspectives, likely contribute to semantic and pragmatic difficulties among individuals with ASD (Frith and Happé, 1994;Adamson et al., 2009). However, additional research is needed to evaluate statistical learning in individuals with ASD with varying language abilities, including participants who may have language impairments similar to those associated with SLI. It is a major limitation of the field that only one study to date has examined statistical learning in "low-functioning" individuals with ASD with presumably weak language abilities. In addition to sampling individuals across the full spectrum of ASD, it would be fruitful for future studies to utilize a broader range of implicit learning paradigms, such as syntactic priming (cf. Garraffa et al., 2015), and to consider attentional control as an additional factor that may distinguish children with ASD and SLI (Norbury, 2014).
Future work on statistical learning in SLI should involve longitudinal investigations of late-talking toddlers at risk for SLI to determine whether age-appropriate measures of statistical learning, such as the SS task (Saffran et al., 1996), the AGL task (Gómez and Gerken, 1999), and the VSL task (Shafto et al., 2012), are predictive of individual differences in language outcomes in vocabulary, phonology, and grammar. Such studies will inform decisions as to whether statistical learning should be a direct target for intervention. If impaired statistical learning proves to be an early clinical marker of SLI, behavioral interventions should be designed to help children with SLI develop pattern extraction and integration skills (Erickson and Thiessen, 2015) and/or compensatory strategies (Ullman and Pullman, 2015). Research suggests that children with SLI may experience a rapid decay rate of auditory traces of speech in short-term memory (McMurray et al., 2010), may need a greater amount of exposure to extract recurrent patterns in auditory input, and may struggle with consolidating implicitly learned information over time (Hedenius et al., 2011). Experimental manipulations that increase the availability of redundant cues to linguistic structure have been shown to facilitate pattern extraction and generalization in infants, children, and adults (e.g., Brooks et al., 1993;Gerken et al., 2005). Similarly, word-learning studies suggest that inter-sensory redundancy and temporal synchrony between faces and voices, and between speech and gesture, aid speech perception and word-to-world mapping (cf. Gogate and Hollich, 2010, for a review). Although few studies have evaluated the putative benefits of providing redundant cues in intervention, in a promising line of research with a computer-generated avatar, Massaro and Bosseler (2006) provide evidence that attending to faces enhances speech perception in children with ASD, with benefits for vocabulary growth. Whether similar computer-based programs can be developed to help children with SLI extract and generalize statistical patterns in speech remains to be seen.

AUTHOR CONTRIBUTIONS
RO played a significant role in all aspects of this project, including contributions to the design of the work; the acquisition, entry, analysis, and interpretation of the data; the write-up; drafting and revising the work; and submission. PB played a significant role in all aspects of this project, including major contributions to the design of the work, data analysis, and data interpretation, as well as the write-up, revising the work, and final approval for publication. KP played major roles in the design of the work; the acquisition, analysis, entry, and interpretation of the data; and the write-up. KG-L played a significant role in many aspects of this project, including contributions to the design of the work, data interpretation, the write-up, revising the work, and final approval for publication. JL played a significant role in all aspects of this project, including major contributions to the design of the work, data analysis, data entry, and data interpretation, as well as the write-up, revising the work, and final approval for publication. All authors worked as collaborators and ensure accountability, integrity, and accuracy in the work.

FUNDING
This study was funded by the Doctoral Student Research Grant (DSRG) at the Graduate Center, CUNY.