# INVESTIGATING GRAMMAR IN AUTISM SPECTRUM DISORDERS

EDITED BY: Anna Gavarró and Stephanie Durrleman

PUBLISHED IN: Frontiers in Psychology

### Frontiers Copyright Statement

© Copyright 2007-2018 Frontiers Media SA. All rights reserved. All content included on this site, such as text, graphics, logos, button icons, images, video/audio clips, downloads, data compilations and software, is the property of or is licensed to Frontiers Media SA ("Frontiers") or its licensees and/or subcontractors. The copyright in the text of individual articles is the property of their respective authors, subject to a license granted to Frontiers.

The compilation of articles constituting this e-book, wherever published, as well as the compilation of all other content on this site, is the exclusive property of Frontiers. For the conditions for downloading and copying of e-books from Frontiers' website, please see the Terms for Website Use. If purchasing Frontiers e-books from other websites or sources, the conditions of the website concerned apply.

Images and graphics not forming part of user-contributed materials may not be downloaded or copied without permission.

Individual articles may be downloaded and reproduced in accordance with the principles of the CC-BY licence subject to any copyright or other notices. They may not be re-sold as an e-book.

As author or other contributor you grant a CC-BY licence to others to reproduce your articles, including any graphics and third-party materials supplied by you, in accordance with the Conditions for Website Use and subject to any copyright notices which you include in connection with your articles and materials.

All copyright, and all rights therein, are protected by national and international copyright laws.

The above represents a summary only. For the full conditions see the Conditions for Authors and the Conditions for Website Use.

ISSN 1664-8714

ISBN 978-2-88945-549-2

DOI 10.3389/978-2-88945-549-2



Topic Editors:

Anna Gavarró, Universitat Autònoma de Barcelona, Spain

Stephanie Durrleman, Université de Genève, Switzerland

Autism Spectrum Disorder (ASD hereafter) is a neurodevelopmental condition characterized by deficits in communicative and social skills. The vast majority of research on language in ASD has focused on pragmatic difficulties, while less is known about structural aspects of language in this population. Work on syntax and phonology is sparse, and the heterogeneity in these grammatical domains has led to conflicting reports that they are either intact or impaired. More remains to be understood about variations in grammatical profiles in ASD, as well as the relation of grammar to other cognitive abilities.

The body of research gathered here increases our understanding of the grammatical strengths and weaknesses in ASD. The contributions carefully elucidate the relations between grammar and other areas of cognition, as well as unveil the similarities and differences of grammar in ASD compared to other conditions. The result is a volume that provides new ways to think about language and communication in ASD, and beyond, which should be of interest to both linguists and clinicians.

Citation: Gavarró, A., Durrleman, S., eds. (2018). Investigating Grammar in Autism Spectrum Disorders. Lausanne: Frontiers Media. doi: 10.3389/978-2-88945-549-2

# Table of Contents


*Grammar Is Differentially Impaired in Subgroups of Autism Spectrum Disorders: Evidence from an Investigation of Tense Marking and Morphosyntax*

Nadezhda Modyanova, Alexandra Perovic and Ken Wexler

*ASD Is Not DLI: Individuals With Autism and Individuals With Syntactic DLI Show Similar Performance Level in Syntactic Tasks, but Different Error Patterns*

Nufar Sukenik and Naama Friedmann

Vikki Janke and Alexandra Perovic

Clara Andrés-Roqueta and Napoleon Katsos

# Editorial: Investigating Grammar in Autism Spectrum Disorders

### Stephanie Durrleman<sup>1</sup> \* and Anna Gavarró<sup>2</sup> \*

<sup>1</sup> Département de Psycholinguistique, Université de Genève, Geneva, Switzerland, <sup>2</sup> Departament de Filologia Catalana, Autonomous University of Barcelona, Barcelona, Spain

Keywords: autism spectrum disorders, syntax, finiteness, DLI, c-command, pragmatics, Wh- movement, control

### **Editorial on the Research Topic**

### **Investigating Grammar in Autism Spectrum Disorders**

Autism Spectrum Disorder (ASD hereafter) is a neurodevelopmental condition characterized by deficits in communicative and social skills. The vast majority of research on language in ASD has focused on pragmatic difficulties, while less is known about structural aspects of language in this population. Work on syntax and phonology is sparse, and the heterogeneity in these grammatical domains has led to conflicting reports that they are either intact or impaired. More remains to be understood about variations in grammatical profiles in ASD, as well as the relation of grammar to other cognitive abilities.

### Edited by:

Alain Morin, Mount Royal University, Canada

### Reviewed by:

Markus Paulus, Ludwig-Maximilians-Universität München, Germany

### \*Correspondence:

Stephanie Durrleman, stephanie.durrleman@unige.ch

Anna Gavarró, anna.gavarro@uab.cat

### Specialty section:

This article was submitted to Cognitive Science, a section of the journal Frontiers in Psychology

Received: 24 April 2018

Accepted: 30 May 2018

Published: 19 June 2018

### Citation:

Durrleman S and Gavarró A (2018) Editorial: Investigating Grammar in Autism Spectrum Disorders. Front. Psychol. 9:1004. doi: 10.3389/fpsyg.2018.01004

The purpose of this Frontiers Research Topic is to bring together investigations of grammar in ASD that suggest novel, meaningful ways to parse the associated heterogeneity. Topics addressed include experimental investigations of domains delayed in Developmental Language Impairment (DLI), comparisons of the grammatical profiles of ASD with those of other language-impaired populations, careful analyses of subgroups, and the grammar-cognition interface.

Regarding domains delayed in DLI, the paper by Modyanova et al. focuses on the production of tense marking in a large study with Language-Impaired (ALI) and Language-Normal (ALN) English-speaking children with ASD. As a general finding, ASD children show no problem with subject-verb agreement or case, indicating that impairment does not affect syntax in a broad sense. The authors conclude that, while the ALN are not different from their verbal- and non-verbal-matched controls, the ALI are indeed impaired in tense production, even more severely than in the DLI population for which tense marking is well-established as a marker of impairment.

In this same vein, Sukenik and Friedmann investigate movement to non-argumental positions in ASD and DLI by means of subject and object relative clause elicitation, reading and rephrasing object relatives, and sentence repetition. While the results for the two populations appear similar, closer scrutiny shows that the errors of ASD and DLI participants differ in nature: the error pattern is syntactically driven and arises consistently in individuals with DLI, but not in those with ASD. This result challenges the claims for a common source of language deficits in the two pathologies.

Khetrapal and Thornton examine linguistic competence in ASD via experiments tapping into knowledge of the structural relation of c-command. Previous work had suggested that children on the spectrum exhibited difficulties with this structural constraint, given that they struggled with reflexives. The high-functioning children in the current study, however, performed on a par with typically developing (TD) peers in computing both operator scope (negation and disjunction) and binding (reflexives), allowing the authors to conclude that the hierarchical relation of c-command is intact in high-functioning children with ASD.


Some contributions revisit the issue of pragmatic difficulty alongside structural language in ASD. Andrés-Roqueta and Katsos explain that the seemingly contradictory reports on pragmatic competence in ASD make sense once we separate linguistic pragmatics from social pragmatics. Linguistic pragmatics would be required by certain tasks assessing informativeness, metaphors, and idioms, and affected to the extent that structural language and vocabulary are impaired. Social pragmatics involves the ability to take others' perspectives into account and is affected to the extent that there is also a Theory of Mind (ToM) deficit.

Similarly, the paper by Janke and Perovic studies the interpretation of sentences with control in three conditions: complement control and temporal adjunct control, both syntactic dependencies, alongside controlled verbal gerund subjects, a pragmatic dependency. The two groups tested, high-functioning ASD children and a TD control group, performed in the same way in complement control and controlled verbal gerund subjects, and only marginally differently in temporal adjunct control, showing that syntax is unimpaired and pragmatics is not pervasively impaired.

Jyotishi et al. turn to wh-questions and argue that the lag in comprehension of these structures in ASD seems in part grammatical and in part social-pragmatic. It appears partially grammatical because in their study (i) it is observable despite a procedure itself reducing social/pragmatic demands and (ii) improved performance on wh-questions is predicted by higher performance on SVO word order. At the same time, social-pragmatic scores also play a role in predicting both ASD and TD groups' later comprehension of wh-questions.

Peristeri et al. investigate narrative production in high-functioning children with ASD to reveal that higher linguistic abilities of some groups with ASD boost their narrative skills, both in syntactic and pragmatic domains. However, persistent difficulties in certain pragmatic domains can be observed alongside good language skills, suggesting that the latter do not necessarily allow children with ASD to overcome their pragmatic challenges.

In an attempt to elucidate the linguistic heterogeneity in ASD, Wittke et al. explore different language subtypes in a large group of 5-year-old children with ASD. Going beyond standardized tests for defining language groups, the authors also probe grammatical problems via a detailed analysis of natural language samples. The findings suggest the presence of several linguistic subtypes, ranging from intact language to minimally verbal. The identification of children showing high non-verbal reasoning and vocabulary in the presence of low grammatical abilities provides support for a specific impairment in grammar in this ASD subgroup.

Burnel et al. address the language/cognition interface by assessing belief attribution in neurotypical adults and adults with Asperger Syndrome (AS). In their results, neither neurotypicals nor those with AS were significantly affected by verbal shadowing; however, adults with AS performed more slowly than neurotypicals, and were more disrupted in ToM tasks when asked to repeat complement clauses than relative clauses. The findings suggest that ToM reasoning in adults with AS involves compensation because of persistent ToM difficulties, a compensation which may specifically solicit complementation syntax.

The body of research gathered here increases our understanding of the grammatical strengths and weaknesses in ASD. The contributions carefully elucidate the relations between grammar and other areas of cognition, as well as unveil the similarities and differences of grammar in ASD compared to other conditions. The result is a volume that provides new ways to think about language and communication in ASD, and beyond, which should be of interest to both linguists and clinicians.

# AUTHOR CONTRIBUTIONS

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

# ACKNOWLEDGMENTS

AG acknowledges the support of grants FFI2014-56968-C4-1-P and FFI2017-87699-P.

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Durrleman and Gavarró. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Grammar Is Differentially Impaired in Subgroups of Autism Spectrum Disorders: Evidence from an Investigation of Tense Marking and Morphosyntax

### Nadezhda Modyanova<sup>1</sup> \*, Alexandra Perovic<sup>2</sup> \* and Ken Wexler<sup>3</sup>

<sup>1</sup> INTERLINK Language Center, Montana State University, Bozeman, MT, USA, <sup>2</sup> Department of Linguistics, Psychology and Language Sciences, University College London, London, UK, <sup>3</sup> Department of Brain and Cognitive Sciences and Department of Linguistics and Philosophy, Massachusetts Institute of Technology, Cambridge, MA, USA

### Edited by:

Anna Gavarró, Autonomous University of Barcelona, Spain

### Reviewed by:

Rosalind Jean Thornton, Macquarie University, Australia

Maria Garraffa, Heriot-Watt University, UK

### \*Correspondence:

Nadezhda Modyanova, nnm@alum.mit.edu

Alexandra Perovic, a.perovic@ucl.ac.uk

### Specialty section:

This article was submitted to Language Sciences, a section of the journal Frontiers in Psychology

Received: 19 November 2016

Accepted: 20 February 2017

Published: 28 March 2017

### Citation:

Modyanova N, Perovic A and Wexler K (2017) Grammar Is Differentially Impaired in Subgroups of Autism Spectrum Disorders: Evidence from an Investigation of Tense Marking and Morphosyntax. Front. Psychol. 8:320. doi: 10.3389/fpsyg.2017.00320

Deficits in the production of verbal inflection (tense marking, or finiteness) are part of the Optional Infinitive (OI) stage of typical grammatical development. They are also a hallmark of language impairment: they have been used as biomarkers in guiding genetic studies of Specific Language Impairment (SLI), and have also been observed in autism spectrum disorders (ASD). To determine the detailed nature of finiteness abilities in subgroups of ASD [autism with impaired language (ALI) vs. autism with normal language (ALN)], we compared tense marking abilities in 46 children with ALI and 37 children with ALN with those of two groups of nonverbal mental age (MA) and verbal MA-matched typically developing (TD) controls, the first such study described in the literature. Our participants' performance on two elicited production tasks, probing third-person-singular -s and past tense -ed, from the Rice/Wexler Test of Early Grammatical Impairment (TEGI, Rice and Wexler, 2001), revealed extensive deficits in the ALI group: their ability to correctly mark tense was significantly worse than their much younger TD controls', and significantly worse than that of the ALN group. In contrast, the ALN group performed similarly to their TD controls. We found good knowledge of the meaning of tense, and of case and agreement, in both ASD groups. Similarly, both ASD groups showed distributions of null or overt subjects with nonfinite and finite verbs in line with those found in young TD children. A key difference, however, was that the ALI group used (rather than simply omitted) the wrong tense in some sentences, a feature not reported in the OI stage for TD or SLI children.

Our results confirm a clear distinction in the morphosyntactic abilities of the two subgroups of children with ASD: the language system responsible for finiteness in the ALN group seems to be functioning comparably to that of the TD children, whereas the ALI group, despite showing knowledge of case and agreement, seems to experience an extensive grammatical deficit with respect to finiteness which does not seem to improve with age. Crucially, our ALI group seems to have worse grammatical abilities even than those reported for SLI.

Keywords: autism, language development, language impairment, verbal inflection, optional infinitives, finiteness, morphosyntax


# INTRODUCTION<sup>1</sup>

One key difference between adult and child grammar, according to Wexler (e.g., 1994, 1998, 2003, 2004a, 2011), is that for typically developing (TD) children of a certain age, sentences may optionally be finite (with tense markers) or nonfinite (with infinitival form of the verb), hence the name Optional Infinitive (OI) Stage. Typical errors are illustrated below, where the child omits the present tense inflection -s with a regular verb in (1), or produces the actual infinitive form of an irregular verb, as in (2).


At the same time that TD children produce OIs, they demonstrate knowledge of the difference between nonfinite and finite verbs, as well as other important aspects of morphosyntactic structure (see Wexler, 1994, and Guasti, 2017, for crosslinguistic evidence). For example, in English, children's nonfinite verbs often occur with default case subject pronouns (accusative in English), and finite verbs predominantly appear with nominative case pronouns (Schütze and Wexler, 1996). Children of this age also know subject-verb agreement; they nearly always produce a third person singular subject when -s appears on the verb (e.g., Rice et al., 1995; Harris and Wexler, 1996). Children also do not misuse tense (i.e., they do not use present in the past tense context, and vice versa) (Rice et al., 1995; Schütze and Wexler, 2000). Finally, young children produce null subjects (i.e., they omit overt subjects from their sentences), and they do so more often with nonfinite verbs than with finite verbs (see Hyams, 2011; Wexler, 2013).

Difficulties with finiteness have been observed in other developmental disorders, most notably in Specific Language Impairment (SLI). Children with SLI are found to be considerably delayed in finiteness relative to both their TD language- and age-matched controls, and this phenomenon was termed the Extended Optional Infinitive (EOI) stage (Rice et al., 1995, 1998, 1999, 2000, 2009a; Rice and Wexler, 1996; Wexler, 1996; Wexler et al., 1998; among others). The work on EOI culminated in the creation of a diagnostic elicited production test, the Rice/Wexler Test of Early Grammatical Impairment (TEGI, Rice and Wexler, 2001), which we use in the present study.

Our paper addresses whether children with ASD show the pattern of morphosyntactic phenomena associated with the OI stage in TD and with the EOI stage in SLI so that we can answer whether children with ASD are in the EOI stage. This topic is particularly interesting in view of the current controversies on whether similarities in linguistic profiles of SLI and ASD are indicative of links between these two populations. Some researchers contend that patterns of linguistic impairments in ASD are reminiscent of those in SLI (e.g., Kjelgaard and Tager-Flusberg, 2001; Lindgren et al., 2009) and that the two populations may be on a continuum. However, others argue that the similarities in linguistic profiles of the two populations are only superficial (e.g., Bishop, 2003a; Whitehouse et al., 2008). To make the matter more complicated, the literature is not in agreement on whether grammatical morphology, and especially finiteness, is impaired in ASD at all. Early work on grammatical morphology in ASD reports deficits in the use of correct verbal inflection in spontaneous speech (Bartolucci et al., 1980), later confirmed experimentally by Roberts et al. (2004). However, more recent work, focusing on higher functioning children with ASD, reports little if any difficulty in this domain (Eigsti and Bennetto, 2009; Walenski et al., 2014). Crucially, the population with ASD is known for extreme heterogeneity in language and cognitive skills; thus, different patterns may be observed in children who are higher functioning in terms of language and nonverbal reasoning, vs. those who are at the lower end of the spectrum in these domains.

To establish whether there exists a deficit in finiteness in ASD, similar to that observed in SLI, we carried out an experimental study employing materials used to establish finiteness difficulties in SLI, in a large number of children with ASD of heterogeneous abilities, divided into two groups in line with the classifications in the literature (Kjelgaard and Tager-Flusberg, 2001): those with relatively spared language ("autism language normal," ALN) and those with an established language impairment ("autism language impaired," ALI). To presage our results, children with ALN will show evidence that they are outside of the OI stage; in fact, their nonfiniteness levels will be not much higher than their TD controls'. On the other hand, children with ALI will not only show very low finiteness rates and some properties typical of the OI stage, but they will also show some other properties that are quite inconsistent with the OI stage, representing a deviant and disrupted grammar. The most striking example will be the large number of errors with using the wrong tense, a property not found in TD or SLI children. These results will suggest that children with ALI are not simply children with ASD who also have SLI.

# Finiteness in ASD

Considering how well researched finiteness is in TD and in SLI, and especially the current debates of possible links between SLI and ASD (e.g., Tager-Flusberg, 2015), it is surprising how little research there exists on this topic in ASD.

Early studies focusing on grammatical morphology in spontaneous speech in children with ASD report difficulties with both past and present tense; however, results from these studies, which included small samples of children with autism and with heterogeneous language and cognitive abilities, are far from clear. Bartolucci et al. (1980) report that 10 ten-year-old boys with impaired nonverbal IQ (NVIQ) marked present tense correctly 86% of the time, irregular past tense 94% of the time, but regular past tense only 77% of the time. Somewhat higher performance is reported by Howlin (1984) in a sample of 16 autistic 8-year-old boys with normal NVIQ but delayed language: they marked present tense correctly 85% of the time, regular past tense at 84%, and irregular past tense at 97%. Only Bartolucci et al. (1980) included TD controls [matched on nonverbal mental age (MA)]; Howlin (1984) did not.

**Abbreviations:** ALI, autism language impaired; ALI-TD, autism language impaired–typically developing controls; ALN, autism language normal; ALN-TD, autism language normal–typically developing controls; ASD, autism spectrum disorders; ATOM, Agreement Tense Omission Model; CA, chronological age; CELF, Clinical Evaluation of Language Fundamentals; DS, Down syndrome; EOI, extended optional infinitive stage; KBIT, Kaufman Brief Intelligence Test; MA, mental age; MLU, mean length of utterance; n.s., nonsignificant; NVIQ, nonverbal intelligence quotient; OI, optional infinitive; PPVT, Peabody Picture Vocabulary Test; RS, raw score; SLI, specific language impairment; SS, standard score; TD, typical development or typically developing; TEGI, Rice/Wexler Test of Early Grammatical Impairment; TROG, Test of Reception of Grammar; WS, Williams syndrome.

<sup>1</sup>The terms "tense marking" and "finiteness" will be used interchangeably throughout the paper to refer to the phenomenon that verbs in most main clauses in adult sentences must be marked for tense, which makes them finite.

More recent studies used methods more akin to those used in SLI research, such as TEGI-type tasks which employ constrained elicited production of present and past tenses, rather than spontaneous speech. Botting and Conti-Ramsden (2003) compared past tense in 29 ten-year-olds with SLI and 13 age-matched children with ASD with borderline or normal NVIQ but heterogeneous language skills. An equally poor performance on past tense was reported in both populations; however, no details of differences on regular vs. irregular verbs were given, making finiteness rates impossible to determine.

In the only study that divides children with ASD into subgroups according to impaired or unimpaired language, as measured by vocabulary skills, Roberts et al. (2004) report high rates of tense marking in children with normal language and normal NVIQ (n = 27, 86% correct on composite past and 81% on present tense), somewhat worse performance in the group of children with "borderline" language scores (n = 16, 86% correct on past and 69% on present tense), and the worst performance in the group with impaired language abilities and borderline NVIQ (n = 19, 68% correct on past and 65% correct on present tense). This study did not have any control participants.

Studies using different methodologies still show subtle differences in the mastery of finiteness in ASD compared to control children. Eigsti and Bennetto (2009), using a grammaticality judgment task, report a relatively good performance in 21 high-functioning children with ASD aged 9–17 years (with high VIQ, NVIQ, and vocabulary scores). However, their performance on present and past tense was still worse compared to TD controls matched on a range of variables, such as age, nonverbal and verbal IQ, vocabulary, gender, and socioeconomic status.

The only study to show an age-appropriate performance by children with ASD is Walenski et al. (2014). Twenty ten-year-old high-functioning boys with ASD, with normal nonverbal and verbal IQ and reading levels, showed performance of 96% correct for regular past tense and 64% overall for irregular past tense form (with 23% over-regularization rates), which was comparable to their age, IQ, and reading level-matched TD controls.

Taken together, these findings demonstrate a wide range of finiteness abilities for children with ASD. A clear trend is that children with ASD show finiteness performance below their chronological age level, just like children with SLI. Compared to TD controls (matched at least on NVIQ for all studies that used them), children with ASD are usually worse on tense marking, but this largely depends on whether ASD participants have higher or lower NVIQ levels.

# The Present Study

A major aim of the present study is to compare the production of finiteness in children with ASD to that of TD children functioning at a similar nonverbal MA level. This approach allows us to characterize precisely the severity of the deficit in finiteness found in children with ASD relative to TD peers. Furthermore, we aim to infer whether the language system underlying finiteness in ASD is intact, delayed (showing patterns similar to those of younger matched TD controls), deviant (showing patterns not found in TD at all) or disrupted (showing worse performance than the youngest TD controls, that is, a severe delay, suggesting an "asynchrony" in development, as is the case for children with SLI) (Rice, 2007:416). Thirdly, we aim to distinguish our results depending on whether the children with ASD are classified on the basis of their general language skills into those who have normal language (ALN) or those with impaired language (ALI).

In our study, the children with ALI have not only a deficit in general language ability but also nonverbal IQ deficits. The question we will want to ask is whether any delay that these children show on finiteness is due solely to their lower IQ and their lower level of general language abilities, or whether it goes beyond these. Our hypotheses are illustrated in (3).


Each of these four possibilities is a potential hypothesis. Of course, given the literature survey that we have just presented, we do not expect that children with ALI will show profile (3a), an intact pattern. So we can take profiles (3b, c, d) (deviant, delayed, disrupted) as hypotheses to test. We will not select one of these as our only hypothesis here; rather, the goal is to carry out a study that allows us to decide between these.

Although our study does not contain children with SLI, we can compare rates of finiteness with children with SLI from the literature. If our children with ALI show similar rates of finiteness as children with SLI functioning at comparable levels of cognitive and language abilities, this would support the idea that children with ALI have both ASD and SLI. Additionally, if children with ALI show a disrupted pattern of finiteness with respect to TD controls, then children with ALI would be like children with SLI with respect to this piece of language. However, if children with ALI show lower rates of finiteness than children with SLI, we can conclude that development of children with ALI is even more disrupted than that of children with SLI, and that there is more to language impairment in ALI than what is found in SLI. Moreover, children with ALI may potentially show a pattern of deviance that children with SLI do not show.

For children with ALN, grammatical deficits are not expected by definition. However, it is still important to compare knowledge of grammar in children with ALN relative to that of TD children to establish whether indeed children with ALN are "language normal" with respect to finiteness. It is always possible that a deficit in finiteness is not picked up by the standardized tests that establish a child as being ALN. We can thus consider the same hypotheses (3) for ALN.

Our choice of method of constrained elicited production, rather than natural production, is motivated by the following considerations. To determine the rate of finiteness, it is necessary to count not only children's production of relevant morphemes, but also omission thereof in obligatory contexts. Proportion of usage of an obligatory morpheme in obligatory contexts is the central measure that has been used in studies of production data concerning morphology since at least Brown (1973). These contexts can be difficult to determine precisely in natural production, especially in a language like English with a relatively impoverished system of verbal morphology. If a child omits verbal inflection, and also omits the subject, it is often impossible to tell whether this is a third person singular null subject with a bare stem (an optional infinitive, e.g., "[he] go"), or a first or second person null subject with the correct form of the verb with null inflection (e.g., "[I] go"). It is even more difficult to determine which tense was intended in a natural production context. These issues could be further compounded by the difficulties children with ASD have with attention and coherent dialogues, among other factors. Therefore, it is important to gather data on rates of finiteness in a controlled context, one in which the intended subject and the intended tense are unambiguously provided by the experimental context. The TEGI elicitation task does exactly this, providing past and present tense contexts with third person singular subjects, allowing for accurate measurement of finiteness rates. Furthermore, since TEGI was used to study language in other impairments (especially SLI), we have the possibility of comparing rates of finiteness in our participants with those in the literature.

As observed in the literature review, one of the challenges in making sense of the data in ASD is the heterogeneity in the verbal and nonverbal abilities in this population. To control for the heterogeneity of our participants' abilities, we follow recent literature and divide our participants into two groups based on their language-related phenotypes: Autism Language Normal (ALN) and Autism Language Impaired (ALI) (e.g., Kjelgaard and Tager-Flusberg, 2001; Roberts et al., 2004; Whitehouse et al., 2008; Perovic et al., 2013b). This yielded two relatively homogeneous ASD groups with respect to their productive and receptive language abilities, as well as nonverbal reasoning abilities.

We focus on comparisons of ALI and ALN groups and their matched TD control groups on finite responses on past and present tenses, as well as a recalculation based on all response types (percent correct vs. bare form vs. other responses), in obligatory contexts. We analyze participants' responses for other morphosyntactic aspects of the OI stage. We further evaluate our participants' finite responses via the criterion scores developed by Rice and Wexler (2001) to determine whether a participant scores at or below his or her chronological age.

To establish the influences of general grammar and vocabulary skills, nonverbal IQ and autistic symptomatology, we calculate correlations between children's finiteness levels and their scores on standardized tests of language, nonverbal reasoning, and measures that are used as a gold standard for diagnosis of ASD in the research literature, ADOS and ADI-R (Autism Diagnostic Observation Schedule: Lord et al., 2000; Autism Diagnostic Interview-Revised: Rutter et al., 2003). To our knowledge, this is the first such range of analyses described in the literature specifically for markers of tense. Our goal here is to understand whether deficits with finiteness correlate with autistic symptomatology, or whether they are independent of any autistic traits, and what the results mean for the computational linguistic abilities of these two groups of children with ASD. Finally, we compare our results with those from children with SLI from Rice and Wexler studies, to try to determine whether linguistic deficits in ASD are the same as linguistic deficits in SLI, with respect to early morphosyntax.

# METHODS

# Participants

One hundred and sixty-four children participated in the study: 83 children with ASD (Chronological Age (CA): 4.35–16.3 years; 11 female)<sup>2</sup>, and 81 TD controls (CA: 3.5–17.1; 36 female). Their age and scores on standardized measures of nonverbal and verbal abilities are given in **Table 1**: NVIQ: the Matrices subtest of Kaufman Brief Intelligence Test (KBIT; Kaufman and Kaufman, 1990); expressive vocabulary: the Vocabulary subtest of KBIT (only for children with ASD); receptive vocabulary: the Peabody Picture Vocabulary Test Third Edition (PPVT-3; Dunn and Dunn, 1997); receptive grammar: Test of Reception of Grammar Second Edition (TROG-2; Bishop, 2003b).

This study was approved by the Committee on the Use of Humans as Experimental Subjects at the Massachusetts Institute of Technology. Written parental consent was obtained for all participants.

Fifty-eight participants with ASD were recruited with the help of the Division of Developmental Medicine, Boston Children's Hospital (BCH), Harvard Medical School, for the Simons Simplex Collection of the phenotypic and genetic factors in ASD (Lord et al., 2012). Twenty-five were recruited via parent support groups based in the greater Boston, Massachusetts area. All participants met the clinical criteria for ASD according to DSM-IV (American Psychiatric Association, 2000), which were confirmed for 49 participants by BCH via Autism Diagnostic Interview–Revised and Autism Diagnostic Observation Schedule, scores for which were provided to us<sup>3</sup>.

<sup>2</sup>A further 29 children were recruited, but had to be excluded for various reasons (detailed here and in the footnotes to follow). Of these, 11 participants with ASD were excluded due to their inability to complete the battery.

<sup>3</sup>For 15 participants, confirmed diagnoses were not available due to difficulties with data sharing. On average, our study was performed with ASD participants 1.14 years (SD = 1.06) after the administration of ADOS and ADI-R.


TABLE 1 | Ages and mean scores (standard deviation) and ranges on standardized tests of language and cognition for the four participant groups.

\*p < 0.05, \*\*p < 0.01, \*\*\*p < 0.001.

Measures on which relevant groups were individually matched: KBIT Matrices raw score for ALI and ALI-TD, and ALN and ALN-TD. ALI, Autism Language Impaired; ALN, Autism Language Normal; TD, Typically Developing controls; RS, Raw Score; SS, Standard Score.

<sup>a</sup>Significances for group differences are based on pairwise comparisons (Bonferroni corrected) following a MANOVA with age, raw and standard scores as dependent variables, the four participant groups as the independent variable, and gender as a covariate. There was not a significant effect of gender F(7, 144) = 1.142, p = n.s., but a significant effect of group F(21, 414) = 14.83, p < 0.001; Wilks' lambda = 0.2.

In six of these participants, diagnosis was confirmed by either ADI-R or ADOS, but not both<sup>4</sup>.

In part following Perovic et al. (2013b), we divide our participants into two groups based on their language abilities: Autism Language Normal (ALN, n = 46, 5 female) and Autism Language Impaired (ALI, n = 37, 6 female), according to their scores on the Vocabulary subtest of KBIT, a measure of expressive vocabulary, and measures of receptive vocabulary, PPVT-3, and receptive grammar, TROG-2 (cf. Whitehouse et al., 2008). To balance the one measure of grammar against the two measures of vocabulary, we used the procedure in (4) and (5) to form groups<sup>5</sup>.


scores on both vocabulary tests, but with TROG-2 scores in the impaired range (SS of 69 or below), were also assigned to the ALI group. Finally, one participant who scored below the 10th percentile on both vocabulary tests, but had SS of 85 on TROG-2, joined the ALI group<sup>6</sup> .

TD controls were recruited from Boston and Cambridge, Massachusetts area daycares and afterschool programs, and had no known cognitive or language delays or hearing impairments. They were individually matched to ASD participants on the raw score of KBIT Matrices, forming two groups: TD controls of ALI group (ALI-TD, n = 36, 14 female) and TD controls of ALN group (ALN-TD, n = 45, 22 female)<sup>7</sup>.

**Table 1** shows that the ALN and ALI groups were also matched to their respective control groups on verbal MA (raw scores on PPVT-3), while the ALN group was also matched to their TD control group on receptive grammar (raw scores on TROG-2). It should be noted that the ALI group's raw scores on receptive grammar were age-equivalent to 4;5 in TD, which is much lower than the ALI-TD group's chronological age of 6;0. The ALN and ALI groups were matched on age, but on all other measures the ALN group had significantly higher scores than the ALI group. Finally, the ALN-TD and ALI groups are comparable in their age.

The groups were not gender-matched due to a limited sample size of TD controls. To control for effects of gender, we included this variable as a covariate in our subsequent analyses: no significant effect of gender was found.

<sup>4</sup>Three participants did not meet the criteria of either ADOS or ADI-R and were thus excluded despite their clinical diagnoses of Pervasive Developmental Disorder—Not Otherwise Specified (PDD-NOS).

<sup>5</sup>This classification left us with 14 "borderline" participants who scored at or above the 10th percentile on both vocabulary tests, but below the 10th percentile on TROG-2, with SS of 72–79, and thus could not be classified into either ALN or ALI (cf. Roberts et al., 2004). Their small number, 14, compared to large numbers in the ALN (n = 46) and ALI (n = 37) groups precluded us from forming a separate group; thus, it was decided to exclude these children from the current analysis. Of note, this group's performance on finiteness and other measures was in between that of the ALI and ALN groups.

<sup>6</sup> Seventeen participants with ALN and 19 participants with ALI were included in the sample in Perovic et al. (2013b)'s study of binding.

<sup>7</sup>For one ALI participant and one ALN participant, the KBIT Matrices scores were unavailable, leaving them with no matched TD controls.

Finally, because not all children with ASD had ADOS and ADI-R scores, we did not match the ALI and ALN groups on these measures. It is notable that for the subgroups of ALN (n = 28) and ALI (n = 21) that had these scores, there were significant group differences on ADOS Communication [ALI mean = 4.52, SD = 2.04; ALN mean = 2.04, SD = 1.64; t(47) = 4.73, p < 0.001] and ADOS Social interaction [ALI mean = 8.86, SD = 3.34; ALN mean = 5.25, SD = 1.43; t(47) = 5.14, p < 0.001]. The higher scores here indicate a greater severity of ASD symptoms. There were no other significant differences (after correcting for multiple comparisons) between ASD groups on other ADOS or any of the ADI-R scores.
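The two ADOS comparisons above can be approximately re-derived from the reported summary statistics alone. The sketch below assumes an equal-variance (pooled) independent-samples t-test, which is consistent with the reported 47 degrees of freedom but is not stated explicitly in the text; the function name `pooled_t` is ours.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Student's t for two independent groups, computed from summary statistics.

    Assumes equal variances (pooled estimate); returns (t, degrees of freedom).
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df


# Means, SDs, and subgroup sizes as reported in the text: ALI (n = 21), ALN (n = 28)
t_comm, df_comm = pooled_t(4.52, 2.04, 21, 2.04, 1.64, 28)  # ADOS Communication
t_soc, df_soc = pooled_t(8.86, 3.34, 21, 5.25, 1.43, 28)    # ADOS Social interaction

print(f"Communication: t({df_comm}) = {t_comm:.2f}")
print(f"Social interaction: t({df_soc}) = {t_soc:.2f}")
```

Because the inputs are rounded to two decimals, the recomputed values can differ from the reported t statistics in the second decimal place.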

# Experimental Materials and Procedure

Three picture probes of the Test of Early Grammatical Impairment (TEGI, Rice and Wexler, 2001) were used: phonological probe, present tense, and past tense.

The phonological probe determined whether children could pronounce the consonant sounds relevant to present and past tense inflections, /s/, /z/, /t/, and /d/. Participants had to correctly articulate at least four of five words in order to pass. If participants could not provide the required word on their own, they were asked to repeat it after the experimenter.

The present tense probe assessed whether children could produce third person singular inflection using a representative picture for a profession and the following prompt: "This is a teacher. Tell me what a teacher does." If participants replied with a plural subject, e.g., "Teachers teach" or without a subject, e.g., "Teach," they were reprompted to provide an answer with a singular subject. Following the manual, we used such phrases as "Say a whole sentence," or "Start with he or she." If that did not work, the experimenter started the sentence for the participant, saying, e.g., "A teacher..." after which the participant sometimes completed the sentence. Finally, if a child produced an answer which was semantically appropriate but was neither finite nor nonfinite, e.g., a progressive form, the experimenter agreed with the participant, and then prompted him or her with, "Tell me what else a teacher does?" Often, especially with lower functioning or younger participants, these prompts did not produce the desired kind of response, and we simply recorded whatever answers the participants provided. There was one training example and 10 trials.

The past tense probe assessed whether children could produce the -ed suffix on regular verbs, or the irregular past tense form of irregular verbs. The prompt involved two pictures, with one picture showing ongoing action, e.g., "Here the girl is skating." The next picture showed the action completed and was accompanied with the prompt "Here she is done," followed by "Tell me what she did." The same reprompting questions were used by the experimenters in this probe as for the present tense probe, above. Occasionally, children produced a regular verb instead of an irregular verb, and reprompting did not yield the correct verb. There were two training examples, and 10 trials for regular verbs and 8 for irregular verbs.

# Scoring and Analysis

Answers were scored following the instructions in the TEGI Manual (Rice and Wexler, 2001). Correct (finite) answers included appropriate -s or -ed inflections on the verbs or the correct irregular past tense form. Incorrect (nonfinite) answers included bare stems of the verbs. Over-regularized verb forms, e.g., "digged," were scored as correct, i.e., finite. The rate of finiteness was calculated as: finite responses/(finite + nonfinite responses). The rate of over-regularization was calculated as the number of over-regularized forms out of the total number of irregular verbs that were scorable, including nonfinite (bare) forms. A series of univariate ANOVAs was performed for rates of finiteness, following analyses carried out in Rice and Wexler studies. Group differences throughout were identified by Bonferroni-corrected pairwise comparisons, computed in SPSS Version 23.

"Unscorable" answers, ones that we did not purposefully elicit, were those without verbs (nouns only), inappropriate tenses for the prompt, and responses with "does," "did" or "done." Repeated use of "he/she finished" was also marked unscorable (cf. Rice and Wexler, 2001). Verbs with the -ing suffix, whether or not produced with a relevant auxiliary, were also marked as unscorable, following Rice and Wexler (2001). Unscorable answers were not included in the denominator for percent of finite responses.

The distribution by all response types, that is percentage of correct forms vs. bare forms vs. unscorable or unattempted answers, was also analyzed separately, following Roberts et al. (2004). Here, the number of both attempted and unattempted verbs was included in the denominator. In this case, a series of univariate ANOVAs was performed for each tense and response type.
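The two scoring schemes differ only in their denominators, which the following minimal sketch makes concrete. The response labels, function names, and the example child are invented for illustration; they are not part of the TEGI materials or of our data.

```python
from collections import Counter

# Hypothetical coding labels for a single probe
RESPONSE_TYPES = ("finite", "nonfinite", "unscorable", "unattempted")


def finiteness_rate(codes):
    """finite / (finite + nonfinite); unscorable and unattempted items are ignored."""
    counts = Counter(codes)
    scorable = counts["finite"] + counts["nonfinite"]
    if scorable == 0:
        return None  # no scorable responses for this tense
    return counts["finite"] / scorable


def response_distribution(codes):
    """Percent of each response type out of ALL items in the probe."""
    counts = Counter(codes)
    return {label: 100 * counts[label] / len(codes) for label in RESPONSE_TYPES}


# Invented child on a 10-item probe
example = ["finite"] * 6 + ["nonfinite"] * 2 + ["unscorable"] + ["unattempted"]

print(finiteness_rate(example))        # denominator: 8 scorable responses
print(response_distribution(example))  # denominator: all 10 items
```

For this invented child, the same six inflected verbs amount to a finiteness rate of 6/8 but only 6/10 correct in the response-type analysis, which is why the two measures are reported separately.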

Unscorable answers were analyzed separately for those participants that produced them for any tense or other errors. All answers were analyzed for other aspects of morphosyntax, namely subject-verb agreement and case.

We also counted whether there was an overt or a null subject with an inflected or nonfinite (bare) verb. Because the only appropriate answers to our elicitation task contain third person singular subjects and verbs (and also because when the children do produce a subject, it and the verb are overwhelmingly in third person, if the verb is finite), it is highly unlikely that participants intended first or second person subjects as their null subjects, so we count bare stem verbs without subjects as being nonfinite. Responses in which subjects were prompted by the experimenter were excluded from these counts. Only responses in which it was clear that the participants produced or omitted a subject by themselves (without an extra prompt) were counted. Correct responses consisted of an overt third person singular nominative pronoun subject (or a noun phrase) and a finite verb. The relationship between presence of subjects and finiteness of verbs was tested using chi-square.

# RESULTS

A few participants from the ALI group did not produce any scorable responses for certain tenses, and were thus excluded from analyses for that tense: two were excluded from analyses for present tense (n = 34), one from regular past tense (n = 35), and three from irregular past tense (n = 33). Their scorable responses for the remaining tenses were included, so that for all past tense there were 36 ALI participants.

For one third of ALI-TD and about half of ALN-TD participants, detailed scores of the probes were not available, just the rates of finiteness for present and all past tenses in percent (included in **Table 2** and counted for Criterion scores in **Table 14**). There were no significant differences in the rates of finiteness for present and overall past tenses between the TD subgroups with vs. without such scores. For this reason, we move forward with the reduced number of TD participants with respect to the details of regular and irregular past tenses (**Table 3**), and reanalysis by response type (**Tables 4**–**6**) (n = 26 for ALI-TD, n = 24 for ALN-TD)<sup>8</sup> .

# Phonological Probe

All ALN and ALN-TD participants passed all 4 subtests of the phonological probe. One ALI participant did not pass one of the four subtests of the probe, the /z/ sound, but passed the other sound relevant to present tense: /s/. He scored 43% on present tense, which is identical to his performance on past tense (43%). One child in the ALI-TD group did not pass the /t/ sound. This child scored 100% correct on past tense, however.

TABLE 2 | Percent finite responses with mean (standard deviation) for present tense and all past tense (regular and irregular combined) probes.


\*\*\*p < 0.001.

# Rates of Finiteness

### Present and Past Tenses

The participants' mean rates of finiteness, that is the mean of individual finite responses divided by the individual's sum of finite and nonfinite (bare) responses, are shown in **Table 2** for present tense and all past tense (average of finite irregular and regular past tense responses). A series of univariate ANOVAs was performed, with participant group (ALI, ALI-TD, ALN, ALN-TD) as the between-subjects factor and percent of finite responses for each tense as the dependent measure. Gender was entered as a covariate. While there was no significant effect of gender for any tense, the effect of group was significant for present tense F(3, 156) = 11.37, p < 0.001 ($\eta_p^2$ = 0.18) and for all past tense F(3, 158) = 13.23, p < 0.001 ($\eta_p^2$ = 0.2). Pairwise comparisons (Bonferroni corrected) indicated that the ALI group performed well below all other groups (all ps < 0.001) on both tenses, while the ALN group performed no differently from either of the TD control groups on the same probes. Differences between present and all past tense performance were not significant in any group.
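As a reading aid, the partial eta squared effect sizes can be recovered from each reported F ratio and its degrees of freedom through the standard identity: partial eta squared = (F × df effect) / (F × df effect + df error). The sketch below applies it to the two group effects above; the function name is ours, and the values printed are recomputations from rounded F ratios rather than software output.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Group effects as reported in the text
present = partial_eta_squared(11.37, 3, 156)   # present tense
all_past = partial_eta_squared(13.23, 3, 158)  # all past tense

print(f"present: {present:.2f}, all past: {all_past:.2f}")
```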

### Regular and Irregular Past Tenses

**Table 3** shows the specifics of participants' finiteness rates for regular and irregular past tense verbs. The finiteness rate for regular past tense was calculated as in the previous section for present and all-past tenses. The rate of morphologically correct irregular past tense forms was derived by dividing the number of such forms by the total number of scorable irregular verbs (including bare forms) for each participant. The rate of over-regularization was similarly calculated as the number of over-regularized forms divided by the total number of scorable irregular verbs. The rate of finite responses for irregular past tense was the sum of the rate of correct irregular form responses and the rate of over-regularized form responses.
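The relationship among the three irregular past tense measures can be sketched as follows. The counts are invented for one hypothetical child on the 8-item irregular probe, and the function and key names are ours.

```python
def irregular_past_rates(correct_irregular, over_regularized, bare):
    """Rates over scorable irregular verbs (correct + over-regularized + bare)."""
    scorable = correct_irregular + over_regularized + bare
    if scorable == 0:
        return None  # no scorable irregular responses
    correct_rate = correct_irregular / scorable
    over_regularized_rate = over_regularized / scorable
    return {
        "correct_form": correct_rate,
        "over_regularized": over_regularized_rate,
        # Over-regularized forms such as "digged" count as finite
        "finite": correct_rate + over_regularized_rate,
    }


# Invented child: 5 correct irregular forms, 1 over-regularization, 2 bare stems
rates = irregular_past_rates(correct_irregular=5, over_regularized=1, bare=2)
print(rates)
```

Under this scheme a child can have a high finite rate for irregular verbs while still having a lower rate of morphologically correct irregular forms, so the two are analyzed separately.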


\*p < 0.05, \*\*p < 0.01, \*\*\*p < 0.001.

<sup>8</sup>When we report the data that go beyond the major planned measure of this paper, that is finiteness rates, we compare ALN and ALI. We do not report results for TD. This is because in moving a laboratory, we lost the original data sheets for the TD participants. We already had their finiteness measures, per participant, but had not yet analyzed the extra measures (like null-subjects) that we calculated. Since we have the ALN measures, which show very little error on many of these responses, we can compare them to ALI.

TABLE 4 | Percent of response types with mean (standard deviation), and sums of responses, for present tense probe.


\*p < 0.05, \*\*\*p < 0.001.



\*p < 0.05, \*\*p < 0.01, \*\*\*p < 0.001.

TABLE 6 | Percent of response types with mean (standard deviation), and sums of responses, for irregular past tense probe.


\*p < 0.05, \*\*p < 0.01, \*\*\*p < 0.001.

The statistical analyses were performed as in the previous section.

We find a significant effect of group for regular past tense finite responses F(3, 125) = 7.63, p < 0.001 ($\eta_p^2$ = 0.16), irregular finite responses F(3, 123) = 6.36, p < 0.001 ($\eta_p^2$ = 0.13), and irregular correct form responses F(3, 123) = 6.18, p = 0.001 ($\eta_p^2$ = 0.13), but not for over-regularized past tense forms. Gender was not significant as a covariate. Pairwise comparisons (Bonferroni corrected) indicated that the ALI group performed well below all other groups (p = 0.001 for ALI-TD, p < 0.001 for ALN, and p = 0.009 for ALN-TD) on regular past tense. On finite responses for irregular past tense, the ALI group was also worse than other groups: ALI-TD (p = 0.025), ALN (p = 0.001), and ALN-TD (p = 0.011). On the correct forms for irregular past tense, however, the ALI group did not differ from the ALI-TD group, but did perform worse than the ALN and the ALN-TD groups (p = 0.005 and p = 0.002, respectively). The ALN group did not differ from either of the TD control groups. No significant differences were observed within groups between regular and irregular finite tense responses.

# Performance by Response Type

We reanalyzed our data by percent of all response types, so that the denominator includes all verbs in the probe, not just those responses that are scorable. Sums of raw numbers are also indicated in **Tables 4**–**6**. A series of univariate ANOVAs was performed for each tense and response type, with participant group (ALI, ALI-TD, ALN, ALN-TD) as the between-subjects factor. Gender was added as a covariate.

There was a significant effect of group for most response types in most tenses. There was no significant effect of gender. Pairwise comparisons (Bonferroni corrected) indicated that the ALI group had a lower percentage of correct responses than the other groups, and higher percentages of bare, unscorable, and unattempted (no response) responses.

For present tense (**Table 4**), the following variables had a significant effect of group: correct F(3, 126) = 14.17, p < 0.001 ($\eta_p^2$ = 0.25, ALI < all others, all ps < 0.001), bare stem F(3, 126) = 3.87, p = 0.011 ($\eta_p^2$ = 0.08, ALI > ALN-TD p = 0.011 only), and unscorable F(3, 126) = 13.65, p < 0.001 ($\eta_p^2$ = 0.25, ALI > all others, all ps < 0.001). Unattempted responses were not significantly different among groups F(3, 126) = 1.85, p = n.s.

For regular past tense (**Table 5**), the following variables had a significant effect of group: correct F(3, 126) = 15.77, p < 0.001 ($\eta_p^2$ = 0.27, ALI < all others, all ps < 0.001), unscorable F(3, 126) = 11.7, p < 0.001 ($\eta_p^2$ = 0.22, ALI > ALN and ALN-TD, p < 0.001, and ALI > ALI-TD, p = 0.001), and unattempted F(3, 126) = 4.82, p = 0.003 [$\eta_p^2$ = 0.10, ALI > ALI-TD (p = 0.038) and ALN (p = 0.004)]. Bare stem responses were not significantly different among groups F(3, 126) = 2.25, p = n.s.

For irregular past tense (**Table 6**), the following variables had a significant effect of group: finite forms F(3, 126) = 15.75, p < 0.001 ($\eta_p^2$ = 0.27, ALI < all others, all ps < 0.001), correct irregular form F(3, 126) = 10.91, p < 0.001 [$\eta_p^2$ = 0.21, ALI < ALI-TD (p = 0.039), ALI < ALN and ALN-TD (p < 0.001)], unscorable F(3, 126) = 8.89, p < 0.001 [$\eta_p^2$ = 0.18, ALI > ALI-TD (p = 0.004), ALI > ALN (p < 0.001), ALI > ALN-TD (p = 0.002)], and unattempted F(3, 126) = 4.56, p = 0.005 [$\eta_p^2$ = 0.1, ALI > ALI-TD (p = 0.042), ALI > ALN (p = 0.005)]. For bare stem and over-regularized form responses, the effect of group was not significant.

# Unscorable Responses: Tense Substitutions, Use of Progressives, and Auxiliary Omissions

In the ALI group, there was a total of 193 "unscorable" answers for 26 participants, a rate of 19.0% given 1,015 total responses across tenses and participants.

The ALN group had just 24 unscorable responses in 16 participants, a rate of 1.9% given 1,261 total responses across tenses and participants.

**Table 7** summarizes participants' unscorable responses for present and past tense probes, and also the correct and bare stem responses for comparison. The total number of unscorable responses characterized in the table for the ALI group is greater than 193 because some participants' responses included multiple types of answers.

### Unscorable Tense Responses in the ALI Group

Overall, the ALI participants hardly ever confused the simple past and simple present tenses, making a total of 10 such errors out of 572 uses of simple tenses across all participants and probes, a rate of 1.7%.

In the present tense probe, the progressive participle -ing was used 45 times, 12.4% of the 363 total responses for that probe. Of these 45, 24 were with present tense auxiliary, and 21 were without auxiliary, a 46.7% rate of auxiliary omission.

TABLE 7 | Number of each type of response, including correct, nonfinite, and "unscorable" responses (with number of participants giving each type of response).


In the past tense probe, present tenses were used 40 times, 6% of the 652 total responses for that probe. The majority of these (21) were in present progressive tense, with 2 participants contributing 14 of these. These two participants produced proper past tense morphology only four times between them. Other present tense responses included simple present, present tense auxiliary with bare verb, and "s/he is (all) done."

A present participle (stem + ing) without an auxiliary occurred an additional 35 times (5.4% of total responses for past tense probe), with two participants contributing 24 of those (these are different participants from the 14 present progressive responses, above). This yields a 62.5% auxiliary omission rate. Participants who omitted auxiliaries were not significantly more likely to produce bare stem verbs.
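The auxiliary omission rates for the ALI group can be retraced from the counts given in the text. The denominator for the past tense probe is not spelled out; the sketch below assumes it is the 35 participles without an auxiliary plus the 21 present progressive responses with an auxiliary, a reading that reproduces the reported 62.5%. The function name is ours.

```python
def omission_rate(omitted, produced):
    """Share of -ing forms lacking an auxiliary, out of all -ing forms."""
    return omitted / (omitted + produced)


# Present tense probe: 45 -ing forms, 24 with an auxiliary and 21 without
present_probe = omission_rate(omitted=21, produced=24)

# Past tense probe (assumed denominator, see above):
# 35 participles without an auxiliary, 21 present progressives with one
past_probe = omission_rate(omitted=35, produced=21)

print(f"present probe: {present_probe:.1%}, past probe: {past_probe:.1%}")
```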

In the past tense probe, there was one case of simple present tense together with past tense over-regularization, "catchesed." Finally, there was one future form, interpreted as expressing future/intention, "she's gonna run," for the picture with a girl tying her shoelaces.

Across the probes, there were three instances (one each from three ALI participants) using the auxiliary "is" and a bare form of a verb, omitting -ing: in present tense, "a girl is ride(ing) her bike," "he's take," giving a rate of 8% of -ing omission for finite progressive tense responses; in the past tense, "the boy is splash" (5%)<sup>9</sup> . There was also "he's clean" in the past tense probe for a picture of a boy having brushed his hair, and this could be either missing -ing or adjective use.

### Unscorable Tense Responses in the ALN Group

Nobody in the ALN group misused past tense in the present tense probe or vice versa: 0 out of 1,130 simple tense responses for all ALN participants in all probes. In addition to those responses detailed in **Table 7**, there was also one instance of negation use in the present without auxiliary, "baby not get hurt." The rate of -ing omission was 0%. The rate of auxiliary omission in present progressive use was 62.5% (5 of 8 responses) across all probes. Participants who omitted auxiliaries were not significantly more likely to produce bare stem verbs.

# Case and Subject-Verb Agreement with Respect to Number and Person

The responses of participants with ALI and ALN were also examined to establish the presence of any difficulties with morphosyntax, specifically with case marking and subject-verb agreement. We found no such errors: for example, no participant used a first or second person pronoun with third person singular verbal inflection in present tense, and none misused case on pronouns.

There were only three instances of first person pronoun in nominative case for the subject, and only two in accusative case for the object. The second person pronoun "you" was used for a subject by only one person in two complex sentences. "You" was regularly used for objects, especially with a picture of a dad or a nurse, "... helps you," a total of 11 times for 8 ASD participants. The determiner "your" was used primarily with a picture of a dentist, e.g., "(...) clean(s) your teeth," a total of 19 times for 18 ALI/ALN participants.

All pronouns that were used were in appropriate cases for their sentence role, with nominative for subjects, accusative for objects, possessive/genitive in relevant constructions.

For third person singular present tense, the pronouns "he," "she," and "it" were always used correctly as subjects: by 17 ALI participants (49 times with finite verbs and 10 times with nonfinite verbs) and by 11 ALN participants (46 times with finite verbs and 3 times with nonfinite verbs). Noun phrases were used as subjects by 10 ALN participants (56 times with finite and 3 times with nonfinite verbs) and by 9 ALI participants (33 times with finite and 8 times with nonfinite verbs). There were no instances of incorrect use.

Because the probes focused on the elicitation of singular subjects, plural subjects were not purposefully elicited. Children with ASD did not make any agreement errors here, with the two overt plural subjects that two participants in the ALN group produced showing correct agreement. One plural subject in the ALI group was also appropriate.

# Null vs. Overt Subjects with Nonfinite and Finite Verbs

The presence of null or overt subjects was calculated in 70% of ALI participants (n = 26) and 41% of ALN participants (n = 19)<sup>10</sup>.

We begin by comparing null vs. overt subjects within and between groups (collapsing across non/finite verbs) (**Table 8**). There were no significant differences between groups for these measures: both groups produced similar overall proportions of null and overt subjects. Within groups, for both ALI and ALN, the difference between the rate of null and overt subjects in past tense was significant: t(23) = 2.98, p = 0.007 for the ALI group, and t(16) = 3.05, p = 0.008 for the ALN group. This difference between past and present tenses is likely due to the different probes. The present tense probe asked e.g., what "a nurse" generically does, and it seems quite felicitous to respond without a subject, whereas the past tense probe asked which specific completed activity a specific person, e.g., "the girl," carried out, and a null subject seems to be not nearly as felicitous.

**Table 9** indicates the relevant sums across participants and the rates of null subjects for each verb type. For present tense, there was a significant difference for correct responses (overt

<sup>9</sup>A reviewer made the interesting suggestion that instead of being instances of -ing omission, the three sentences could be instances of tense marking in the auxiliary. This would be plausible, that is, grammatical for English (though pragmatically odd) if the auxiliary was a finite form of do, as in he does splash. With a form of be, the sentences are ungrammatical. If the child believes that such forms are grammatical, then the sentences are at least as deviant (for English) as the omission of -ing. We take -ing omission to be a descriptive term; we do not believe that our data are capable of determining the grammar of -ing omission. A study that focused on that question would be of interest although the relevant percentages are small in young TD children.

<sup>10</sup>For the remaining participants, no subjects were recorded on the answer sheets, likely because no subjects were produced. However, to be conservative, the responses from these participants were excluded from these counts.

TABLE 8 | Proportions for null and overt subjects in ASD across verb forms.


\*\*p < 0.01.

TABLE 9 | Counts and rates of null subjects for finite and nonfinite verbs in ASD, and likelihood ratios (of having a null subject with a nonfinite verb compared to having a null subject with a finite verb).


subject with finite verb) between the ALI and ALN groups, 43.91% (83/189) vs. 61.08% (102/167) respectively, F(1, 36) = 5.4, p = 0.026 (η<sub>p</sub><sup>2</sup> = 0.13). For past tense, there were two significant differences between the groups. The ALN group gave more correct responses than the ALI group, 77.54% (221/285) vs. 59.40% (177/298), respectively, F(1, 36) = 11.36, p = 0.002 (η<sub>p</sub><sup>2</sup> = 0.24). The ALI group produced significantly more overt subjects with nonfinite verbs than the ALN group, 14.09% (42/298) vs. 0.70% (2/285), respectively, F(1, 36) = 6.96, p = 0.012 (η<sub>p</sub><sup>2</sup> = 0.16). The rate of null subjects produced with either verb form in the past or present tense did not differ between the ALN and ALI groups.

Chi-square tests for independence were used to examine the relationship between null/overt subjects and non/finite verbs within each group. We find significant relationships in the ALI group for present tense, χ<sup>2</sup>(1, N = 189) = 14.51, p < 0.0001, and for past tense, χ<sup>2</sup>(1, N = 298) = 14.15, p = 0.0001. Similarly significant relationships were observed for the ALN group for present tense, χ<sup>2</sup>(1, N = 167) = 15.51, p < 0.0001, and for past tense, χ<sup>2</sup>(1, N = 285) = 96.55, p < 0.0001. Thus, there is a higher rate of null subjects with nonfinite verbs, and a higher rate of overt subjects with finite verbs, for both ALN and ALI groups and both tenses. The trends are consistent with what was previously reported for elicited production in controlled contexts for very young TD children (Schütze and Wexler, 2000). **Table 9** also presents likelihood ratios of having a null subject with a nonfinite verb compared to having a null subject with a finite verb, which are greater than 1 in all cases.
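As a worked check on the chi-square tests above, the 2 × 2 table for the ALN group in the past tense can be reconstructed from counts reported in the text: 285 responses in total, 221 overt subjects with finite verbs, 2 overt subjects with nonfinite verbs, and 27 null subjects among the 29 nonfinite responses, which leaves 35 null subjects with finite verbs. The Python sketch below is illustrative only. It assumes an uncorrected Pearson chi-square, the function name is hypothetical, and the rate ratio at the end is one possible reading of the likelihood ratio rather than a confirmed restatement of the formula behind Table 9.

```python
def pearson_chi_square_2x2(a, b, c, d):
    """Uncorrected Pearson chi-square for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


# ALN group, past tense probe (cell counts reconstructed from the text).
total_responses = 285
null_nonfinite, overt_nonfinite = 27, 2
overt_finite = 221
null_finite = total_responses - overt_finite - null_nonfinite - overt_nonfinite

chi2 = pearson_chi_square_2x2(
    null_nonfinite, overt_nonfinite, null_finite, overt_finite
)

# Assumed definition: null-subject rate with nonfinite verbs divided by
# null-subject rate with finite verbs.
rate_nonfinite = null_nonfinite / (null_nonfinite + overt_nonfinite)
rate_finite = null_finite / (null_finite + overt_finite)
rate_ratio = rate_nonfinite / rate_finite

print(f"chi-square(1, N = {total_responses}) = {chi2:.2f}")
print(f"null-subject rate ratio (nonfinite vs. finite) = {rate_ratio:.2f}")
```

Under these assumptions the statistic comes out at about 96.55, which agrees with the value reported for the ALN group in the past tense, and the rate ratio is well above 1.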

# Correlations between Tense Marking and Chronological Age and Standardized Tests

Pearson Bivariate Correlations<sup>11</sup> for ALN and ALI groups were calculated between response variables indicated in **Tables 2** and **3**, as well as composite tense (mean of finite responses of present, past regular and past irregular tenses) and CA, and SS on NVIQ (KBIT Matrices), KBIT expressive Vocabulary, receptive vocabulary (PPVT-3) and receptive grammar (TROG-2).

In the case of the ALI group, we find significant correlations for different aspects of tense with receptive and productive vocabulary, receptive grammar, as well as NVIQ, but not with CA (**Table 10**).

In the ALN group, on the other hand, finiteness strongly correlates with CA only (**Table 11**).

# Correlations between Tense Marking and ASD Diagnostic Measures

In the ALI group, there were significant negative correlations between knowledge of finiteness, including composite tense, and ADI-R scores on the Current Algorithm on domains of Social Interaction, Verbal and Nonverbal Communication, and Behavior (**Table 12**). ADOS scores on the domains of Communication and Social Interaction correlated significantly with irregular finite past and regular finite past forms respectively (**Table 13**). ADOS Behavior/Interaction scores correlated with regular finite past tense form. Notably, no ADOS scores nor ADI-R Diagnostic Algorithm scores correlated significantly with composite tense.

For the ALN group, there were no significant correlations between ADOS and ADI-R scores and composite tense. Only two other measures of tenses had significant correlations. Correct form of irregular past tense [r(25) = −0.412, p < 0.05] and levels of over-regularized responses [r(25) = 0.423, p < 0.05] significantly correlated with ADOS Communication scores; correct form of irregular past tense also correlated with ADI-R

<sup>11</sup>Note that the significance levels for our correlation analyses (here and in the next section) were not Bonferroni corrected; we will use them only in trying to observe particular patterns that might suggest specific hypotheses and further studies.


TABLE 10 | Correlations for the ALI group between tense marking performance and age and standard scores on KBIT matrices and vocabulary, PPVT-3 and TROG-2.

\*p < 0.05, \*\*p < 0.01, \*\*\*p < 0.001.

TABLE 11 | Correlations for the ALN group between tense marking performance and age and standard scores on KBIT matrices and vocabulary, PPVT-3 and TROG-2.


\*p < 0.05, \*\*p < 0.01, \*\*\*p < 0.001.

TABLE 12 | Correlations between tense marking and ADI-R Current algorithm scores for the ALI group (with range of number of participants for each subtest).


\*p < 0.05, \*\*p < 0.01, \*\*\*p < 0.001.

TABLE 13 | Correlations between tense marking and ADOS measures for the ALI group (with range of number of participants for each subtest).


\*p < 0.05.

Diagnostic Algorithm scores on Social Interaction [r(25) = 0.385, p < 0.05].
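The r(25) values reported for the ALN group can be related to their significance level through the standard conversion of a Pearson correlation to a t statistic, t = r · sqrt(df / (1 − r²)). The Python sketch below is illustrative: the helper name is hypothetical, and the two-tailed critical value for df = 25 at α = 0.05 (about 2.060) is taken from standard t tables rather than from the source.

```python
import math


def r_to_t(r, df):
    """Convert a Pearson correlation r with df degrees of freedom to a t statistic."""
    return r * math.sqrt(df / (1.0 - r ** 2))


# Two-tailed critical t for alpha = 0.05, df = 25 (standard table value; assumption).
T_CRIT_DF25 = 2.060

# Correlations reported in the text for the ALN group.
reported = {
    "irregular past (correct) vs. ADOS Communication": -0.412,
    "over-regularization vs. ADOS Communication": 0.423,
    "irregular past (correct) vs. ADI-R Social Interaction": 0.385,
}

for label, r in reported.items():
    t = r_to_t(r, df=25)
    exceeds = abs(t) > T_CRIT_DF25
    print(f"{label}: r = {r:+.3f}, t(25) = {t:+.2f}, exceeds critical value: {exceeds}")
```

All three correlations clear the uncorrected threshold, though only narrowly for r = 0.385, which fits the caution in the footnote that these analyses were not Bonferroni corrected.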

# Tense Marking Performance According to the TEGI Criterion Scores

While the ALI group is clearly impaired on finiteness, the ALN group was not significantly worse than its controls. Analysis of the TEGI criterion score indicates whether there are any absolute delays in finiteness, by comparing the score of each participant to the cut-off score for that participant's age. Performing at or above criterion score is considered age appropriate. Separate ANOVAs were performed for each criterion: for present tense, for all past tense, and for composite tense including all past and present tenses (**Table 14**). Gender was added as a covariate.

There was a significant effect of group for the present tense criterion, F(3, 156) = 15.87, p < 0.001 (η<sub>p</sub><sup>2</sup> = 0.23), the past tense criterion, F(3, 158) = 20.91, p < 0.001 (η<sub>p</sub><sup>2</sup> = 0.28), and the composite tense criterion, F(3, 159) = 30.97, p < 0.001 (η<sub>p</sub><sup>2</sup> = 0.37). As in other analyses, the effect of gender was not significant. Pairwise comparisons (Bonferroni corrected) indicate that for the present, past, and composite tense criteria, the proportion of ALI participants performing at or above criterion is significantly lower than in all the other groups (all ps < 0.001). In the case of composite tense, fewer ALN participants performed at or above criterion as compared to ALN-TD controls, and this difference approaches significance (ALN < ALN-TD, p = 0.06).

# DISCUSSION

## Overall Discussion

Our investigation of finiteness in ASD, the largest to date to include subgroups of children with ASD classified with regard to the presence or absence of language impairment, has revealed two main results (6).

(6) Main Results:


For the first time in the literature, we observe that the children with ALI perform significantly lower than both age-matched children with ALN and much younger TD controls matched on verbal and nonverbal MA, on all tenses: present, and past regular and irregular. Our youngest control group, the ALI-TD (mean age 6.0 years), shows a 91/92% rate of correct responses (present/past). The ALI group (mean age of 10.6 years) shows a finiteness rate of only 65/68% (present/past). This is not just poor performance, but what one would expect from a very young TD child, at a completely different age level. Furthermore, relative to the composite tense criterion cut-off point, only 22% of participants with ALI perform at or above their chronological age cut-off for finiteness vs. 83% of their TD controls. As such, the ALI group's performance shows a severely delayed finiteness system, which can be called disrupted according to the definition in (3d).

In contrast, our children with ALN performed no differently from their TD controls matched on age, nonverbal and verbal MA, and grammar: the ALN group (mean age 9.5 years) has an 88/93% finiteness rate, which is somewhat (though not significantly) lower than that of the ALN-TD group of the same age (99/97%). As such, the ALN group can be said to have intact finiteness knowledge. However, the criterion analysis for composite tense indicates that only 76% of children with ALN perform at or above their chronological age cut-off for finiteness, which approaches a significant difference from the ALN-TD control group (98%). About 24% of the ALN group does not reach the level of finiteness knowledge indicated by the criterion for their age, showing heterogeneity and variability in the ALN group compared to the TD control group (this will be discussed later in the section on Insights from Correlations Analyses).

Our findings on ALN are in line with those reported in Eigsti and Bennetto (2009), Walenski et al. (2014), and Roberts et al. (2004). Eigsti and Bennetto's sample of 10–16 year-old children with autism, high-functioning both in terms of vocabulary and verbal/nonverbal IQ (and in fact higher-functioning than our ALN sample), showed a consistently high performance on all tense morphemes, on a relatively difficult task such as grammaticality judgment (Eigsti and Bennetto, 2009). The findings are also in line with Walenski et al. (2014) whose sample of 7–13 year-old children with high functioning autism and normal verbal/non-verbal IQ showed performance on regular and irregular past tense production that was comparable to their age-matched TD controls. Furthermore, our findings for children with ALI are comparable with Roberts et al. (2004) whose children with ALI also showed substantial difficulties with tense marking compared to age-matched children with ALN and those with borderline language skills.

In short, children with ALN show age-related heterogeneity, but not a systematic deficit in their finiteness scores, while children with ALI show a severely delayed ["disrupted" in the sense that we introduced in (3d) following Rice, 2007] finiteness growth.

In the following sections, we discuss our results in more detail and consider their contribution to our understanding of language impairment in autism, and the phenomenon of the OI stage observed in TD, ASD, and in SLI.

# Is There an Extremely Extended Optional Infinitive Stage in Children with ALI?

An important question, if the study of language in ASD is to be connected to the more general field of language development, is whether children, who show these problems with finiteness, just have some kind of general linguistic deficit, concerning all of language, or whether they are showing the kind of developmental stages that TD children go through. In this paper, we have some of the means to pursue this question with respect to finiteness.

The natural question to ask is, are our participants with autism in the OI stage, a well-known stage that TD English-speaking children grow out of at about the age of 4? The stage is sometimes misunderstood to indicate only that the child's grammar allows large amounts of nonfiniteness where finiteness is instead required in the language. As we have seen, this tendency is very strongly observed in our ALI group, and at a very late age. However, the definition of the OI stage (Wexler, 1994, 1998, and others) requires more than the difficulty with finiteness. It also requires a corresponding degree of competence in related areas, which are noted in (7).


# Children with ALI Make Few Errors with -ing

First, the OI stage is not a stage in which any morpheme (or even any verbal morpheme) is omitted. In particular, the aspectual morpheme -ing on present participles is rarely omitted in obligatory contexts during the OI stage in TD and SLI children. Brown (1973) reported very few omissions of -ing when finite be was produced in TD natural production data at ages even younger than 3 years. Rice and Wexler (1996) found that in spontaneous speech in 5-year-old children with SLI, the rate of -ing omission was 8%, comparable with that of the 3-year-old TD children from the same study (10%). Thus, when a progressive tense verb form is used by children, a finite form of the auxiliary be is followed by a present participle form of the verb (stem + -ing). In other words, this error does not appear to be a representative TD or SLI error.

Children with ALI sometimes used a present or past progressive tense form. This is inappropriate contextually, as we will discuss later in the section on Children with ALI: Inappropriate Response Patterns. These responses, however, allow us to make certain observations.

First, children with ALI produced 48 instances of a finite form of be (past or present) followed by a verb. Forty-five of these 48 were present participles; they contained -ing. This represents a rate of 93.8% of -ing production in obligatory contexts. This gives a small rate of error of 6.2%, compared to the 66–68% rate of the use of a finiteness marker by the ALI group. It seems fair to say that the ALI group does not omit -ing in obligatory contexts, which is consistent with the OI stage, in which only special types of morphemes related to tense are omitted.

Secondly, sometimes the present participle appears without the auxiliary, a well-known marker of the Optional Infinitive stage where the auxiliary is omitted due to the omission of tense (see Wexler, 1994, 2003, 2004a and, for an empirically adequate theoretical explanation, Schütze, 2004). Of 102 uses of progressives by the ALI group, 46 include an auxiliary, a finiteness rate of about 46%, which is somewhat below the finiteness rate for simple present or simple past tenses. The fact that children with ALI omit auxiliary be is another phenomenon consistent with the OI stage.

# Children with ALI Interpret Semantic Tense Correctly

A hallmark of the OI stage is that, although TD children often use a nonfinite form instead of the required tensed form of the verb, they, nevertheless, know the semantic interpretation of the tense morphemes. They do not use a present tense form for past tense or vice-versa (established experimentally for TD children in the OI stage by Schütze and Wexler, 2000). When the context is such that a present tense response is required, and the ALI group produces a tensed response, the tense of the response is present 193 times, and past 2 times, an error rate of about 1%. When the probe, on the other hand, sets the context such that a past tense is required, and the children with ALI use a tensed response, that response is tensed in past 369 times, and present 8 times, an error rate of about 2%. Clearly, children with ALI understand the semantic interpretation of the tense value of the present and past morphemes in English and also understand how the contexts make a particular tense appropriate. Not only do the children with ALI understand enough about time and language to achieve such high scores, they understand how to map them onto relevant morphemes. This basic piece of competence in the OI stage is fully realized in children with ALI.

# Children with ALI Know Subject-Verb Agreement

In the OI stage, the basic process of subject-verb agreement is known. Although children often omit a tense/agreement marker in English (in particular here, -s or -ed), when they do use a marker, the subject very strongly tends to agree with it. TD and SLI children very rarely, for example, produce a sentence with -s on the verb and a first person subject, e.g., <sup>∗</sup>I goes/we goes (the asterisk indicates that a phrase is not grammatical) (Rice et al., 1995; Harris and Wexler, 1996). Similarly, children in the OI stage in English know the positions of accusative and nominative pronouns; for example, they never place a nominative pronoun in the object position (Schütze and Wexler, 1996).

Our probes were not specifically set up to test agreement and case, being purposely designed to elicit third person singular subjects and verbs. However, as our results revealed above, we can in general conclude that there were no errors that showed any problems with agreement or case. For example, of the 3 instances of subject I, none were followed by a verb with an -s inflection. Conversely, whenever a verb with -s had an audible subject, it always had a third person pronoun or a noun phrase subject. So far as we can tell from the small number of relevant instances, the ALI group showed the relevant properties of the OI stage.

### Children with ALI: Inappropriate Response Patterns

Children in the ALI group produced a number of responses that were wrong in the sense that they did not answer the elicitation question. The most frequent response of this type was the use of a noun as a response (34 instances in both present and past tense probes). It is possible that these errors are due to difficulties in attending to the task: Roberts et al. (2004) also found some unusual errors that they attributed to difficulty in understanding the instructions or following the task. This is unlike children with SLI who always answered the prompt with either a finite or a nonfinite verb (e.g., Rice et al., 1995).

The second type of inappropriate response was the use of present progressive participles (with or without auxiliaries) for both the present tense probe (45 instances, 12.4% of total responses) and the past tense probe (57 instances, 8.7% of total responses). The contextual conditions that necessitate the present and past tense responses are somewhat different. In the present tense probe, the lead-in question used a generic with third person and a profession title, "This is a teacher," followed by the prompt, "Tell me what a teacher does." This should elicit a generic response, "A teacher teaches." However, the child uses a progressive form. It is possible that some children with ALI have difficulties understanding the concept of generic/habitual. Further research will have to determine whether this is true. Roberts et al. (2004:441) observe similar errors in their Impaired group of ASD participants.

In the past tense probe, children with ALI provided 21 instances of the present tense progressive and only one instance of past progressive (and 35 present participles). This simply seems false instead of inappropriate, or non-answering. Perhaps the child did not understand the intention of the elicitation, which was to point out with simple past that the actor finished the activity described in the picture. However, we know that these children make only about 2% of errors of using a simple present instead of a simple past, suggesting they know how to use present and past tenses. Still, the child, when being asked to describe a completed event, instead describes an on-going event. It seems as if the child is simply ignoring the instructions these 57 times, not attempting to answer in a way appropriate to "finished" but, rather, simply describes the picture that he or she sees. Is it a difficulty in paying attention to the whole context? Or is it something simpler: the child, having difficulties, imitates the tense/aspect of the elicitation sentence, which was in present progressive tense. Alternatively, the fact that some children with ALI have tense and aspect errors may mean that they do not take into account that the probe presented a past action picture given the present progressive situation established by the first picture. This may be a deficit with discourse and not with morphology. This does occur 57 times, and it appears to show a difficulty in integrating all the linguistic information.

In order to argue for deviance in children with ALI with respect to morphosyntax, as in (3b), we would have to find evidence that young TD children do not produce inappropriate responses in an elicitation context, unlike children with ALI. In fact, in the case of the use of present progressive tense in contexts eliciting simple present tense, very young TD children (ages 2;5–3;4, mean age 2;11) do make such errors at a rate of 3.6% (Thornton and Rombough, 2015)<sup>12</sup>, which is lower than the rate in our ALI group (12.4%). Given the age of the TD participants in Thornton and Rombough (2015), it is quite clear that our observed rates of use of progressive in a simple tense context are at the minimum a sign of disrupted development in children with ALI: their level of cognitive and language functioning is higher, yet their use of progressives in simple tense contexts is much higher than that of 2–3-year-old TD children.

Does this suggest that the ALI group's inappropriate responses are not deviant? In order to answer this, we have to consider the context of elicitation in Thornton and Rombough's (2015) study, in which, in fact, progressive responses are more felicitous than in our elicitation context. Thus, it is quite plausible that the use of progressive in our contexts by the ALI group is indeed a form of grammatical deviance.

In Thornton and Rombough (2015), TD children were asked to see, for example, if a toy "would fit through the door of a bus" (p. 142). After a couple of affirmative conditions where the toy indeed fit in (in which the experimenter confirmed the observation by an utterance in third person singular simple present tense), the next toy(s) did not fit in, eliciting negation with simple present tense. Most answers were in the adult form using doesn't (∼40%), followed by such "nontarget" forms of third person singular -s as "It's not fit"<sup>13</sup> or "It not fits." Two children in particular (out of 25), ages 2;8 and 3;0, account for most of the group's responses in present progressive, using "It's not working," but notably not <sup>∗</sup> "It's not fitting" (Thornton and Rombough, 2015:153). In particular, one of these children produced many bare stem verbs in affirmative conditions, suggesting that she is firmly in the OI stage.

In order to pragmatically justify an answer like "It's not working," one needs to do only a tiny bit of accommodation; the child has already accommodated by using work instead of

<sup>12</sup>We are grateful to a reviewer for this clarification of disruption and deviance, as well as for pointing out Thornton and Rombough's work to us.

<sup>13</sup>Utterances in this form, with a be auxiliary and a bare verb, can be considered to show -ing omission. The rate of these phrases in Thornton and Rombough (2015) is very small, 3.2% of total responses, and is consistent with prior observations for -ing omission.

fit, describing the attempt/goal rather than the action of fit. Since some time element is involved in checking out the action, one only has to think about what is on-going. Our judgment is that the progressive is an almost (perhaps wholly) felicitous answer. This is quite different from our elicitation, in which we ask (in the present tense context), "This is a teacher. Tell me what a teacher does." There is no felicitous response in this context that uses a progressive. We ask for a generic or habitual answer and receive an activity response. There might be contexts in which an activity response could somehow be accommodated, though we have not thought of one. In this context, an activity response is simply not possible. This use of progressive is a semantic error, not a pragmatic choice. We have suggested that it might be caused by different impairments, not necessarily the lack of understanding of generic or habitual semantics and their relation to syntactic form (though that is possible). In short, an 8.7–12.4% progressive response in our context in the ALI group seems to be clearly a quite deviant answer, in the way that the 3.6% of TD progressive response is not in Thornton and Rombough (2015).

In summary, the above discussion of the relevant OI properties in responses of children with ALI points to an extremely delayed OI stage, in fact "disrupted" in the sense we introduced earlier (see 3d). This is supported not only by their pattern of errors (the nontensed verbs where tensed ones are required, including omissions of auxiliary be), but also by their competencies, as described above. Nonetheless, the fact that their responses often involved inappropriate answers, especially with misuse of progressive in generic/habitual contexts, may indicate non-linguistic causes of the errors, and, in fact, deviance from TD (as in 3b). We cannot argue with confidence that the children with ALI are in a pure OI stage. Rather, their performance is consistent with being in the OI stage, but crucially with additional disabilities relevant to morphosyntax.

# Is There an Extended Optional Infinitive Stage in Children with ALN?

Our results suggest that finiteness is not a serious problem for children with ALN. Their rates of finiteness are significantly higher than for the children with ALI, and not significantly different from the rates for ALN-TD controls. This lack of a serious deficit in finiteness indicates that the children with ALN are not in the OI stage at their chronological age.

Given that rates of finiteness are so high in the ALN group, we would not expect errors that are characteristic of the OI stage. For reasons of completeness, we note that the ALN group made no errors of interpretation on past and present tense, never omitted -ing when required after an auxiliary, made no subject-verb agreement errors or case errors (although, like the children with ALI, they had limited situations in which the latter errors could occur). Of eight inappropriate uses of the progressive tense by children with ALN, five include omission of the auxiliary. This omission would be consistent with the OI stage, but this number is too small to be interpreted in any meaningful way. The lack of -ing errors is consistent with Eigsti and Bennetto's (2009) findings that children with high functioning autism easily recognize the omission of -ing and with Tovar et al. (2015) who find, using intermodal preferential looking methodology, that 4-year-old children with ASD, functioning in the borderline range, show some comprehension of the difference between progressive and simple tenses. The use of progressives instead of simple tense in our ALN group was extremely low, indicating no particular difficulty in integrating information from a few sentences to achieve the correct response given the context. While the ALI group responded inappropriately 34 times with a noun, there was only one such instance for the ALN group.

We conclude that the ALN group is not in the Extended OI stage and has the grammatical capacity that goes beyond that stage, on par with their ALN-TD controls. The ALN group's competent performance on our finiteness tasks indicates no overlap with SLI whatsoever. A future study should investigate finiteness in very young children with ALN, who would be expected to be in the OI stage in virtue of their young age.

# Optional Infinitives and Null Subjects in ALI and ALN

A well-known phenomenon in TD children (Hyams, 1986, and many papers since) is the tendency of young children to omit subjects of sentences. By now there has emerged reliable evidence concerning some of the major properties of these null-subjects, and how they relate to the OI stage.

First, there is a much larger tendency to omit the subject if the child produces a nonfinite (untensed) verb. Wexler (e.g., Wexler, 1994; Bromberg and Wexler, 1995:222) argued that this was because verbs without tense can license null-subjects in the adult language, and the child was simply in agreement with this grammatical fact. The Agreement Tense Omission Model (ATOM) of Schütze and Wexler (1996) allows tense to be omitted from the structure, thus also allowing a null-subject for untensed verbs. Once the child produces a nonfinite verb, a null-subject is grammatically appropriate.

Second, in young TD children there are still many instances of null-subjects of finite verbs, a result that cannot be explained by resorting to grammatical possibilities when the verb is nonfinite. We will discuss why this possibility exists after discussing the results concerning null-subjects in ASD.

As **Table 8** shows, both ALI and ALN groups produced large numbers of null-subject utterances, in both present and past contexts. We will return to the question of why there is no significant difference between children with ALI and ALN in proportion of utterances with null-subjects.

Let us go into more detail starting with the ALI group. **Table 9** shows that the ALI group produces a greater proportion of null-subjects when the verb is untensed than when it is tensed, for both tenses. This pattern is exactly what is found in null-subject production during the OI stage in TD children, and is well understood. The pattern provides further indication that the children with ALI are in the OI stage of grammatical development, and that their responses are based to some extent on their grammatical knowledge. Nonfinite verbs license null-subjects grammatically in adult and child language.

Moreover, the pattern provides evidence concerning the possibility that the children with ALI are omitting subjects because they have memory limitations: it may be difficult for children to produce a long sentence that includes a subject. This is an old idea in non-grammatical approaches to child null-subjects that are not grammatical in the adult language. Suppose, as e.g., Bloom (1990) argues, that the reason children drop subjects is because of limited working memory (shown to be incorrect by Hyams and Wexler, 1993 for TD children). The idea is that a longer verb phrase leads to more null subjects. The expectation then is that children will omit more subjects with finite verbs, since those verbs are longer (bare stem + inflection morpheme). In fact, we find the opposite result. This leads us to believe that the "more null subjects with nonfinite verbs" pattern, which holds of both our ASD groups and TD children in the literature, is induced by the grammatical properties of the underlying language development mechanism, as is generally accepted in TD. Even the ALI group is seen to have a grammatical system that is the cause of much behavior, even when the system is quite immature.

The children with ALN also produce particularly large rates of null-subjects with nonfinite verbs, reaching 93% for past tense and 73% for present tense probes. Possibly these large proportions are a consequence of the grammatical possibility of null-subjects with nonfinite verbs (there are relatively few of these for the ALN group, as we have seen).

It is also possible that some of these responses are due to the possibility of potential, almost grammatical, replies in our experiment. The past tense probe showed a picture in which e.g., a girl is skating, then one in which the action was completed, and the child was told, "Here she is done. Tell me what she did." One can imagine an almost grammatical answer, "skate." One possibility is that the answer is a reduced form of "What she did was skate," with everything but skate elided, given its recoverability. The answer cannot be <sup>∗</sup> she skate; it must have a null-subject. The children with ALN are only taking advantage of this nonfinite possibility 29 times (vs. 256 finite responses), but when they do, 27 of the responses have null-subjects. If this explanation is on the right track, we can see that the ALN group's responses in the past tense are again strongly consistent with being out of the OI stage. The children with ALI, however, produce fewer null-subjects in the past tense than in the present tense. It might very well be that they are not particularly taking advantage of the reduced nonfinite response, but are simply trying to indicate a simple subject and verb and putting the verb in the untensed form as part of the OI stage.

In case of the present tense probe, the elicitation says e.g., "This is a teacher. Tell me what a teacher does." It also seems possible that there is a potential response like, "What a teacher does is teach," reduced to teach, again an untensed form of the verb. The ALN group produces 73% of its nonfinite verbs with a null-subject, possibly in accordance with this possibility. The ALI group produces 68% null-subjects in this case, probably again indicating a null-subject licensed by a nonfinite verb.

We also have to allow for the possibility that the greater proportion of null-subjects is due to the fact that our elicitation provided a strong common ground (topic) and a question about what the common ground does/did, which allows for one way of answering that elides the common ground/topic and uses a nonfinite verb. Determining with more certainty why the two groups are producing the greater proportion of null-subjects with nonfinite verbs (whether it is due to the general licensing of null-subjects with nonfinite verbs, or whether it is due to the strategy that we have indicated that works for this particular elicitation) requires further research. The elicitation task of TEGI does not allow us to disambiguate between these two possibilities, which, as a reviewer suggests, may underestimate children's grammatical knowledge. The results on null-subjects that we have obtained, however, do argue that the responses of both the ALI and ALN groups are guided by the grammatical structures that they have, rather than by any kind of simple memory limitation.

We are left with the issue of null-subjects of tensed verbs, a much-discussed issue in TD. We will adopt the model in Wexler (2013), in which sentences in which both the subject and the predicate are discourse-old are grammatically Tense Phrases (TP's) rather than Complementizer Phrases (CP's) as argued by Mikkelsen (2015) for Danish. Thus, a subject in such a sentence is the specifier of a root, which may be omitted (Rizzi, 2006) because there is no higher projection that allows its spell-out. Wexler (2013) argued that young children often take sentences to be discourse-old even when they are not, thus taking structures to be TP's too often, leading to subjects that are specifiers of a root (which may be omitted), leading to null-subjects of finite verbs. In simple terms, the ultimate explanation for children's use of null-subjects with finite verbs is their immature knowledge of information structure. Once this plays a role, the child's syntactic system will induce the possibility of a null-subject. Thus the combination of an immature knowledge of information structure and a more mature grammatical system will lead to the possibility of null-subjects (Wexler, 2013).

The null-subjects of finite verbs in both the ALI and ALN groups may follow from this lack of knowledge of information structure. It might very well be that the kind of defining issues for ASD, e.g., issues related to Theory of Mind, may be enough to cause the relevant difficulties with information structure (which is an interface module, relating syntax to pragmatic/discourse conditions) in both the ALI and ALN groups, although their ages would not be consistent with this difficulty in TD. The null-subjects of finite verbs at this late age (∼9–10 years) may very well be a sign of autism, whether grammatically impaired (ALI) or grammatically not impaired (ALN). The model of autism that we are working with, together with our model of null-subjects and of grammatical development more generally, predicts this particular difficulty for both groups of children with ASD. Further research could be directed toward investigating the consequences of these considerations and toward a more focused attempt to study null-subjects with finite verbs.

To compare, 4-year-olds with SLI, who used 33% nonfinite verbs in their spontaneous production, only showed 16% null subjects with nonfinite verbs, and 2% null subjects with finite verbs; TD children aged 4 and higher showed no null subjects (Schaeffer et al., 2002). The SLI rates are much lower than either of our much older ASD groups, suggesting that information structure is not impaired in SLI. This is one more piece of evidence that ALI is not ASD + SLI, which will be discussed in more detail below.

To recap, the fact that both ASD groups differentiate between null-subjects-with-finite-verbs and null-subjects-with-nonfinite-verbs suggests, once again, that children with ASD are not simply omitting surface morphemes or words, but are actually producing different linguistic derivations for nonfinite vs. finite verbs, showing a somewhat functioning language system. On one explanation, children with ASD seem to be exhibiting more difficulties with the knowledge of information structure, in particular with the determination of whether or not subjects and predicates are discourse-old.

# Insights from Correlations Analyses

## Finiteness, Chronological Age, and Standardized Measures of Language and IQ

What other factors, linguistic and non-linguistic, influence the acquisition of finiteness? What can we discover about the relationship of grammar and other cognitive abilities by focusing on the acquisition of different aspects of finiteness by typically and atypically developing populations?

According to Wexler (1996, among many others), finiteness in TD and SLI children grows over time according to some internally set maturational schedule, which may not be directly related to other cognitive abilities. In Rice et al. (1998), the best predictor of finiteness growth in TD and SLI was age. Receptive vocabulary (PPVT), nonverbal reasoning abilities and mother's education were not significant predictors. This is quite telling, considering the well-known finding that a child's vocabulary is predicted by mother's education and is a measure of environmental input (cf. Rice et al., 1998:1418).

Our correlations analyses provide evidence that the ALI group does not have a language system that functions akin to that of the TD children and the children with SLI of Rice et al. (1998): finiteness in our ALI group is not dependent on age. In fact, overall language abilities (expressive and receptive vocabulary, receptive grammar) and NVIQ strongly correlated with finiteness deficits in the ALI group. This is partly in line with the results of Roberts et al. (2004) who found that past tense performance of their ASD group correlates with age, verbal and nonverbal IQ, and receptive vocabulary scores. Furthermore, we find the same difference as Roberts et al. for present tense: all measures except NVIQ seem to play a role. Our findings also agree with Eigsti and Bennetto (2009) who found significant or near significant correlations between their high-functioning group's scores on grammaticality judgements and expressive vocabulary, verbal and nonverbal IQ. On the other hand, our results for ALI contrast with those of Botting and Conti-Ramsden (2003) who found that NVIQ does not correlate with past tense knowledge in ASD. In this expanded respect (not related to grammatical constructions, but to developmental pattern), knowledge of finiteness, or at least the mechanisms underlying it, is deviant in children with ALI compared to SLI and TD children<sup>14</sup>.

In the ALN group, on the other hand, finiteness strongly correlates with chronological age only, and, as in TD and SLI children from Rice et al. (1998), standardized test scores rarely correlated with tense, indicating a functioning maturing language system with respect to finiteness. Evidently, some younger children with ALN are showing weaker finiteness knowledge than older children with ALN, thereby introducing some heterogeneity in the ALN group as evidenced by the criterion analysis (see the section Overall Discussion). Thus TD, SLI, and ALN pattern together in showing age as a causative factor in development, whereas ALI stands apart, a kind of (non-constructional) deviance.

The finding of a lack of correlation for the ALI group between a child's age and composite tense score is striking, but should be taken with some caution. Perhaps some other variable affected whether a child with ALI gets into the study, a variable that correlates with age. A longitudinal study, in the manner of Rice et al. (1998), could shed more light on the issue of whether children with ALI improve their scores on composite tense as they age.

## Finiteness and ADOS and ADI-R

Before discussing our results in this section, it is necessary to look into the similarities and the differences between ADOS and ADI-R tests, which are complementary measures of the ASD symptomatology.

ADI-R is a structured interview of a parent or a caregiver, with questions focusing on a child's current behaviors (Current Algorithm) as well as behaviors observed at the most abnormal stage of the child's development so far, usually 4–5 years old (Diagnostic Algorithm). ADI-R assesses abnormalities in the domains of social interaction, communication, and behavior. The measure notes whether a child is verbally fluent (able to produce phrases of three or more words).

ADOS is a structured series of activities and interactions between an evaluator and a child, providing a snapshot of the child's behavior at the time of testing. ADOS measures a similar range of social and communicative behaviors to ADI-R. The test has different modules depending on whether a child is verbally fluent (sentences with multiple clauses) or not (just three-word phrases).

Neither ADOS nor ADI-R directly addresses any specific grammar skills.

Our correlations between finiteness rates and scores from ADOS and ADI-R measures indicate distinct profiles for ALN and ALI groups. In children with ALI, finiteness, especially composite tense, is strongly associated with scores from all ADI-R Current Algorithm domains. Correlations with ADOS measures were less robust, and nonexistent for composite tense. Furthermore, there were no correlations of tense with any of the ADI-R Diagnostic Algorithm scores. The latter observation suggests that estimation of early dis/abilities does not correlate with finiteness dis/abilities at a later age.

The ALN group, on the other hand, showed only a few associations with ADOS Communication and with ADI-R Diagnostic Algorithm Social Interaction scores. Composite tense did not associate with any of the tests for ASD. These findings are in part comparable to Lindgren et al. (2009), who found that ALN and ALI groups' total language scores on measures of morphology, syntax, semantics and verbal memory from a standardized test, CELF-3, did not correlate with ADOS and ADI-R scores. In contrast, Eigsti and Bennetto (2009) found significant correlations between performance on a grammaticality judgment task of their high-functioning group (who are likely similar to our ALN group) and ADOS measures of Communication and Social Interaction (but not Repetitive Behaviors).

<sup>14</sup>We are grateful to a reviewer for clarifying this aspect of deviance.

If ADOS and ADI-R are largely measuring the same aspects of ASD symptomatology, why do we find these differences in correlations in ALI? Lack of correlations with ADOS could be explained by the fact that we tested our participants on average a year after they were tested on ADOS and ADI-R. This explanation, however, cannot account for our correlations of the ADI-R Current Algorithm scores and finiteness, which ought to show the same differences in behavior with time. Therefore, an explanation may stem from the differences between ADOS and ADI-R measures. Could it be that parental observations of current daily behavior are in some sense more relevant to the knowledge of finiteness than a clinical interactive observation that lasts an hour or so? It is unlikely that parents estimate their children's verbal fluency by awareness of whether their children produce non/finite verbs. Rather, it may be possible that finiteness is a precursor to overall fluency which parents are sensitive to.

Putting our correlations results together, it seems that in children with ALI, their low overall verbal and nonverbal IQ and their receptive language abilities, as well as the severity of their current symptoms of autism, correlate strongly with their rates of finiteness, which is very different from what we observe in our ALN sample, and in children with SLI and their TD controls studied by Rice et al. (1998) for whom it is primarily chronological age that correlates with finiteness. As such, we can say that the mechanisms underlying grammar abilities are different in children with ALI and with ALN.

# Is ALI the Same as ASD + SLI?

Here, we compare our results on finiteness in our children with ALI with those of children with SLI and younger TD children from Rice et al. (1998). The notable difference, of course, is that their SLI group was not impaired on NVIQ (following the selection criteria for SLI) whereas our ALI group was. The standard scores of our ALI group and their 5-year-old SLI group (SLI-5) are comparable on receptive vocabulary (though our ALI group is on average older than the SLI group). However, on measures of receptive grammar, our ALI group's standard scores are substantially lower than those of SLI-5.

In terms of performance on rates of finiteness, our ALI group (aged 10.6) is much better than SLI-5 (twice as high, in fact). Our ALI group is most comparable to Rice et al.'s participants with SLI at ages 6.0 or 6.5, and is lower than their TD group at age 3.5 (but better than the 3-year-olds from the same study). In our participants, there are much greater standard deviations, suggesting a greater variability in ASD than in SLI.

Although children with ALI and SLI may show some similar levels of finiteness, albeit at different ages and levels of general language and cognitive abilities, the overall differences between groups are very great, and thus we hesitate to state that there are similarities between ASD and SLI. Furthermore, we described some kinds of errors that the children with ALI make that the children with SLI are not known to make (see the section Children with ALI: Inappropriate Response Patterns). The children with SLI are in an extended OI stage, showing the same morphosyntactic deficits and competencies as found in young TD children; the children with ALI cannot be said to be in a pure extended OI stage because they show evidence for some patterns that are not found in the OI stage.

Furthermore, it is conceptually unclear what is meant by the formula ALI = ASD + SLI. Since all researchers are ultimately interested in the etiology (including genetics) of these syndromes, the simplest assumption would be that the syndrome ASD (having no grammatical deficit by itself) is sometimes independently inherited with the syndrome SLI. Such a proposal makes grammatical deficits simply not intrinsic in any way to ASD, with grammatical language impairment in ASD being inherited by chance. Let us call this the Independent Inheritance proposal.

Given that the rate of SLI in children is about 7% (Tomblin et al., 1997), if ASD and SLI are independently inherited, there should be a rate of 7% of grammatical language impairment in all of ASD. We are unaware of epidemiological studies that measure the relative rates of ALI and ALN. Our data can give us a measure, thus allowing us to test whether this prediction is true. We selected our 83 ASD participants without any regard as to whether they were grammatically impaired or not, and tested to categorize them as 46 ALN and 37 ALI participants. The numbers of children with ALI are around 45% of our total ASD sample. This is comparable to other studies of ALI: e.g., the study by Roberts et al. (2004) had 19 children with ALI out of 62 participants (just under a third of all the participants), while Kjelgaard and Tager-Flusberg (2001) had 50 out of 82 (just under two thirds of all participants)<sup>15</sup>. These rates of ALI are much greater than the expected 7% from the Independent Inheritance assumption. Thus, we argue that biologically, ALI is not an independent chance merger of ASD and SLI in the same child; that is, ALI is not ASD co-morbid with SLI (contrary to, e.g., Tager-Flusberg, 2015)<sup>16</sup>.
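To make the size of this discrepancy concrete, the comparison can be sketched as a simple binomial calculation. This is an illustrative check only, not an analysis reported in the study: it assumes that each of the 83 participants independently has a 7% chance of grammatical impairment under the Independent Inheritance proposal, and the function and variable names are hypothetical.

```python
from math import comb

def binomial_upper_tail(k, n, p):
    """Probability of observing k or more 'successes' in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

n_asd = 83            # ASD participants in the present sample
n_ali = 37            # of whom categorized as ALI
sli_base_rate = 0.07  # SLI prevalence cited from Tomblin et al. (1997)

observed_rate = n_ali / n_asd            # proportion of ALI in the sample
expected_count = sli_base_rate * n_asd   # ALI count expected under independence
tail = binomial_upper_tail(n_ali, n_asd, sli_base_rate)

print(f"observed ALI rate: {observed_rate:.1%}")
print(f"expected ALI count at 7%: {expected_count:.1f}")
print(f"P(at least {n_ali} of {n_asd} at 7%): {tail:.2e}")
```

Under these assumptions, roughly six children with ALI would be expected rather than 37, and the probability of a count this high arising by chance is vanishingly small. The sketch does not bear on the weaker formulation in which the two conditions tend to be inherited together.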

It is likely that the disorder of ALI (unlike high-functioning autism or ALN) itself causes a range of deficits in the development of different aspects of language, just as other disorders, such as Down syndrome (DS) and Williams syndrome (WS), do. There are examples in the literature for such aspects of complex language as binding dependencies (Perovic et al., 2013a,b, for ASD; Perovic and Wexler, 2007, for WS; Perovic, 2006, for DS) or passive constructions (for ASD: Perovic et al., 2007, for English, and Durrleman et al., 2016, for French; Perovic and Wexler, 2010, for WS; Ring and Clahsen, 2005a, for DS). The fact that, e.g., omission of verbal inflection in ASD is showing similar patterns across disorders (also for DS: Ring and Clahsen, 2005b; and WS: Peregrine et al., 2006), as well as some similarities to TD (though with some notable differences), is simply an indication that the starting point of language acquisition, the innate genetically-guided language learning system, is the same in all disorders and TD, but is affected differently by the respective disorders. We have argued in various publications that neurodevelopmental impairments seem to allow grammar to develop up to a certain point in a maturationally (biologically) determined way, such that in a particular impairment, the child reaches only a certain level (e.g., Perovic et al., 2013a,b, for ASD; Rice et al., 2009a, for SLI). This was observed in other domains as well: Landau (2012:83) suggests that for spatial representation, "People with WS appear to hit the functional level of a 4- or 6-year-old normal child, but do not grow further." Thus, we would expect finiteness, a biologically determined slow development, to be subject to impairment, as suggested in Wexler (1996). The question of how equivalent ALI and SLI children are grammatically in general will depend on investigation of more complex grammatical structures, an investigation that is under way, but that we will not discuss here.

<sup>15</sup>It is important to also note that few studies investigate linguistic abilities in large numbers of children with ALI although there are regular references to ALI in the literature (e.g., Bishop et al., 2016).

<sup>16</sup>A weaker formulation of ALI = ASD + SLI is possible. It might be proposed that (for some reason) the genes underlying ASD and SLI have a strong tendency to be inherited together, so that the chance of the co-occurrence of inheriting SLI if a child has ASD is much larger than the chance of inheriting SLI if a child does not have ASD. The statistical argument above does not count against such a formulation. Questions of grammatical deviance and rates of finiteness, however, are still relevant, counting against the hypothesis.

# CONCLUSIONS

Our extensive study of finiteness and morphosyntax in two large groups of children with autism and their matched TD controls shows different morphosyntactic abilities relative to the presence or absence of a general language impairment.

Our ALI group shows extensive deficits with finiteness, which are not only large quantitatively, but are also not construction specific, appearing in simple present and past tenses, as well as with auxiliary omission. These difficulties in children with ALI, along with their morphosyntactic competence, are similar to what is observed in very young TD children (much younger than the ALI-TD controls in our study) and indicate disrupted development. The maturational mechanisms underlying the knowledge of finiteness, however, are likely different between those with ALI and those with TD or SLI: autistic symptomatology and overall cognitive and language abilities strongly correlate with finiteness in ALI whereas age does not, indicating a deviant development. Further evidence of deviance comes from the ALI group's use of progressive in habitual/generic contexts. All this suggests that our ALI group is both deviant and disrupted in its knowledge of tense marking. The children with ALI may show some properties, both deficits and competencies, of the OI stage, but they have patterns that go beyond the observed TD or SLI profiles.

On the other hand, there is somewhat slower development of finiteness in children with ALN than their chronological age warrants, but it is still comparable to their TD controls. Furthermore, their knowledge shows evidence of a maturational language learning mechanism, not influenced by autistic symptomatology. However, information structure in ALN shows some deficits, similar to very young TD children and the impaired ALI group. This is striking because information structure deficits should be expected to apply to ASD in general, given the nature of ASD (especially difficulties with pragmatics). Thus children with ALN have pragmatic (in particular information structure, which depends on discourse) difficulties, but not grammatical difficulties, in contrast to children with ALI, who have disrupted tense-marking capacities (in addition to the difficulties with information structure).

It is possible that in all children, the same genetically coded language learning mechanism, called "Universal Grammar" by linguists, is present, and that gives us the ASD and SLI performance consistent with the OI stage of TD children. The genetic deficits of neurodevelopmental disorders then work to limit different aspects of language acquisition, whether grammar or information structure, differently depending on the disorder and its severity.

Following an original suggestion by Wexler (1996), finiteness has already been used as a biomarker to guide studies of genetics of language: in twin behavioral studies (in TD children, Ganger, 1998, and Ganger et al., 1998; and in children with SLI, Bishop et al., 2005), and in genetic linkage studies of families with SLI (Falcaro et al., 2008; Rice et al., 2009b). In comprehensive reviews, Rice (2012, 2013) integrated the findings about trajectories of language development in SLI with their possible genetic bases.

In autism, finiteness has not yet been used as a biomarker. It is possible that deficits in finiteness can not only assist in distinguishing children with morphosyntactic language impairments or delays within autism subgroups, but also guide genetic studies of language. For example, some genes that are regulated by FOXP2, a transcription factor involved in a familial speech-language disorder, have been implicated in language deficits (Graham and Fisher, 2015). One such gene is CNTNAP2, which is associated with a non-word repetition deficit in SLI (Vernes et al., 2008), with delay in producing a first word in males with autism (Alarcón et al., 2002, 2008), and with level of language-related behavior at age 2 in children from an unselected sample of the general population (Whitehouse et al., 2011). Thus, it is possible to suggest that the same genetics may underlie different aspects of language development. It will be interesting to see how and whether knowledge of finiteness in ASD associates with genetic variants.

Future studies may address other specific markers associated with Tense, and they should also address other aspects of language that are argued to be deficient in children due to the same computational mechanism that limits finiteness. In particular, the Unique Checking Constraint theory of the OI stage predicts that in some (but not all) languages, clitic pronouns should be omitted (Wexler, 1998, 2004b, 2014, among others). The theory predicts that TD children (and thus children with SLI) will not omit object clitics in Greek, and that was confirmed for TD children by Tsakali and Wexler (2003) and for children with SLI by Manika et al. (2011). One such study has already been done in Greek for 6-year-old children with high-functioning autism (Terzi et al., 2016), who show lower clitic production than age and receptive vocabulary matched TD controls, which indicates deviance. In this way, the studies in the field of grammar in autism will advance to the level of the study of the theory of developmental mechanisms, rather than individual constructions, paralleling advances in the study of typical development.

# AUTHOR CONTRIBUTIONS

KW and AP conceived the study. AP and NM contributed substantially to data collection. NM contributed substantially to transcribing, analyzing and interpreting the data as well as writing the initial draft of the manuscript, with all authors doing some of the writing of some sections. All authors contributed substantially to editing the final versions of the manuscript. All authors have agreed to be accountable for the content of the manuscript.

# FUNDING

NM was supported in part by the Singleton Fellows Graduate Fellowship from the Department of Brain and Cognitive Sciences at the Massachusetts Institute of Technology (MIT), and the Simons Postdoctoral Fellowship from the Simons Initiative on Autism and the Brain at MIT. AP was supported by the Brain Development and Disorders Project (BDDP) Postdoctoral Award from MIT. This research was also supported by the Anne and Paul Marcus Family Foundation, and the Brain Infrastructure Grant Program to the Department of Brain and Cognitive Sciences, MIT, from the Simons Initiative on Autism.

# ACKNOWLEDGMENTS

We would like to thank especially Margaret Echelbarger, who contributed significantly to data collection, transcription, and initial analyses presented at conferences, as well as provided significant comments on the present version; Lee Mavros-Rushton, coordinator of BDDP, who assisted with recruitment and in our testing sessions both with participants' behavioral issues and transportation. We are grateful to all of the participants and their families for taking part, and the Autism Resource Center of Central Massachusetts and Boston Children's Hospital for help with recruiting. We also thank all of the students in the Wexler Lab for help with collecting data; and audiences at the 29th Annual Symposium on Research in Child Language Disorders, Madison, WI, in June 2008; the 33rd Boston University Conference on Language Development, Boston, MA in November 2008; Autism Consortium Symposium, Boston, MA, in November 2008; COST Action A33, Let the Children Speak: Final Conference, London, UK, in January 2010; Autism Consortium Symposium, Boston, MA, in October 2010, where some of the findings were presented.




**Conflict of Interest Statement:** KW is a co-author on the Test for Early Grammatical Impairment (Rice and Wexler, 2001) which was used for testing the participants in the present study. Other than that, the authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Modyanova, Perovic and Wexler. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# ASD Is Not DLI: Individuals With Autism and Individuals With Syntactic DLI Show Similar Performance Level in Syntactic Tasks, but Different Error Patterns

### Nufar Sukenik and Naama Friedmann\*

Language and Brain Lab, School of Education, Tel Aviv University, Tel Aviv, Israel

### Edited by:

Anna Gavarró, Universitat Autònoma de Barcelona, Spain

### Reviewed by:

Nadezhda (Nadya) Modyanova, Montana State University, United States Stavroula Stavrakaki, Aristotle University of Thessaloniki, Greece

\*Correspondence:

Naama Friedmann naamafr@post.tau.ac.il

### Specialty section:

This article was submitted to Language Sciences, a section of the journal Frontiers in Psychology

Received: 05 October 2017 Accepted: 19 February 2018 Published: 04 April 2018

### Citation:

Sukenik N and Friedmann N (2018) ASD Is Not DLI: Individuals With Autism and Individuals With Syntactic DLI Show Similar Performance Level in Syntactic Tasks, but Different Error Patterns. Front. Psychol. 9:279. doi: 10.3389/fpsyg.2018.00279

Do individuals with autism have a developmental syntactic impairment, DLI (formerly known as SLI)? In this study we directly compared the performance of 18 individuals with Autism Spectrum Disorder (ASD) aged 9;0–18;0 years with that of 93 individuals with Syntactic-Developmental Language Impairment (SyDLI) aged 8;8–14;6 (and with 166 typically-developing children aged 5;2–18;1). We tested them using three syntactic tests assessing the comprehension and production of syntactic structures that are known to be sensitive to syntactic impairment: elicitation of subject and object relative clauses, reading and paraphrasing of object relatives, and repetition of complex syntactic structures including Wh questions, relative clauses, topicalized sentences, sentences with verb movement, sentences with A-movement, and embedded sentences. The results were consistent across the three tasks: the overall rate of correct performance on the syntactic tasks is similar for the children with ASD and those with SyDLI. However, once we look closer, they are very different. The types of errors of the ASD group differ from those of the SyDLI group: the children with ASD provide various types of pragmatically infelicitous responses that are not evinced in the SyDLI or in the age equivalent typically-developing groups. The two groups (ASD and SyDLI) also differ in the pattern of performance: the children with SyDLI show a syntactically-principled pattern of impairment, with selective difficulty in specific sentence types (such as sentences derived by movement of the object across the subject), and normal performance on other structures (such as simple sentences). In contrast, the ASD participants showed generalized low performance on the various sentence structures. Syntactic performance was far from consistent within the ASD group. Whereas all ASD participants had errors that can originate in pragmatic/discourse difficulties, seven of them had completely normal syntax in the structures we tested, and were able to produce, understand, and repeat relative clauses, Wh questions, and topicalized sentences. Only one ASD participant showed a syntactically-principled deficit similar to that of individuals with SyDLI. We conclude that not all individuals with ASD have syntactic difficulties, and that even when they fail in a syntactic task, this does not necessarily originate in a syntactic impairment. This shows that looking only at the total score in a syntactic test may be insufficient, and a fuller picture emerges once the performance on different structures and the types of erroneous responses are analyzed.

Keywords: ASD, SLI, syntax, relative clauses, syntactic impairment

# INTRODUCTION

Autism Spectrum Disorder (ASD) is characterized by a triad of impairments that affect communication, social interaction, and behavioral repertoire (Pickles et al., 2009; Loucas et al., 2013). One of the most debated issues in the research on language abilities in children with ASD is whether their communication difficulties are a result of language impairment, with characteristics similar to those found in non-ASD children with Developmental Language Impairment (DLI, previously known as "SLI"<sup>1</sup>). This is the question we examine in this study, by directly comparing the performance of individuals with ASD to that of individuals with DLI. We focus on a specific type of DLI that selectively affects syntax, Syntactic DLI (SyDLI for short), and compare the performance of the two groups in syntactic tests of comprehension, production, and repetition of complex syntactic structures.

The differences and possible overlap between DLI and ASD captured the interest of many researchers over the last decade (see Bishop, 2003; Botting and Conti-Ramsden, 2003; Tager-Flusberg, 2006; Eigsti and Bennetto, 2009; Tomblin, 2011; Terzi et al., 2014; Tuller et al., 2017, for extensive reviews). Studies of this question tested various language domains and reached different conclusions. Several studies examined the similarity between ASD and DLI in the lexical domain. McGregor et al. (2012) and Demouy et al. (2011) both found that the ASD groups they tested showed similar lexical performance to that of the DLI group, but the error patterns differed between the groups. Both studies reported that the ASD group committed more pragmatic errors than the DLI group (McGregor et al. also reported the same for the ASD group in comparison to the age-matched TD group). Different results were found in other studies, which reported that individuals with ASD performed more poorly than those with DLI on input lexical tasks (Loucas et al., 2013), and in linguistic concept tests (Manolitsi and Botting, 2011). Yet others found that individuals with ASD were better than those with DLI in tasks such as word association and word structure (Lloyd et al., 2006).

Other studies examined phonological abilities in ASD, in comparison to DLI, mostly through the repetition of nonwords (Whitehouse et al., 2008; Demouy et al., 2011; Riches et al., 2011; Williams et al., 2013). In this domain, too, the results are mixed. Demouy et al. (2011), Whitehouse et al. (2008), and Loucas et al. (2010) reported that children with ASD who had language impairment (ASD+LI; LI defined as scores below the norms on standardized language tasks) showed impaired performance in nonword repetition that was similar to that of the DLI group and lower than that of the TD group. In contrast, Durrleman and Delage (2016) and Riches et al. (2011) reported that nonword repetition in ASD participants who had LI was better than that of the DLI participants.

This distinction, which was made in various studies, between ASD with language impairment and ASD with normal language (e.g., Kjelgaard and Tager-Flusberg, 2001; Whitehouse et al., 2008; McGregor et al., 2012; Gavarró and Heshmati, 2014; Modyanova et al., 2017; Tuller et al., 2017) reflects an important insight with respect to language in ASD. This is also a conclusion of various studies comparing ASD to DLI: some ASD participants show language impairment, whereas others have language performance similar to TD (see e.g., Kjelgaard and Tager-Flusberg, 2001). Kjelgaard and Tager-Flusberg (2001) also report another subgroup of ASD participants, who show language difficulties across all language structures and domains they tested. Such studies indicate that it might be impossible to make a general claim regarding language in ASD, given the considerable heterogeneity within this group (Lombardo et al., 2015; see also Brock et al., 2017).

Thus, the heterogeneity of impairment within the ASD group may be one source of the differences between the results of different studies that examined whether the language difficulty in ASD resembles the language impairment in DLI. Another source is the heterogeneity within the DLI group. Several studies of DLI showed that DLI has many faces, and that various language domains can be selectively affected, giving rise to various types of DLI, which selectively affect syntax, lexicon, phonology, or pragmatics (Korkman and Hakkinen-Rihu, 1994; Conti-Ramsden et al., 1997, 2001, 2006; Conti-Ramsden and Botting, 1999; van Daal et al., 2004; Bishop, 2006; Friedmann and Novogrodsky, 2008, 2011).

One such type of DLI, which has been studied extensively, is Syntactic DLI (or SyDLI), in which the syntactic abilities are specifically affected. The main syntactic constructs that have been identified as impaired in SyDLI are structures that involve syntactic movement (Adams, 1990; van der Lely and Harris, 1990; Håkansson and Hansson, 2000; Schuele and Tolbert, 2001; Stavrakaki, 2001; Friedmann and Novogrodsky, 2004, 2007, 2011; Hamann, 2005; Novogrodsky and Friedmann, 2006; Jakubowicz and Gutierrez, 2007; Levy and Friedmann, 2009; Jakubowicz, 2011; Friedmann et al., 2015; Hamann and Tuller, 2015); pronominal object clitics (Jakubowicz et al., 1998; Hamann et al., 2003; Paradis et al., 2003; Hamann, 2004; Parisse and Maillart, 2004; Jakubowicz and Tuller, 2008; Stavrakaki et al., 2011; Tuller et al., 2011); and verb inflections (Wexler, 2011; Rothweiler et al., 2012; Leonard, 2017).

<sup>1</sup>Notes on the choice of the term SyDLI:

**<sup>(1)</sup> Why we prefer DLI over SLI:** We choose to use the term DLI, developmental language impairment, instead of SLI, specific language impairment, following the need to change expressed in Bishop et al. (2017) and Norbury and Sonuga-Barke (2017), for two reasons. First, the word "specific", which contributes the S to SLI, was meant to exclude individuals with language deficits who also have other types of impairment, such as ASD. Thus, by definition, children with ASD may never have SLI. The other reason is that the term DLI adds the important information that the language deficit is developmental.

**<sup>(2)</sup> Why we prefer SyDLI over DLD:** DLI has subtypes, referring to the language domain that is impaired. There are syntactic, phonological, lexical, and possibly also semantic and pragmatic types of DLI. We believe that this heterogeneity should be reflected in the nomenclature of the DLI group tested. In the current study we focus on syntactic impairment, which we term below "SyDLI", syntactic DLI. We preferred not to use the term developmental syntactic disorder, DSD, which would be too easily confused with ASD; we preferred SyDLI over SyDLD because SyDLI can be pronounced as a single word, unlike SyDLD (see also Bishop et al., 2017, who suggested the term DLD, for arguments for and against preferring disorder over impairment).

These syntactic domains that have been identified as clinical markers for syntactic impairment in DLI are the best targets for examining whether ASD resembles DLI. Indeed, these domains have been tested in ASD, again, with mixed results. Terzi et al. (2014) tested structures that are considered clinical markers for DLI in Greek (passive sentences, pronouns, and pronominal clitics) in Greek-speaking children with ASD aged 5–8 years. They found that the children with ASD performed similarly to TD children in passive sentences and pronouns, but more poorly than the TD children in the comprehension of pronominal clitics. Whitehouse et al. (2008) compared the performance of an English-speaking ASD+LI group to a DLI group in the TROG (Test for Reception of Grammar, Bishop, 1989) sentence comprehension task and in a sentence repetition task. They reported that the ASD+LI participants performed similarly to the DLI participants on the sentence comprehension task but better than the DLI participants on sentence repetition. Manolitsi and Botting (2011), who also compared comprehension and production in ASD and DLI, reached a different conclusion: the children with ASD they tested performed more poorly than the children with DLI on receptive language tasks and similarly to them in sentence production tasks. Roberts et al. (2004) tested the performance of English-speaking children with ASD on 3rd person and past tense morphology. They report that a subgroup of the ASD group who had a language impairment showed high rates of omissions of tense marking, like English-speaking children with DLI.

Roberts et al. also noticed an important difference between the populations with respect to the types of incorrect responses they produced. The children with ASD made errors that the DLI did not make, such as echolalic responses and perseverations, as well as semantically inappropriate, off-topic responses (see also Modyanova et al., 2017, for error types that they termed "unscorable", which the ASD+LI group makes but the DLI and the TD groups do not). The same point regarding different error types was also made by Demouy et al. (2011), who assessed sentence comprehension and production in French-speaking ASD and DLI participants. They found that the ASD group performed similarly to the DLI group in comprehension, and that both groups showed impaired performance in sentence production. However, the groups crucially differed with respect to the errors they made: the children with ASD produced significantly more pragmatic errors than the DLI participants. Such pragmatic errors are responses that are inappropriate, unrelated to the stimuli, reflecting misunderstanding of the situation in the stimuli or failing to understand the intention of the experimenter and the purpose of the conversation.

Beyond the important finding that children with ASD make errors that other language impaired children do not, researchers noticed that the pattern of performance across different sentence types also differs. Gavarró and Heshmati (2014) tested the comprehension of passive sentences in Persian-speaking children with ASD. They found that a subgroup of the ASD (classified as low-functioning ASD) performed poorly on these structures. An important observation these researchers made was that the children with ASD who made errors in this task actually performed poorly on all sentence structures, including active sentences, unlike children with DLI, who are selectively impaired in passive sentences, but not active sentences (e.g., van der Lely, 1996, for English passives in DLI). Durrleman et al. (2017) also tested various types of passives vs. active sentences in ASD and also found that children with ASD performed poorly on passive sentences, but many of them also performed poorly on active sentences. The results of both studies suggest that the underlying deficit that gave rise to the difficulties of the children with ASD in this task may have been different in nature from that of children with DLI.

Durrleman et al. (2016) made a similar observation regarding the across-the-board pattern of deficit in ASD, this time in structures derived by Wh movement. They tested children with ASD aged 6–16 years on the comprehension of Wh questions and relative clauses of various levels of syntactic complexity. They found that the ASD group performed more poorly than a group of younger TD children. Importantly, the children with ASD showed difficulty across the board, including in simple sentences, and not only in the sentences with syntactic movement with configurational intervention (in which the full NP object moves across a full NP subject). This, again, indicates that their deficit is different in nature from that of SyDLI children, who typically show differential performance in syntactically simple and in complex sentences, and who show clear effects of configurational intervention (Friedmann et al., 2015).

These syntactic studies thus showed that when ASD is tested with syntactic structures that are clinical markers for syntactic DLI, some children with ASD show impaired performance. However, not all children with ASD show syntactic impairments, and when they do, they sometimes show different error types and different patterns of performance. The next important step forward in our understanding of the relation between ASD and DLI comes from recent studies that directly compared the performance of these two populations in the syntactic domain, used specific syntactic structures that may yield differential performance in the two groups, and looked at error types in the two groups.

Durrleman and Delage (2016) tested the production of pronominal clitics in a group of French-speaking children with ASD and a group of children with DLI. They compared 3rd person accusative clitics, known as a clinical marker for DLI in French (Parisse and Maillart, 2004; Jakubowicz and Tuller, 2008; Tuller et al., 2011), and first person accusative clitics. They found that the ASD and DLI groups performed similarly on third person accusative clitics and in sentence completion testing verbal inflection, prepositions, and passive. The groups differed on first person clitics, which were impaired in ASD (even for children with ASD whose grammar was normal), but mastered by all children with DLI. These results indicate the different sources of impairments in ASD and DLI: third person clitics may require specific syntactic abilities, whereas the use of first person clitics involves pragmatic abilities. Importantly, Prevost et al. (unpublished manuscript) found that once the pragmatic demands on the use of first person clitics are relieved through explicit instructions, children with ASD can actually produce first person clitics similarly to TD children.

Tuller et al. (2017) tested French-speaking children with ASD and children with DLI on sentence-picture matching, sentence repetition, and sentence elicitation tasks, and also analyzed samples of their spontaneous speech. They found that the two groups had similar morphosyntactic performance. The subgroup of ASD who had LI showed impaired performance in three domains that are impaired also in DLI: pronominal clitics, reduced use of embedded sentences, and a large rate of erroneous complex sentences.

Finally, in a recent study, Creemers and Schaeffer (2016) provided another clear and elegant demonstration of the differences between these groups. They compared Dutch-speaking ASD and DLI participants using a lexical-syntactic task of mass-count distinction, and a pragmatic task that tested the use of definite markers. The ASD participants outperformed the DLI participants on the grammatical mass-count task, in which they performed at the TD level, but performed below the DLI participants when they had to provide a definite determiner, a task that requires pragmatic abilities (Armon-Lotem and Avram, 2005; Balaban et al., 2016; Schaeffer, 2016).

Studies comparing individuals with ASD to individuals with DLI thus focus on syntactic structures that are known to be sensitive markers for syntactic DLI in the relevant languages. In Hebrew, the structures that are most indicative of a syntactic impairment for school-aged children and adults are structures derived by a syntactic movement called "Wh movement" (because this is the type of movement that derives Wh questions), such as relative clauses, topicalized structures, and Wh questions (Friedmann and Novogrodsky, 2004, 2007, 2011; Novogrodsky and Friedmann, 2006; Levy and Friedmann, 2009; Friedmann et al., 2015). These structures with Wh movement are demonstrated in examples 1–3. Relative clauses (example 1), topicalized sentences (2), and Wh questions (3) are all derived by the same type of syntactic movement: movement of a noun phrase to the beginning of the sentence (to spec-CP, in syntactic terms). This movement is schematized by an arrow in examples 1–4, and the position from which the noun phrase has moved (sometimes referred to as "the gap" or the "trace of movement") is marked by an underline.

Examples for structures derived by Wh movement in Hebrew:

(1) a. zo ha-yalda she-ha-safta mecayeret \_\_
This the-girl that-the-grandmother draws
This is the girl that the grandmother is drawing \_\_

b. ha-yalda she-ha-safta mecayeret \_\_ xiyxa
The-girl that-the-grandmother draws smiled
The girl that the grandmother is drawing \_\_ smiled


Relative clauses can be created by movement of the subject NP (as in Example 4) or of the object NP (example 1). Whereas both types of relative clause are derived by Wh-movement, object relatives have been shown to be more impaired than subject relatives in Hebrew SyDLI (Friedmann and Novogrodsky, 2004, 2007, 2011; Novogrodsky and Friedmann, 2006; Levy and Friedmann, 2009; Friedmann et al., 2015). This difference has been ascribed to the different properties of the movement in the two structures: whereas in subject relatives the movement does not change the canonical order of the agent and the theme, in object relatives the object noun phrase moves across the subject noun phrase, and this movement is problematic in SyDLI (Friedmann et al., 2009, 2015; Hamann and Tuller, 2015)<sup>2</sup>.

These structures in which the (full noun phrase) object undergoes Wh movement across another full noun phrase are acquired around age 6 in Hebrew-speaking TD children (Varlokosta and Armon-Lotem, 1998; Günzberg-Kerbel et al., 2008; Friedmann et al., 2009). The sentences with object relatives in 1a and 1b differ with respect to the position of the relative clause within the sentence. In 1a the relative clause is at the end of the sentence, and it is therefore called a "final branching" or "right branching" relative clause; this type of object relative is acquired in Hebrew around age 6. Sentence 1b includes a relative clause in the middle of the sentence (a "center-embedding" relative clause), between the subject and the main verb. This kind of relative clause is acquired in Hebrew around 4th grade, age 9–10.

In the current study, we assessed the comprehension and production of these structures in ASD and SyDLI to study in detail the similarities and differences between the two groups. We analyzed the individual performance of each participant in order to examine the degree of heterogeneity within the group. To examine our research question, whether the language difficulty in ASD can be characterized as SyDLI, we looked at the patterns of performance of each participant across different sentence structures, to see whether the ASD participants show a differential pattern that resembles that of SyDLI, and analyzed response types in detail, to examine the differences in error types between the two groups. If the language difficulty in ASD is similar to SyDLI, we would expect to see similarity in the patterns of performance: the ASD participants should show the same distinctions between impaired and intact structures as the SyDLI participants. We would also expect the ASD participants to make the same types of errors as children with SyDLI.

<sup>2</sup>A different type of movement of the phrase, termed A-movement, involves the movement of the object to subject position. This more local movement is tested in Experiment 3 and discussed in section Experiment 3: Sentence Repetition Task.

# METHODS

# Participants

# ASD Participants

Participants in the ASD group were 18 Hebrew-speaking children and adolescents with autism (16 boys) aged 9;0–18;0 years (M = 13;4, SD = 3;1; nine of the participants were in 9th−11th grade, and nine were in 3rd−5th grade); all were taking part in a larger study of language skills in ASD at Tel Aviv University. Hebrew was the native language for all participants. All the participants were diagnosed with Autism Spectrum Disorder by a child psychiatrist prior to the study according to the DSM-IV criteria (American Psychiatric Association, 2000), and were recommended for an ASD-specific class<sup>3</sup>. Seventeen of the participants with autism were enrolled in autism-specific classes, and the remaining child received 1:1 support in a mainstream class. Thirteen were diagnosed as "High functioning", indicating that standard psychological assessment found their IQ to be normal. The other 5 were diagnosed as "PDD-NOS" (pervasive developmental disorder not otherwise specified), which bears no information as to their IQ. No participant was diagnosed as having "Low functioning" autism, defined as an IQ score < 75. Appendix A in Supplementary Material includes information on each of the participants: age, gender, and performance in lexical tasks (picture naming, word-picture matching), and in nonword reading. It also includes scores in a nonverbal task of picture association testing conceptual relations and world knowledge, which can be used as a proxy for nonverbal IQ.

# Syntactic DLI Participants

The participants in the syntactic DLI (SyDLI) groups in this study all participated in previous studies of syntactic DLI in our lab. Each of them was extensively tested for syntax, lexical retrieval, and phonological abilities, and each of them was found to be syntactically impaired. We took their raw data and re-analyzed the measures we selected for this study that would allow us to compare them to the ASD participants.

The SyDLI group to which we compared the performance of the ASD participants in the first task, the elicitation of subject and object relative clauses with pictures, was taken from the children tested by Novogrodsky and Friedmann (2006). It included 16

The children with SyDLI to whom we compared the performance of the ASD participants in the second task, the reading and paraphrasing task, were the DLI participants reported in Friedmann and Novogrodsky (2007). These were 15 Hebrew-speaking children with SyDLI (11 boys), aged 9;3–14;5 (mean age 11;6 years, in 4th−8th grade).

The SyDLI comparison group in the third task, the sentence repetition task, were 62 children with SyDLI aged 8;8–9;5 years (mean = 8;4, SD = 3.4) from Fattal et al. (2013).

All participants in these SyDLI groups met the exclusionary criteria for DLI (formerly referred to as "SLI") as formulated by Leonard (2014): They had no hearing impairment and no recent episodes of Otitis Media, no abnormalities of oral structure or problems in oral function; they showed no evidence of obvious neurological impairment; they had no symptoms of impaired reciprocal social interaction or restriction of activities that are typical of ASD. All of the DLI participants had normal IQ and were attending regular classes in regular schools.

# Typically-Developing Control Participants

The typically-developing (TD) control group for Experiment 1 included 15 TD children aged 9–10 years (mean = 9;2, SD = 2;2). These children were age-matched to the youngest participants in the ASD group. All were studying in 4th grade in regular classes.

The TD control group for Experiment 2 included 61 children aged 9;0–18;1 (mean age = 10;5, SD = 2.5).

The control group for Experiment 3 were 90 TD children aged 5;2–18 years (M = 8;9, SD = 4.02). This group comprised 40 younger TD children aged 5;2–6;9 years (M = 5;8, SD = 0.3) from Fattal et al. (2011) and Friedmann et al. (2010). These children were on average 3 years younger than the youngest children in the ASD group, and at a chronological age at which Hebrew-speaking children have already (just) acquired relative clauses and Wh questions according to previous research (Friedmann and Novogrodsky, 2004; Friedmann et al., 2009, 2015; Fattal et al., 2011; Szterman and Friedmann, 2015). The older group included 50 TD children aged 9–18 years (M = 11;6, SD = 2;6). We compared the younger and older participants' performance on each of the 5 sentence types, and none of these sentence types showed a difference between the groups (all p's > 0.32). The analysis of the total correct performance in the two groups also yielded no significant difference [t(88) = 0.55, p = 0.30]. Therefore, we lumped the results of all the 90 TD children together and treated them as one control group.
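The group comparison described above can be sketched in code. The reported degrees of freedom (88 = 40 + 50 − 2) are consistent with a pooled-variance two-sample t-test, so the sketch below assumes that test; the function name `pooled_t` and the scores are hypothetical and are not the study's data.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(sample_a, sample_b):
    """Student's two-sample t statistic with pooled variance; returns (t, df)."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    # Weighted average of the two sample variances (assumes equal population variances).
    pooled_var = ((n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(sample_a) - mean(sample_b)) / standard_error, df


# Hypothetical total-correct scores, invented for illustration only.
younger = [66, 68, 67, 69, 65] * 8   # 40 younger TD children
older = [67, 68, 66, 69, 68] * 10    # 50 older TD children

t_value, df = pooled_t(younger, older)
print(f"t({df}) = {t_value:.2f}")
```

Converting the t statistic to a p-value requires the t distribution, which is left to a statistics package of the reader's choice.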

All the children in these three TD control groups had no reports of hearing loss, neurological development difficulties or socio-emotional problems. They were studying in public schools serving a middle class population, similarly to the participants with ASD and SyDLI.

The selection of the two comparison groups for the ASD group (the DLI and the TD groups) was done on the basis of the following rationale. We tested syntactic structures that are all already mastered by Hebrew-speaking children at age 9 (fourth grade): the structures tested in Experiments 1 and 3 are acquired by age 6, and the structure tested in Experiment 2 is acquired in 4th grade, around age 9. TD children have high scores in the 3 tasks we used in the current study by age 9, and then performance reaches a plateau, so there is no change in scores after this age. We, therefore, selected ASD participants only from age 9 years and up, and compared their performance to children at ages at which these structures are supposed to be already mastered. This age-matching can be taken as "Wh-movement-age-matching": TD individuals aged 9 and above perform similarly in the tasks we used, so they may all be considered as being of the same Wh movement-mastery age<sup>4</sup>.

<sup>3</sup> In Israel, placement in ASD-specific classes is highly regulated, requiring a diagnosis from a child psychiatrist or clinical psychologist, as well as the agreement of a seven-member multi-disciplinary panel including a pediatrician, an educational psychologist, a social worker who specializes in children with special needs, the chief inspector of special education in the ministry of education, an inspector of regular education, and a representative of the municipal education committee.

As we will show below, the results of our study undermine the validity of matching by language test score. One could say that the ASD participants were matched to the SyDLI participants by syntactic test scores, because, as we report below, their total scores in the three tests did not differ from those of the SyDLI group. However, we found critical differences between the groups with respect to the types of errors they made and the patterns of impairment and sparing of the various syntactic structures, indicating that a similar test score does not indicate that the abilities are similar.

Additionally, the matching by some measure that is not age (e.g., IQ, vocabulary, lexical retrieval) which was applied in some earlier ASD studies was probably based on the assumption that this measure correlates with the ability tested. We found no correlations in the current study between any of the measures presented in Appendix A (Supplementary Material)—nonverbal conceptual ability, lexical retrieval/vocabulary, or reading decoding ability—and the ASD participants' performance in any of the syntactic tasks (tested with Pearson correlation and Bonferroni correction). So matching to control group by these measures is not warranted.
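The correlation check described above can be illustrated as follows. This is a minimal sketch, assuming the Bonferroni correction was applied by dividing the significance level by the number of correlations tested; the function names and the example p-values are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length samples."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def bonferroni_significant(p_values, alpha=0.05):
    """Flag each test as significant only if p < alpha / (number of tests)."""
    threshold = alpha / len(p_values)
    return [p < threshold for p in p_values]


# Hypothetical p-values for three background-measure correlations.
print(bonferroni_significant([0.01, 0.04, 0.20]))
```

With three tests the corrected threshold is 0.05 / 3, so a p-value of 0.04 that would pass an uncorrected test no longer counts as significant.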

# Tasks

## Experiment 1: Production of Subject and Object Relative Clauses

We tested the participants' production of subject and object relative clauses using a sentence elicitation task with pictures (BAFLA ZIBUV test, Friedmann, 1998). The participant was shown a page with two pictures. Each of the two pictures on the page included the same two figures. In the top picture, one figure was performing an action on the second figure and in the bottom picture the roles were reversed. The experimenter described the two pictures using simple sentences and then asked the participant about one of the figures in each picture. The participants saw 10 picture pairs, and were asked one question about one figure in each picture (see **Figure 1** and Example 5). One question was targeted at producing a subject relative and one at an object relative, with a total of 10 target subject relatives and 10 target object relatives. The order of the subject and object relative target sentences was randomized across the picture pairs.

(5) The experimenter presented the pictures in **Figure 1** and said: "There are two boys in these pictures. In one picture the boy is drying the hippo, and in one picture the hippo is drying the boy. Which boy is this? (pointing to the boy in the top picture) . . . and which boy is this? (pointing to the bottom one, after the participant provided an answer to the first question). Start your answer with "This is . . . "."

a. **Target subject relative**: describing the boy in the top picture in **Figure 1**.

ze ha-yeled she-menagev et ha-hipo
This is the boy that is drying the hippo.

b. **Target object relative**: describing the boy in the bottom picture in **Figure 1**.

ze ha-yeled she-ha-hipo menagev
This is the boy that the hippo is drying.

Before the beginning of the task, an example question was shown to each participant to make sure the participant understood the task and the requirement for starting with "This is . . . " was introduced. This practice item was not included in the data analysis. If it seemed that the participant did not fully understand the task, the experimenter demonstrated the requested response to the practice item and asked the participant to do as she did (for details about this task see Friedmann and Szterman, 2006; Novogrodsky and Friedmann, 2006).

**Figure 1.** Picture pair used in the relative clause elicitation task (Experiment 1).

<sup>4</sup> In a way, the participants were also IQ-matched, because all the ASD participants had normal IQ—for 13 of the participants, the ASD diagnosis included normal IQ, and for the other 5, for whom we had no report of pre-tested IQ, we had the performance of the picture association task, which was 92–100.

In the analysis of this test's results, we counted the number of correctly produced target object relatives and subject relatives in comparison to the SyDLI and the TD children. Error analysis was done based on the error analysis described in Friedmann and Novogrodsky (2007) and new error categories were added according to new error types that appeared in the current study (mainly in the ASD group).

## Experiment 2: Reading and Paraphrasing of Object Relatives With Heterophonic Homographs

This task tested the participants' ability to understand and paraphrase written relative clauses and their ability to correctly read a heterophonic homograph whose correct reading aloud critically hinges upon the correct parsing of the grammatical structure of the sentence (BAFLA ZIKRIA, Friedmann and Gvion, 2003; Friedmann and Novogrodsky, 2007).

The task included ten verb-noun heterophonic homographs, each of which appeared in two sentences—once in a sentence with a relative clause, and once in a similar, length-matched, simple sentence. The homograph was the main verb in all the sentences. The relative clauses were center-embedded object relatives in which the heterophonic homograph appeared right after the trace (which is the original position of the moved object, right after the embedded verb, marked by an underline in Example 6).

Example (6) is an object relative clause with the homograph "gazar", the verb cut. This homograph can be read in other sentence contexts as the noun "gezer", carrot. Example (7) is a simple sentence with the same homograph.


The boy from fourth grade was cutting sports magazines.

The sentences were split into two blocks of 10 sentences, each block containing 5 object relatives and 5 simple sentences in random order. Each block was administered in a separate session. Each homograph appeared only once in each block; in one block it appeared in a relative clause and in the other block—in a simple sentence.
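The block structure described above can be sketched as a small counterbalancing routine. This is an illustration under assumptions, not the procedure used to build the original test: the function name `build_blocks`, the condition labels, and the placeholder homograph names are all hypothetical.

```python
import random


def build_blocks(homographs, seed=None):
    """Split homographs into two blocks so that each homograph appears once per
    block, in a relative clause in one block and a simple sentence in the other."""
    rng = random.Random(seed)
    items = list(homographs)
    rng.shuffle(items)
    half = len(items) // 2
    relative_first, simple_first = items[:half], items[half:]

    block1 = [(h, "object_relative") for h in relative_first]
    block1 += [(h, "simple") for h in simple_first]
    block2 = [(h, "simple") for h in relative_first]
    block2 += [(h, "object_relative") for h in simple_first]

    # Random presentation order within each block.
    rng.shuffle(block1)
    rng.shuffle(block2)
    return block1, block2


homographs = [f"homograph_{i}" for i in range(1, 11)]  # placeholder labels
block1, block2 = build_blocks(homographs, seed=0)
print(block1)
print(block2)
```

With 10 homographs, each block ends up with 5 object relatives and 5 simple sentences, and no homograph repeats within a block.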

The participants were asked to read the sentence out loud and then explain it in their own words. If it seemed that the participant was unable to explain the sentence, s/he was asked a leading question (e.g., "Who cut?"). For details on this task, see Novogrodsky and Friedmann (2006) and Szterman and Friedmann (2014).

In the analysis of this test's results, we scored correct responses separately for reading the homograph and for paraphrasing the sentence, and compared relative clauses and simple control sentences. Three of the ASD participants had difficulties reading the homographs because of their inaccurate reading of the rest of the sentence (they also demonstrated considerable difficulties in tests of reading at the single-word level) and expressed considerable frustration with the need to read in this task, so the sentence was read to them by the experimenter, and they were only asked to explain the sentence. These participants' results are reported only for the paraphrasing part.

We coded a paraphrase as correct if the participant identified correctly the agent and theme of the two verbs in the sentence.
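The coding rule above can be expressed as a simple check. The sketch below assumes a hypothetical representation in which each verb is mapped to an (agent, theme) pair; the role assignments are illustrated with the sentence in Example 1b ("The girl that the grandmother is drawing smiled"), and `None` marks a verb with no theme.

```python
def paraphrase_correct(target_roles, response_roles):
    """A paraphrase counts as correct only if, for every verb in the target,
    the response assigns the same agent and theme."""
    return all(response_roles.get(verb) == roles
               for verb, roles in target_roles.items())


# Target thematic roles for Example 1b: verb -> (agent, theme).
target = {"draw": ("grandmother", "girl"), "smile": ("girl", None)}

# A correct paraphrase, and one with reversed roles in the relative clause.
correct = {"draw": ("grandmother", "girl"), "smile": ("girl", None)}
reversed_roles = {"draw": ("girl", "grandmother"), "smile": ("girl", None)}

print(paraphrase_correct(target, correct))
print(paraphrase_correct(target, reversed_roles))
```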

## Experiment 3: Sentence Repetition Task

The sentence repetition task is a way of testing the participant's ability to process grammatically complex structures, in a simple task, which allows the comparison between sentences of various structures using the same task (Friedmann and Lavi, 2006; Szterman and Friedmann, 2015). In this sentence repetition task (PETEL, Friedmann, 2000), the experimenter said a sentence, and the participant was requested to count to three out loud and then repeat the sentence, as accurately as possible. The counting was included to prevent phonological memorization in the phonological loop (Baddeley, 1997; Friedmann and Grodzinsky, 1997; Szterman and Friedmann, 2015).

We used this sentence repetition task because previous studies indicated that it is a task that is very sensitive to syntactic impairment in SyDLI, agrammatic aphasia, and in children who are still in the process of acquiring syntax (Lust et al., 1996; Friedmann and Grodzinsky, 1997; Friedmann, 2001, 2007; Friedmann and Lavi, 2006; Fattal et al., 2011). The sentence repetition task uses the fact that sentence repetition cannot be a simple phonological reiteration of the input string, but it rather involves understanding the sentence and reproducing it. Consequently, syntactic impairment that affects the comprehension and production of a certain syntactic construct will result in impaired repetition of sentences with this construct.

The test assessed the participants' ability to repeat sentences derived by Wh movement: we tested the repetition of object relatives, topicalization structures, and Wh questions (object and subject questions, see Example 8). We compared these sentences to simple sentences. The simple sentences were included as minimal pairs with the sentences with Wh movement—they were identical to the Wh movement sentences in words and length and only differed from them in that they included no Wh movement. The rationale was that if the participant fails to repeat the sentences with syntactic movement but succeeds on the sentences without movement, this would point to a syntax-specific deficit in Wh movement.

The sentences derived by Wh movement were also compared to sentences with two additional types of syntactic movement. The first is movement of the object to subject position, which is a more local movement than Wh movement (termed "A-movement" or "Argument movement"). This short movement is often tested with passive structures. We tested it in a structure that is far more common in Hebrew: sentences in the order subject-verb in which the verb is an unaccusative verb (Example 10). (The argument of unaccusative verbs is base-generated in the object position, after the verb, so when it appears before the verb, in subject position, it appears there after moving from the original object position.) Such structures are already produced by children younger than 2 years old in Hebrew (Friedmann, 2007; Friedmann and Costa, 2011; Costa and Friedmann, 2012; Reznick and Friedmann, 2017), and are far more natural than passive sentences (e.g., in an analysis of spontaneous speech of 61 Hebrew-speaking children aged 1;6–6;1, which encompassed 27,696 utterances, only a single verbal passive was produced, and even this one was ungrammatical; Reznick and Friedmann, 2017. See also Berman, 1997; Jisa et al., 2002, for the scarcity of passives in Hebrew).

We also compared sentences with Wh movement to another type of movement, in which the verb moves to the second position in the sentence (in this movement, the verb moves to a position before the subject, to the C node, and therefore this movement is sometimes termed "V-to-C movement", see Example 9). Such movement is optional in Hebrew, so the same sentence can appear either with or without the movement of the verb (compared to the simple sentence in Example 12). Finally, we examined the repetition of sentences without any of these movement types but with a different kind of syntactic complexity: embedding, which we examined through sentences with sentential complements of verbs (Example 11).

The test included 70 sentences: 10 object relatives and 10 object topicalization sentences, 5 subject and 5 object Which questions, 10 sentences with verb movement to the second sentential position, 10 sentences with A-movement in which the subject appeared before the unaccusative verb, 10 sentences with embedded sentential complement, and 10 simple sentences without Wh movement.

All sentences contained four words (accusative markers, embedding markers, and prepositions were counted with the word to which they cliticize). All the sentences derived by Wh movement were semantically reversible. The simple sentences and the sentences with verb movement to second position included half transitive and half intransitive verbs, and the sentences with embedded clauses included an embedded intransitive verb. The test started with a practice sentence that the participant was requested to repeat, which was used to make sure the participant understood the task. This sentence was not part of our data analysis.

If the participant was unable to count to three and then repeat the sentence for five sequential sentences, s/he was asked to repeat the sentence immediately without counting. (Three ASD participants could not count before repeating). Sentence types included:

### **(8) Sentences with Wh movement**


We analyzed performance in the sentence repetition task by counting for each participant the number of correctly repeated sentences for each sentence type, and compared this to the SyDLI and the TD children. We classified the repetition errors into structural errors and lexical errors. Structural errors are errors that change the thematic grid or the syntactic structure of the sentence. Lexical errors are errors of omission or substitution of a word in the sentence without affecting the thematic roles or the syntactic structure of the sentence. The ASD group made unique errors that did not fit into these error types, and we, therefore, added error categories.
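The scoring step described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `percent_correct_by_type`, the sentence-type labels, and the toy trial list are hypothetical and do not come from the study's materials or data.

```python
from collections import defaultdict


def percent_correct_by_type(trials):
    """Return the percentage of correctly repeated sentences per sentence type.

    `trials` is an iterable of (sentence_type, repeated_correctly) pairs for
    one participant; both the labels and this layout are hypothetical.
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for sentence_type, repeated_correctly in trials:
        totals[sentence_type] += 1
        if repeated_correctly:
            correct[sentence_type] += 1
    return {t: 100.0 * correct[t] / totals[t] for t in totals}


# Toy input, invented for illustration only (not study data).
toy_trials = [
    ("object_relative", True),
    ("object_relative", False),
    ("simple", True),
    ("simple", True),
]
scores = percent_correct_by_type(toy_trials)
```

Keeping one score per sentence type, rather than a single total, is what makes it possible to ask later whether a participant fails selectively on some structures or across the board.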

# General Procedure

The three tasks reported here were administered to the ASD participants as part of a larger study of language in children with autism. In order to familiarize the participants with the experimenter, she met all the participants in their classrooms 1 day prior to the testing sessions for a fun activity. Each child was tested individually in a quiet and familiar room. All children were told they were helping the researcher with a science project and were shown the tape recorder that was recording the session. They were told that they could stop whenever they wanted to go back to class or if they got tired. On completion of each task the children received a sticker and on completion of each session they received a small snack. The number of sessions for each child varied from 2 to 6: a smaller number of sessions meant longer sessions (an hour on average), whereas a larger number of sessions meant shorter sessions (20 min on average). All sessions were recorded and transcribed. All sessions were held during the morning hours to prevent results being affected by fatigue. Tasks were presented in mixed order across participants. This research was approved by the ethics committee of Tel Aviv University, as well as by the Chief Scientist of the Ministry of Education. The parents of each of the participants signed a consent form informing them of the research aim and nature of the tasks.

# Analyses

For each of the tasks, we analyzed the rate of correctly produced/understood/read/repeated target sentences of each type, and compared the performance in each sentence type to that of children with syntactic DLI and to TD children, using two preplanned comparisons. Because we could not assume normal distribution for the ASD and DLI groups, we compared the groups using the non-parametric Mann-Whitney test (which we report with the statistic U, to which we add in parentheses the total N in the two compared groups). A comparison between two conditions within the ASD group was done using the Wilcoxon signed-rank test (which we report with the statistic T).
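For readers unfamiliar with these two rank-based statistics, the following Python sketch shows how U and T can be computed from raw scores. It is a minimal illustration, not the authors' analysis code: the function names are hypothetical, p-values are omitted (they would come from exact tables or a statistics package), and two conventions are assumptions, since the text does not state them: which of the two U values is reported, and that T is the smaller of the two signed rank sums.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b) for two independent groups; U_a + U_b = n_a * n_b."""
    n_a, n_b = len(group_a), len(group_b)
    ranks = average_ranks(list(group_a) + list(group_b))
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2.0
    u_b = n_a * n_b - u_a
    return u_a, u_b


def wilcoxon_t(condition_1, condition_2):
    """Return T for paired scores: the smaller of the positive and negative
    rank sums (assumed convention); zero differences are dropped."""
    diffs = [x - y for x, y in zip(condition_1, condition_2) if x != y]
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return min(w_plus, w_minus)
```

Both statistics depend only on the ordering of the scores, which is why they remain usable when normality cannot be assumed for the clinical groups.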

At the individual level, the performance of each participant with ASD was compared with the TD control group using Crawford and Howell's (1998) t-test. We used an alpha level of p < 0.05. This analysis allowed us to examine how many ASD participants performed below the norm for their age in the various tests, and also allowed us to examine the difference between different structures: individuals with SyDLI show difficulties in specific sentence structures, and succeed in other structures. We thus examined, for each individual with ASD, using Crawford and Howell's (1998) t-test, whether their performance was below the control group in all structures or in specific structures<sup>5</sup>.
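The single-case comparison can be sketched as follows. Crawford and Howell's statistic treats the control sample as a sample rather than a population: t = (x − M) / (SD · √((n + 1) / n)), with n − 1 degrees of freedom, where x is the individual's score and M, SD, and n describe the control group. The Python below is an illustrative sketch only: the function names are hypothetical, the one-tailed (lower-tail) p-value is an assumption because the text does not state the tail, and the p-value is obtained by simple numerical integration of the t density rather than by a statistics library.

```python
import math
import statistics


def t_lower_tail(t, df, steps=2000):
    """P(T <= t) for Student's t with df degrees of freedom (Simpson's rule)."""
    log_norm = (math.lgamma((df + 1) / 2.0) - math.lgamma(df / 2.0)
                - 0.5 * math.log(df * math.pi))

    def density(x):
        return math.exp(log_norm - (df + 1) / 2.0 * math.log1p(x * x / df))

    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = density(0.0) + density(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(i * h)
    area = total * h / 3.0  # integral of the density from 0 to |t|
    return 0.5 - area if t < 0 else 0.5 + area


def crawford_howell(case_score, control_scores):
    """Return (t, df, one-tailed p) for one case against a control sample.

    Raises ZeroDivisionError if all control scores are identical (SD = 0).
    """
    n = len(control_scores)
    mean = statistics.mean(control_scores)
    sd = statistics.stdev(control_scores)  # sample SD, n - 1 denominator
    t = (case_score - mean) / (sd * math.sqrt((n + 1) / n))
    df = n - 1
    return t, df, t_lower_tail(t, df)
```

Under these assumptions, a participant would be classified as performing below the control group on a given structure when the returned p falls below 0.05.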

For each task, we then analyzed the error pattern of each group and compared the distribution of error types in the ASD group to that of the children with SyDLI.

# RESULTS

Below, we report for each task the percentage correct in the ASD group in comparison to that of children with SyDLI and to the TD control group. We then proceed to analyze the types of errors that the participants with ASD made, in comparison to the errors produced by the participants with SyDLI. The results were consistent across the three tasks: Whereas the total percentage correct was roughly the same for the participants with ASD and those with SyDLI, error analysis yielded different error types and different error patterns in the two groups and hence indicated that their deficits were actually different in nature.

# Experiment 1: Production of Subject and Object Relative Clauses

## ASD and SyDLI Show Similar Percentage Correct Production of Subject and Object Relatives

As a group, the participants with ASD performed more poorly than the controls on both subject and object relatives [U(33) = 49.5, p = 0.0006; U(33) = 46, p = 0.0009, respectively], and similarly to the children with SyDLI [U(34) = 123.5, p = 0.34; U(34) = 173, p = 0.53, for subject and object relatives, respectively].

The percentages of correctly produced subject and object relatives in the three groups are summarized in **Figure 2**. The participants in the control group produced both subject and object relatives effortlessly and correctly (subject relatives 99% and object relatives 95% correct), in line with many previous reports indicating that by the age of 7 years Hebrew-speaking children already master the production of subject and object relative clauses (Friedmann and Novogrodsky, 2004; Friedmann and Szterman, 2006; Novogrodsky and Friedmann, 2006; Friedmann and Costa, 2010; Fattal et al., 2013; Friedmann et al., 2015).

<sup>5</sup>The structure of our argument was the following: the total performance of the ASD group in the syntactic tasks is not different from that of the SyDLI group, but they do differ in error types and in the structures in which they perform poorly. Furthermore, we show that they do not show any differences between syntactic structures, unlike the SyDLI group. Therefore, our main argument is based on lack of difference (ASD vs. DLI; different syntactic structures within the same task), so in order to work against our claim, we did not use correction for multiple comparisons (which would render significant differences non-significant).

## ASD and SyDLI Show Different Patterns of Performance With Respect to the Subject-Object Asymmetry

Once we looked at the pattern of subject vs. object relatives, for each participant, the large difference between the ASD and SyDLI groups started to unfold: The SyDLI group showed more consistent performance within the group and more consistent advantage for subject relatives over object relatives: except for two children with SyDLI who performed at ceiling on both relative clause types, all SyDLI participants produced more target subject than object relative clauses (we included in the target relative clauses also relatives with resumptive pronouns at the gap position, and excluded avoidance of crossing movement).

In the ASD group, the pattern was markedly different. Although as a group their production of object relatives (M = 60%) was poorer than their production of subject relatives (M = 77%), it does not seem justified to analyze their performance at the group level, as the variance within the ASD group was very large (reflected in the large standard deviations: 33% for the object relatives and 25% for the subject relatives, SDs that were 10 times larger than in the control group).

When we compare the production of object relatives to the production of subject relatives at the individual level in the ASD group, 4 of the ASD participants produced both types of relative clauses like the controls. The other 14 participants performed significantly below the control group (who were younger in age than most ASD participants) on subject relatives, on object relatives, or on both. Of the ASD participants who performed significantly below the control group, 9 participants showed impairment on both subject relatives and object relatives, and 3 ASD participants produced object relatives normally but showed impaired production of subject relatives. Only 2 ASD participants showed a pattern that is similar to that of the children with SyDLI, of impaired production of object relatives, and normal production of subject relatives.

### ASD and SyDLI Show Different Error Types

The ASD and SyDLI groups also markedly differed with respect to the types of errors they committed in this task.

### **Errors in target subject relatives**

The types of non-target responses that the ASD and SyDLI (as well as the control) groups produced for the target subject relatives are presented in **Table 1**. The error pattern of the ASD group crucially differed from that of the SyDLI group in that many of their errors were pragmatic (13% of all their responses, and 57% of their erroneous responses), whereas the SyDLI participants produced only syntactic errors, and no pragmatic errors. An error was considered pragmatic when it was unrelated to the target sentence, the question asked, or the picture presented (see the examples in 13). In the coding of responses, each response was coded separately for grammaticality (syntactically correct or including a syntactic error, or a different syntactic structure than the required one) and for pragmatic felicity (felicitous or with a pragmatic error). Thus, some of the responses were syntactically correct but pragmatically infelicitous, some were pragmatically felicitous but syntactically incorrect, and some non-target responses were both syntactically incorrect and pragmatically infelicitous (and some included more than one type of syntactic error).

TABLE 1 | Distribution of responses when a subject relative was expected in the picture description task (% of responses).

The infelicitous responses sometimes described correctly some aspect of the picture, but, importantly, they were infelicitous with respect to the task and the question the experimenter posed. The TD and SyDLI participants had no trouble identifying the experimenter's intent, even if they had a problem phrasing their response correctly as a relative clause. The participants with ASD often failed to understand exactly what was expected from them in the task (i.e., select a response that relates to the two options suggested to them in the lead-in sentence and the action described in it), and the result was these responses, which were correct picture descriptions, yet infelicitous for the task.
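The two-dimensional coding of responses described above, and the two ways of reporting pragmatic errors (as a share of all responses and as a share of erroneous responses), can be illustrated with a short Python sketch. The class name, field names, function name, and toy responses are all hypothetical; they are not the study's coding sheet or data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CodedResponse:
    syntactically_target: bool      # grammatical and of the required structure
    pragmatically_felicitous: bool  # relates to the question and the picture


def pragmatic_error_shares(responses):
    """Return pragmatic errors as (% of all responses, % of erroneous responses)."""
    erroneous = [r for r in responses
                 if not (r.syntactically_target and r.pragmatically_felicitous)]
    pragmatic = [r for r in erroneous if not r.pragmatically_felicitous]
    pct_of_all = 100.0 * len(pragmatic) / len(responses)
    pct_of_errors = 100.0 * len(pragmatic) / len(erroneous) if erroneous else 0.0
    return pct_of_all, pct_of_errors


# Toy coding sheet, invented for illustration only (not study data):
# 6 target responses, 2 syntactic-only errors, 1 pragmatic-only error,
# and 1 response with both a syntactic and a pragmatic error.
toy_responses = (
    [CodedResponse(True, True)] * 6
    + [CodedResponse(False, True)] * 2
    + [CodedResponse(True, False)]
    + [CodedResponse(False, False)]
)
shares = pragmatic_error_shares(toy_responses)
```

Because the two dimensions are coded independently, a single response can count toward both the syntactic and the pragmatic error tallies.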

### **(13) Examples for pragmatically infelicitous responses in the ASD group.**

a. Subject relative: There are two women in this picture. In one picture, the woman is drawing the girl, and in one picture the girl is drawing the woman. Which woman is this?

**Target response**: This is the woman who is drawing the girl.

**Pragmatic error**: This is the woman with the slippers.

b. Subject relative: There are two nurses in this picture. In one picture, the nurse is photographing the girl, and in one picture the girl is photographing the nurse. Which nurse is this?

**Target response**: This is the nurse who is photographing the girl.

**Pragmatic error**: This is the nurse who doesn't photograph another nurse at all.

c. Object relative: There are two cats in this picture. In one picture, the cat is biting the dog, and in one picture the dog is biting the cat. Which cat is this?

**Target response**: This is the cat that the dog is biting.

**Pragmatic error**: This is the cat that doesn't bite cats.

### **Errors in target object relatives**

In the target object relative condition, too, the ASD participants produced different error types from the SyDLI participants: 48% of the erroneous responses of the ASD group included pragmatic errors (22% of all their responses), whereas the SyDLI participants (and the TD participants) produced none. The rest of the errors in both groups were syntactic errors: producing subject relatives instead of object relatives, and reducing the number of full DPs in the sentence, as shown in **Table 2** (see Examples 14 and 15).

**(14) Example for a subject relative instead of a target object relative in which one NP is reduced by the use of a reflexive verb:**

There are two girls in this picture. In one picture, the girl is drying the woman, and in one picture the woman is drying the girl. Which girl is this?

**Target response**: This is the girl that the woman is drying.

**Thematic role reduction and role reversal:**

zo ha-isha she-mitnagevet

This-is the-woman that-dries-reflexive.

**(15) Example for an ungrammatical object relative in a target object relative item:**

There are two boys in this picture. In one picture, the boy is hugging the monkey, and in one picture the monkey is hugging the boy. Which boy is this?

**Target response**: This is the boy that the monkey is hugging.

**Ungrammatical response with doubling of the head (filled gap):**

ze ha-yeled she-ha-kof mexabek et ha-yeled

This-is the-boy that-the-monkey hugs ACC the-boy.

Thus, the ASD group differed from the SyDLI group in the types of errors they produced and in the distribution of their errors: The ASD group produced pragmatic errors, whereas the SyDLI group produced only syntactic errors; moreover, the ASD group failed on both subject relatives and object relatives, unlike the SyDLI group, who failed almost exclusively on object relatives.

# Experiment 2: Object Relative Reading and Paraphrasing

The reading and paraphrasing task had several aims: it tested the way individuals with ASD understand written relative clauses, it tested the way they phrase explanations of such sentences (as well as simple control sentences), and it assessed the reading of heterophonic homographs in these sentences, whose correct reading crucially depends on the correct syntactic analysis of the sentences.

## Different Pattern of Performance in Homograph Reading

The children in the SyDLI group made very few errors in reading the homographs, and half of them did not differ from the controls in homograph reading. However, when they did make a mistake in reading the homograph, it was always in the relative clause condition: they did not make homograph reading errors in the simple sentences. Namely, the SyDLI participants' misreading of the homographs was closely related to their failure to understand object relatives<sup>6</sup>.

The children in the ASD group showed a dramatically different pattern: of the 15 children with ASD who made reading errors on the homographs, 11 made errors on both the relative clauses and the simple sentences. Namely, they did show difficulty in reading the homographs but, unlike the children with SyDLI, this difficulty was not related to the syntactic structure of the sentence in which the homograph appeared. In the relative clause condition, the ASD group read the homographs significantly more poorly than the control group [U(79) = 922.5, p < 0.0001], and similarly to the SyDLI group [U(33) = 86, p = 0.08]. The important difference was seen in the simple sentence condition, where the ASD group read the homographs significantly more poorly than both the SyDLI group [U(33) = 57, p = 0.002] and the TD group [U(79) = 880, p < 0.0001]. **Figure 3** summarizes the homograph reading in the two sentence types in the three groups. The ASD participants' difficulty with reading the homographs, then, seems not to be related to a syntactic deficit, but rather to the lack of use of information from the semantic system to guide the choice of the correct homograph in the phonological output lexicon (see also Brock et al., 2017). It is interesting to note that they did not use the syntactic structure to guide their choice of the correct pronunciation of the homograph.

TABLE 2 | Distribution of responses when an object relative was expected in the picture description task (% of responses).

<sup>6</sup>The SyDLI participants made very few reading errors in words other than the homographs in the relative clauses, at a rate that was not different from that of TD controls; see Friedmann and Novogrodsky, 2007.

### Different Patterns of Performance in Sentence Paraphrasing

In the sentence paraphrasing task too, the ASD group showed impaired performance with a different pattern from that of the SyDLI group, as summarized in **Figure 4**. The main difference was that whereas the SyDLI participants showed significantly better paraphrasing of the simple sentences compared to the relative clauses (p < 0.0001), no such difference was found in the ASD group (Wilcoxon's T = 30, p = 0.17). The ASD participants performed poorly in paraphrasing both the relative clauses and the simple sentences (34 and 44% correct respectively). On the individual level analysis, 14 of the 18 participants performed significantly below the TD group on paraphrasing the object relatives, and 16 participants performed below the TD control participants on paraphrasing the simple sentences.

On paraphrasing the relative clauses, the ASD participants performed below the control participants [U(79) = 1022.5, p < 0.0001], and did not differ from the SyDLI group [U(33) = 96, p = 0.17]. As in the reading analysis, in paraphrasing the simple sentences the ASD group performed significantly more poorly than both the SyDLI group [U(33) = 33, p = 0.0002] and the control group [U(79) = 1022, p < 0.0001].

Thus, we see again that the ASD participants, although failing in the paraphrasing task, crucially differed from the SyDLI participants in their error patterns: they failed on both object relatives and simple sentences, whereas the SyDLI participants only failed on paraphrasing the relative clauses.

### Different Error Types in Sentence Paraphrasing

The analysis of the errors that the ASD and SyDLI participants produced when they tried to paraphrase the sentences shed further light on the differences between these groups. The decisive majority of paraphrasing errors of the children with SyDLI were thematic role errors: failing to understand who the agent and the theme in the sentence are. The children with ASD showed a very different pattern: as summarized in **Table 3**, they almost never made such thematic role errors. In fact, only one participant did so in a single paraphrase. This surprised us, and we checked and re-checked their errors, but indeed they did not. Instead, they provided types of responses we did not see in the paraphrases of the SyDLI and TD groups. They often (24 of their responses) simply provided a word or **several words** from the target sentence, with no specific structure (Examples 16–18). In other cases they provided a response that **did not explain the sentence** or reflected complete failure to understand the sentence (Examples 19, 20). Of these responses, 32 responses were cases in which the ASD participants used a **pronoun** to explain a sentence, even though there was no way for the experimenter to know to which NP this pronoun was referring (Example 21). We found this kind of response especially interesting because the use of pronouns without establishing a reference in discourse is a hallmark of the discourse of individuals with Theory of Mind (ToM) impairment (Balaban et al., 2016). In other cases, some ASD participants chose an avoidance strategy of saying "**I don't know**" even when asked guiding questions to see if they understood the sentence. Even genuine attempts by the experimenter to lead the participant to provide an explanation of the sentence often arrived at a dead end (see Examples 16 and 17).

TABLE 3 | Distribution of the various types of paraphrasing errors (% out of all the correctly read sentences).

In addition, all three groups sometimes **repeated** (re-read) the written sentence instead of explaining it, but this type of non-target response occurred more often in the ASD group. When they repeated a sentence instead of explaining it, the ASD participants sometimes randomly changed the inflection of a verb in the sentence they repeated (sometimes from the present to past or vice versa, and once even changing the number agreement of the verb, so that the verb no longer agreed with the rest of the sentence). They sometimes also repeated only part of the sentence in a way that did not yield a sentence (e.g., explaining the sentence The judge that the man drew speaks on the radio: "That the judge that the man drew"). This did not happen in the two other groups.


cama

The friend that daddy brought combs the braid of the girl

**Paraphrase**: cama! (experimenter: le-mi yesh cama?) l-ayalda. (experimenter: yesh po od mishehu b-a-mishpat? mi?) ken. Yeled. (ma hu ose?) klum.

braid! (Exp: who has a braid?) the girl! (exp: is there anyone else in the sentence? who?) yes, a boy. (exp: what does he do?) nothing.

**(18) Target**: Ha-leican she-ha-yeled raa metaken sulamot b-akirkas

The clown that the boy saw is fixing ladders at the circus

**Paraphrase**: Yeled, Leican, Sulam, Kirkas.

boy, clown, ladder, circus.

**(19) Target**: Ha-leican she-ha-yeled raa metaken sulamot b-akirkas

The clown that the boy saw is fixing ladders at the circus

**Paraphrase**: Kirkas. Atem yod'im ma ze kirkas? Yesh sham harbe cva'im ve-yeladim. ve-yeladot.

Circus. You know what a circus is? there are many colors and boys there. And girls.


# Variability Within the ASD Group and Indications for Ability to Produce Relative Clauses in the Paraphrases

Another insight into the ASD participants' syntax can be gained from analyzing the syntactic structures of the sentences they produced in their paraphrases. When we look at each of the 18 individuals with ASD, we see that 15 of them used voluntarily and correctly at least one relative clause when they attempted to explain the sentences, and 13 of them even produced an object relative (which is notoriously difficult to produce even when directly elicited in the SyDLI group). Indeed, some of these relative clauses were produced within responses that did not explain the target sentence, but the syntactic machinery constructing relative clauses seems to be working. (For example, to explain The guy that the boy liked cut newspapers, one participant said "This is a guy. . . [exp: what did he do?] . . . took the boy who liked carrots", spontaneously producing a subject relative). Three ASD participants did not produce even a single relative clause in their paraphrases.

To conclude, in the sentence paraphrasing task we saw a general pattern that was similar to the one we found in Experiment 1: the ASD participants performed very poorly in this syntactic task, but their pattern differed from that of the SyDLI participants: they made errors on both the relative clauses and the simple sentences, and the types of responses they provided were very different from those provided by the SyDLI participants. The syntactic structures that they used in their explanations actually showed that they are able to use relative clauses (semi) spontaneously.

# Experiment 3: Complex Sentence Repetition Task

### Different Patterns of Performance in Repetition of the Various Sentence Structures

In the sentence repetition task, again, most children with ASD showed a completely different pattern from the one evinced by the children with SyDLI. The participants with SyDLI showed impaired repetition of sentences with Wh movement (and some of them also of sentences with V-to-C movement) alongside much better performance, above 94% correct, in the repetition of embedded sentences, sentences with A-movement, and simple sentences, as shown in **Figure 5**. The children with ASD did not show this selective syntactically-principled difficulty, and repeated correctly less than 81% of the sentences in each of these 5 structures [their repetition was significantly poorer than that of the controls on each of the sentence types: sentences with Wh movement: U(108) = 1,485, p < 0.0001; verb movement: U(108) = 1,400.5, p < 0.0001; A-movement: U(108) = 1,393, p < 0.0001; embedded sentential complements: U(108) = 1,227.5, p < 0.0001; simple control sentences: U(108) = 1,310, p < 0.0001].

FIGURE 5 | Repetition of the various syntactic structures in the ASD, SyDLI, and control groups (average % correct; error bars indicate standard deviations). Wh-O, sentences with Wh movement of the object: topicalization, object relative, object Wh question; V-C, verb movement to second sentential position; A, A-movement of the theme of an unaccusative verb, from object to subject position; embedded, sentential complements of verbs; simple, SV sentences without any of the above movements and without embedding.

As both the ASD and the SyDLI groups performed poorly on the Wh movement and the verb movement structures, the ASD group did not differ from the SyDLI group on Wh movement [U(80) = 730.5, p = 0.05, which would be decisively nonsignificant once a correction for multiple comparisons is applied] or verb movement [U(80) = 650.5, p = 0.30]. However, only the ASD participants failed on the A-movement sentences, the embedded structures, and the simple sentences, so the ASD group performed significantly below the SyDLI group on these structures [U(80) = 885, p < 0.0001; U(80) = 758, p = 0.004; U(80) = 828.5, p = 0.0001, respectively]. In addition, unlike the SyDLI group, the ASD participants showed impaired repetition not only of the object Wh questions, but also of the subject Wh questions (which they repeated only 72% correct), with no significant difference between the two. Namely, one straightforward difference between the ASD and SyDLI performance on the sentence repetition task was that the children with ASD showed impaired repetition of all the tested syntactic structures, including the simple sentences, whereas the SyDLI participants showed a deficit that was specific to Wh movement (and some of them also to verb movement).

### Large Variability Within the ASD Group

As was the case in the two previous tasks, the variance within the ASD group was very large. When we compare, for each individual with ASD, the performance in the various sentence structures to that of the 91 control participants (using Crawford and Howell's, 1998, t-test), we find many different patterns of impairment: 5 of the 18 ASD participants had very low scores on all sentence types; 3 participants had very low scores on all sentence types but the simple sentences; 2 had normal repetition of embedded sentences and sentences with Wh movement but showed very impaired repetition of the other sentence types; one participant had high scores on simple sentences and embedded sentences but very low scores on the rest; one participant had very low scores on all sentences excluding embedded sentences, where he performed at the control group level; one participant had extremely low scores on verb movement but control-level scores on all other sentences. Whereas 14 of the 18 ASD participants had a total score that was significantly below that of the control group, only 4 ASD participants showed the performance that is typical of SyDLI, with poorer-than-normal repetition of sentences with Wh movement and good repetition of the other structures, and one participant showed poorer-than-normal repetition of verb movement and Wh movement, similar to the response pattern of some of the SyDLI participants.

### Different Error Types in the ASD and SyDLI Groups

As in the previous two experiments, in this task, too, the ASD group differed from the SyDLI group also with respect to the types of errors they produced when they failed to repeat a sentence. We classified the errors into structural and lexical. An error was coded as structural when the participant produced a sentence using the nouns and verbs that appeared in the target sentence but changed the grammatical structure of the target sentence [either by changing its syntactic structure to a simpler structure (Example 22), or by reversing the thematic roles in the sentence]. Errors were coded as lexical errors when the participant produced a sentence that was structurally identical to the target sentence, but substituted or omitted words in the sentence (in most cases to a semantically related word) (Example 23).


Whereas the errors of the children with SyDLI (and the few errors that the TD participants made) could be classified into one of these two error types, structural errors and lexical errors, the ASD group made other types of errors that were not witnessed in the DLI and TD groups. Even the youngest children in the TD group, who were 5 years old, did not make such errors [these errors were indeed unique for the ASD group, showing a significant difference between the three groups, Friedman's test χ²(2) = 8927.2, p < 0.0001]. The children with ASD made perseveration errors (Albert and Sandson, 1986; Cohen and Dehaene, 1998), in which the participants' previous response persisted and interfered with their new response (where one or more words from a previous response appeared in the repetition of another sentence, see Example 24).

An additional error that was unique to the ASD group was information addition: some ASD participants were able to repeat the target sentence accurately but then would add information that had not appeared in the target sentence (Example 25). In other cases, they changed the target sentence completely, producing their interpretation of it, even though they were told numerous times to only repeat what they heard (Example 26), or providing their free-associations related to the sentence they had heard.

An additional error type that was found only in the ASD group was answer-instead-of-repetition. When requested to repeat a question, some ASD participants simply answered it, instead of repeating it (Example 27).


Yesterday the girl kissed the teacher **and the monster**


**Table 4** summarizes the error types in the three groups (some responses included more than one kind of error, in which cases the response was coded for each type of error it included).
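The multi-label coding described above (a response is counted once for each error type it contains, and percentages are taken over all responses) can be sketched as follows. This is only an illustration of the tallying logic under those assumptions; the function name, the labels, and the coded responses are hypothetical and do not reproduce the study's data.

```python
from collections import Counter


def error_type_percentages(responses):
    """Percentage of all responses that contain each error type.

    `responses` is a list of sets of error labels; an empty set is a
    correct repetition. A response with several error types contributes
    to each of them, so the percentages may sum to more than 100.
    """
    counts = Counter()
    for labels in responses:
        counts.update(set(labels))  # count each type once per response
    total = len(responses)
    return {label: 100 * n / total for label, n in counts.items()}


# Hypothetical coded responses, for illustration only.
responses = [
    set(),                            # correct repetition
    {"lexical"},
    {"structural", "perseveration"},  # coded for both error types
    {"perseveration"},
]
percentages = error_type_percentages(responses)
print(percentages)
```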

# Individual Syntactic Abilities Across the Three Tasks

Another step in our quest for an answer to the question "Do ASD individuals have (syntactic) DLI?" was to look at each participant's syntactic ability individually. We saw in the three tasks that the participants make many errors that originate in a pragmatic, rather than a syntactic, difficulty. Therefore, we excluded these errors, and cast the question as two separate ones: (a) Do any of the ASD individuals have a syntactic deficit that resembles that of individuals with SyDLI (in addition to a pragmatic difficulty)? and (b) Are there individuals with ASD who have completely normal syntax (in addition to a pragmatic difficulty)?

The analysis of the three tasks on the individual level yielded the following answers to these two questions:

(a) Only one participant showed a syntactic deficit that consistently resembled that of individuals with SyDLI (once we disregarded his responses that were pragmatically infelicitous), with syntactic errors in producing object relatives in the relative clause elicitation task alongside good production of subject relatives that was similar to that of the control participants; thematic role misunderstanding in the relative clause paraphrasing task; and impaired repetition of Wh movement-derived sentences (object Wh questions, relative clauses, and topicalized sentences). Another ASD participant showed impaired production of object relatives with many syntactic errors and avoidance of Wh movement, and very poor repetition of sentences with Wh movement, but his reading of the homographs in the sentences was so poor that it did not allow us to determine whether he understood the syntactic structure and thematic role grid of the sentences he read or not.

TABLE 4 | Distribution of the various error types in the sentence repetition task (% of all repetition responses).


(b) Seven of the ASD participants showed normal syntactic abilities. They produced both subject and object relatives (even though many of their responses were pragmatically odd); understood the object relatives they read well, at the level of the control participants (although their explanations did not always take the hearer's state of knowledge or the experimenter's requests into account); and did not make syntactic errors in repeating sentences derived by Wh movement (even if they provided their own interpretation of the sentences from time to time or produced perseverations of lexical items from previous sentences)<sup>7</sup>.

This analysis shows the great variability with respect to syntax in the ASD group: seven showed normal syntax in sentences derived by syntactic movement; many participants showed poor performance across various syntactic structures including simple ones, a pattern that differs from that of syntactic DLI; and only one ASD participant showed a pattern that resembles the specific pattern that characterizes syntactic DLI.

# DISCUSSION

A question that arises often with respect to the language abilities of children with ASD is whether they have DLI (e.g., Bishop, 2001, 2003, 2006; Kjelgaard and Tager-Flusberg, 2001; Tager-Flusberg, 2006). This study has a very clear answer for this question: No. We used three tasks of comprehension and production of syntactically complex sentences and compared the performance of children with ASD to children with syntactic DLI. The results of the three tasks were the same: whereas, prima facie, the overall correct performance on the syntactic tasks is similar for the children with ASD and those with SyDLI, once we look closer, they are very different. The types of errors that the ASD group makes differ from those of the DLI group. They also differ in the pattern of performance: the children with SyDLI show impaired performance in specific sentence types, and better performance on other types, in a syntactically principled way; the ASD participants show generalized low performance on the various sentence tasks. Additionally, the huge variability within the ASD group was manifested in the analysis at the individual level: some children with ASD had completely normal syntactic performance, some showed difficulties in the syntactic tasks but these looked very different from those of children with SyDLI, and only one participant showed the syntactic pattern shown in SyDLI.

This general pattern can be seen in each of the three tasks in this study: Experiment 1 examined the production of subject and object relative clauses, a domain that has been repeatedly reported as being very difficult for children with syntactic DLI (Schuele and Nicholls, 2000; Novogrodsky and Friedmann, 2002, 2006; van der Lely and Battell, 2003; Friedmann and Novogrodsky, 2004, 2007, 2008, 2011; Delage et al., 2007; Friedmann et al., 2015). When we only look at percentage correct performance in this task, both groups showed impaired performance. However, the two groups differed substantially in their error types and in the pattern of performance in the different sentence structures. The children with SyDLI showed a clear asymmetry between their production of subject and object relatives, with subject relatives being consistently better produced than object relatives. This was not the case in the ASD group, where only 2 of the 18 children with ASD showed this pattern (the others showed impairment in both subject relatives and object relatives, or even a reversed pattern with normal object relatives and impaired production of subject relatives). The clear difference between the ASD and SyDLI groups was also seen in the errors the two groups produced: the ASD participants, and only them, produced pragmatic errors, which did not occur in the SyDLI group: they failed to respond to the experimenter's question, or provided a response that was unrelated to the picture presented. Similar infelicitous responses were also reported by Modyanova et al. (2017).

<sup>7</sup>Two of the participants we considered to have normal syntax made word order reversal errors in the repetition task. We nevertheless concluded they had intact syntax because they produced many correct relative clauses in all of their paraphrases, even in paraphrases of simple sentences. (One of them even spontaneously produced verb movement to second position in his paraphrases). We suspect that their failure to repeat the sentences accurately may have resulted from the way they understood the task and from their pragmatic deficit rather than from a syntactic difficulty.

Experiment 2, in which we examined reading and paraphrasing of object relatives in comparison to simple sentences, provided the same insight. Indeed, the ASD participants failed on this syntactic task, but their pattern of performance across the sentence types as well as the types of errors indicated that their deficit is very different in nature from the one seen in the participants with SyDLI. The SyDLI group showed a clear difference between the object relatives and the simple sentences in both reading of the homograph and in paraphrasing the sentences. In reading the homographs, the SyDLI participants had very few errors, but when they did, it happened when the homograph was embedded in a relative clause. The ASD group, in contrast, made homograph reading errors on both sentence types. The same pattern was also seen in their paraphrasing of the written sentences: The SyDLI participants found it very difficult to understand and paraphrase the object relatives, but paraphrased the simple sentences very well. The performance of the ASD participants was markedly different: their paraphrasing of both the object relatives and the simple sentences was very poor. Like in Experiment 1, the two groups also differed in their error patterns: whereas the SyDLI participants made thematic role errors in their paraphrases, whereby they confused the agent and the theme of the verbs in the sentence, the ASD participants provided a myriad of unexpected and infelicitous responses that either included a random list of words from the sentence, or a repetition of the sentence with a random change in one of the words, or provided an explanation that had very little to do with the original sentence. These responses were unique to the ASD group and were not seen in either the SyDLI or the TD groups.

In Experiment 3, which examined repetition of various syntactic structures, the performance of the ASD group was poor but, again, very different from that of the SyDLI group. Whereas the SyDLI group struggled with structures with certain types of syntactic movement (Wh movement and verb movement), they showed good performance on all other structures—they repeated well simple sentences, as well as sentences with sentential complements to verbs, and sentences with A-movement in which the object of an unaccusative verb moves to subject position. The ASD group showed a completely different pattern, whereby they failed to repeat all kinds of structures in the test. Like in the previous tasks, here, too, the variability within the ASD group was very large, in line with many studies of ASD (Boucher, 2003; Eigsti and Bennetto, 2009; Kwok et al., 2015; Schaeffer, 2016; Modyanova et al., 2017). In this task, too, the errors the ASD participants produced differed from the errors the SyDLI group produced. The SyDLI group made lexical and structural errors only, but the ASD participants, who produced some structural and lexical errors, also produced many errors we did not see in the other groups, mainly errors of perseveration from a previously repeated sentence. They also produced some responses that indicated they understood the task differently: they often added commentaries to the sentence they repeated, interpreted it, or answered it instead of repeating it.

Thus, our study, in line with recent previous research, raises three main types of reservations against suggesting that the syntactic deficit in ASD is the same one we see in SyDLI. The first relates to the **wide variability** in language skills within the ASD group (Tager-Flusberg, 2006): only some, but definitely not all children with ASD show poor performance in syntactic tasks. This led some researchers who compared ASD to SyDLI to divide their ASD participants into subgroups with and without language impairment (e.g., Roberts et al., 2004; Whitehouse et al., 2008; Modyanova et al., 2017). The two other reservations relate to the nature of the data showing similarities between the two groups: studies arguing for similarities between ASD and SyDLI typically used standardized task scores and did not use error analysis or detailed analysis of the exact types of syntactic structures that are impaired in the two groups. Once the **types of errors** are analyzed, a clear difference emerges between the groups: they commit errors of different kinds, indicating different underlying deficits in the ASD and SyDLI groups. This conclusion regarding the importance of error analysis is in line with studies by Demouy et al. (2011), Riches et al. (2010), Modyanova et al. (2017), and Roberts et al. (2004), who tested syntax in ASD in comparison to DLI using various tasks and reported that even when the ASD and the DLI groups achieved similar scores, they showed different error patterns. The DLI group made mainly syntactic errors, but these errors did not characterize the ASD group, who made many pragmatic errors. Finally, the analysis of **patterns of impairment and sparing**, i.e., the syntactic structures on which the participants fail and those on which they perform normally, yields crucial differences between ASD and SyDLI individuals. Like in our current study, Durrleman et al. 
(2016), who used a careful design of sentence structures with various syntactic properties, concluded that, unlike children with SyDLI, children with ASD often show an across-the-board deficit in various sentence structures, including simple sentences. Gavarró and Heshmati (2014) made a similar observation in their study of passive sentence comprehension in ASD: the ASD participants made errors not only on the passive sentences but also on active sentences.

Several conclusions can be drawn from the direct comparison between individuals with SyDLI and individuals with ASD using the same syntactic tasks. First, the fact that a person fails in a syntactic task, as indicated by a low percentage correct in the test, does not mean that this person has a syntactic deficit. Failure in syntactic tasks can arise from failure to understand the task, failure to understand the situation described in sentences, failure to establish a felicitous discourse, among many other reasons. Therefore, a general task score is not enough to establish a syntactic deficit, and an in-depth analysis of response types, error types, and a comparison between performance in different structures (e.g., those that involve a certain syntactic complexity and those that do not) is essential.

Secondly, individuals with ASD show great variability that does not allow for a generalization about the syntactic ability of the whole group. Some individuals with ASD also have syntactic difficulties, but some do not. In the current study, only one of the ASD participants showed a syntactic performance that resembled that of children with SyDLI, and seven ASD participants showed intact comprehension, production, and repetition of sentences derived by syntactic movement, once their pragmatically deviant responses were removed. Thus, we can conclude that poor performance in syntactic tasks still does not indicate a syntactic impairment, and that ASD is not DLI.


# AUTHOR CONTRIBUTIONS

NS and NF jointly formulated the research questions and the method of exploring them. NF developed the syntactic tests. NS tested all the ASD children. NF, together with Rama Novogrodsky and Iris Fattal, who are cited in the paper, tested the SyDLI children. NS and NF together analyzed the data, interpreted the data and wrote the paper.

# ACKNOWLEDGMENTS

The research was supported by the Israel Science Foundation (grant no. 1066/14, Friedmann), the German Israeli Foundation (Friedmann and Ruigendijk), grant 1113/2010, and by the Lieselotte Adler Laboratory for Research on Child Development. We wholeheartedly thank Rama Novogrodsky, Iris Fattal, Ronit Szterman, and Maya Yachini for generously allowing us to use the raw data of SLI and control groups that they collected in previous research in collaboration with Naama Friedmann. We are grateful to Laurie Tuller for fruitful discussions.

# SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpsyg.2018.00279/full#supplementary-material



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The handling Editor declared a past co-authorship with NF.

Copyright © 2018 Sukenik and Friedmann. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# C-Command in the Grammars of Children with High Functioning Autism

Neha Khetrapal<sup>1,2</sup>\* and Rosalind Thornton<sup>1,2</sup>

<sup>1</sup> Department of Linguistics, Language Acquisition Research Group, Macquarie University, Sydney, NSW, Australia, <sup>2</sup> ARC Centre of Excellence in Cognition and its Disorders, Macquarie University, Sydney, NSW, Australia

A recent study questioned the adherence of children with Autism Spectrum Disorders (ASD) to a linguistic constraint on the use of reflexive pronouns (Principle A) in sentences like Bart's dad is touching himself. This led researchers to question whether children with ASD are able to compute the hierarchical structural relationship of c-command, and raised the possibility that the children rely on a linear strategy for reference assignment. The current study investigates the status of c-command in children with ASD by testing their interpretation of sentences like (1) and (2) that tease apart use of c-command and a linear strategy for reference assignment.

### Edited by:

Stephanie Durrleman, University of Geneva, Switzerland

### Reviewed by:

Brian Dillon, University of Massachusetts Amherst, USA

Maria Garraffa, Heriot-Watt University, UK

Cornelia Hamann, University of Oldenburg, Germany

> \*Correspondence: Neha Khetrapal nehakhetrapal@gmail.com

### Specialty section:

This article was submitted to Language Sciences, a section of the journal Frontiers in Psychology

Received: 05 October 2016 Accepted: 02 March 2017 Published: 28 March 2017

### Citation:

Khetrapal N and Thornton R (2017) C-Command in the Grammars of Children with High Functioning Autism. Front. Psychol. 8:402. doi: 10.3389/fpsyg.2017.00402

(1) The girl who stayed up late will not get a dime or a jewel (C-command)

(2) The girl who didn't go to sleep will get a dime or a jewel (Non C-command)

These examples both contain negation (not or didn't) and disjunction (or). In (1), negation c-commands the disjunction phrase, yielding a conjunctive entailment. This gives rise to the meaning that the girl who stayed up late won't get a dime and she won't get a jewel. In (2), negation is positioned inside a relative clause and it does not c-command disjunction. Therefore, no conjunctive entailment follows. Thus, (2) is true if the girl just gets a dime or just a jewel, or possibly both. If children with ASD lack c-command, then (1) will not give rise to a conjunctive entailment. In this case, children might rely on a linear strategy for reference assignment. Since negation precedes disjunction in both (1) and (2), they might be interpreted in a similar manner. Likewise, children who show knowledge of c-command should perform well on sentences governed by Principle A. These hypotheses were tested in experiments with 12 Australian children with HFA, aged 5;4 to 12;7, and 12 typically-developing controls, matched on non-verbal IQ. There was no significant difference in the pattern of responses by children with HFA and the control children on either (1) and (2) or the Principle A sentences. The findings provide preliminary support for the proposal that knowledge of c-command and Principle A is intact in HFA children.

Keywords: grammatical development, reflexives, negation, disjunction, c-command

# INTRODUCTION

Individuals with ASD are known to have difficulties with language and communication. They present with little functional communication at one end of the spectrum to relatively well-developed language skills at the other (American Psychiatric Association, 2000). Nevertheless, no matter how proficient their language skills, all individuals diagnosed with autism share impairments in everyday use of language. Difficulties with pragmatics and prosody are understood as defining universal features of the disorder (e.g., Paul et al., 2005) but the status of grammatical development is less clear<sup>1</sup>. Some researchers have argued that grammatical knowledge is simply delayed in nature<sup>2</sup> (Tager-Flusberg, 1981; Lord and Paul, 1997) while others argue that there are aspects of grammatical knowledge that are deficient (Pierce and Bartolucci, 1977; Bartolucci et al., 1980; Perovic et al., 2013a,b).

There have been few studies on complex syntactic structure in children with autism and there is not yet consensus on whether or not aspects of syntax are impaired. The issue is complicated by the range of abilities associated with ASD. Those who are classified as high-functioning (HFA) or score at least 70 on tests of non-verbal IQ<sup>3</sup> (e.g., Howlin, 2003) tend to show sophisticated grammatical knowledge. One area of weakness that has been noted for children with low non-verbal IQ scores is morphosyntax. In particular, difficulties were observed for children's production of grammatical morphemes that mark "tense" (Roberts et al., 2004). The finding is that children with ASD tend to perform worse than children diagnosed with Specific Language Impairment (SLI). More recently, there have been investigations into the comprehension of complex syntactic structures such as wh-questions (Zebib et al., 2013), relative clauses (Riches et al., 2010; Durrleman and Zufferey, 2013; Durrleman et al., 2015), and raising and passives (Perovic et al., 2007). Durrleman et al. (2016) assessed the comprehension of both relative clauses and wh-questions in French and showed that children with ASD had lower performance even on simple structures as compared with their typically-developing (TD) peers who were matched on non-verbal abilities<sup>4</sup>. Riches et al. (2010) showed that English-speaking teenagers diagnosed with autism and concomitant language deficits made significantly more errors than their age matched TD counterparts on subject and object relative clauses when tested on a sentence repetition task. A similar difficulty was also reported for the comprehension of relative clauses in French-speaking adults diagnosed with HFA (Durrleman and Zufferey, 2013; Durrleman et al., 2015). In an elicitation task for wh-questions, it was reported that French-speaking children diagnosed with autism avoided fronting in their wh-questions (Zebib et al., 2013).
Importantly, these studies all involved movement or structures that encompass relations where the position in which a phrase is interpreted differs from the position in which the phrase is pronounced, a claim that parallels claims made for SLI (e.g., van der Lely and Pinker, 2014). Two recent studies by Perovic et al. (2013a,b) investigated the syntactic relation of binding.<sup>5</sup> Binding does not involve movement but involves a dependency between two noun phrases. These studies are the impetus for our investigation of c-command in children with autism, so we introduce them in detail. Since the hierarchical relationship of c-command forms the basis for our experiments, we begin by introducing this abstract notion.

C-command is a relationship between nodes in the phrase structure representation of a sentence. A node A is said to c-command a node B, if and only if the node that immediately dominates A also dominates B (see Koeneman and Zeijlstra, 2017). This is illustrated in **Figure 1A**. In this figure, the node that immediately dominates A is XP, because it is one step above A in the phrase structure. The node XP dominates B simply because it is higher than B in the tree, and it is possible to trace a path down the tree from XP to B. Therefore, A c-commands B. In **Figure 1B**, however, A does not c-command B. For A to c-command B, the node immediately dominating A would also have to dominate B. But the node immediately dominating A is ZP, and this node does not dominate B. This is because it is not possible to trace a path from ZP directly down the tree to reach B.
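The definition above can be made concrete with a small sketch. The `Node` class and the two toy trees below are assumptions introduced for illustration: the trees are only one plausible shape consistent with the description of Figures 1A and 1B (the figures themselves are not reproduced here), and the intermediate label YP is hypothetical.

```python
class Node:
    """A phrase-structure node that records its parent and children."""

    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)
        self.parent = None
        for child in self.children:
            child.parent = self


def dominates(ancestor, node):
    """True if a path can be traced down the tree from `ancestor` to `node`."""
    current = node.parent
    while current is not None:
        if current is ancestor:
            return True
        current = current.parent
    return False


def c_commands(a, b):
    """A c-commands B iff the node immediately dominating A also dominates B."""
    return a.parent is not None and dominates(a.parent, b)


# Hypothetical tree in the spirit of Figure 1A: XP immediately dominates A
# and also dominates B (via an assumed intermediate node YP).
a1, b1 = Node("A"), Node("B")
xp_1a = Node("XP", [a1, Node("YP", [b1])])

# Hypothetical tree in the spirit of Figure 1B: ZP immediately dominates A
# but does not dominate B.
a2, b2 = Node("A"), Node("B")
xp_1b = Node("XP", [Node("ZP", [a2]), Node("YP", [b2])])

fig_1a = c_commands(a1, b1)
fig_1b = c_commands(a2, b2)
print(fig_1a, fig_1b)
```

The function follows the simplified definition given in the text; fuller textbook formulations usually add that neither node may dominate the other, which this sketch does not check.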

The Perovic et al. (2013a) study was based on an experiment conducted by Wexler and Chien (1985) that was designed to test children aged 2;6 to 6;6 years of age. The task was a two-picture Truth Value Judgement Task. In the original task, children were tested on sentences like Cinderella's sister points to herself/her, in which the subject noun phrase, Cinderella's sister, is a possessive noun phrase. This complex noun phrase provided the potential antecedents for herself/her. The child's task was to point to the picture that matched the sentence spoken by the experimenter. In one picture, Cinderella's sister was pointing to herself, and in the other, Cinderella was pointing to herself. The finding was that by 5 years of age, children were able to choose the correct referent for the reflexive 90% of the time.

The experimental stimuli used by Perovic et al. (2013a) used the same possessive noun phrase subjects but their stimuli featured the Simpson family. The stimuli included four different kinds of sentences, shown in (1) to (4) below.

<sup>1</sup>Grammar here refers to the structural aspects of language, or syntax.

<sup>2</sup>Mixed with this line of argument is the claim that grammar is relatively but not entirely spared in autism. In other words, the grammatical skills are better than the pragmatic functioning observed for the disorder. For example, Eigsti et al. (2007) observed that children with ASD showed grammatical impairments as compared to groups matched for both non-verbal IQ and receptive vocabulary. Other researchers have found no impairments in grammar when compared to control groups matched for cognitive function, including a Down syndrome group (Tager-Flusberg et al., 1990).

<sup>3</sup>Bishop et al. (2000) and Norbury et al. (2002) recommend a cut-off score of 80 on non-verbal reasoning.

<sup>4</sup>The results from this study indicate that non-verbal abilities cannot straightforwardly account for grammatical difficulties observed in autism as the TD group matched on non-verbal abilities was chronologically younger. However, the findings are consistent with previous studies that matched children with ASD to TD children based on non-verbal abilities (e.g., Perovic et al., 2007). The performance of the TD group developed as a function of their age and non-verbal abilities. On the other hand, the performance of the ASD group developed as a function of only their non-verbal abilities.

<sup>5</sup>The interpretation of reflexives (e.g., himself) and pronouns (e.g., him) is regulated by what is known as Binding Theory (Chomsky, 1981) within the generative approach. The Binding Theory constrains the interpretation of reflexives, pronouns and names through three linguistic principles, known as Principle A, B, and C respectively.


Possessive noun phrases such as "Bart's dad" were chosen as the subject noun phrase because they allow for two potential referents (Bart's dad and Bart) for the reflexive or pronoun. This gives the child a choice between a c-commanding referent (Bart's dad) and a non c-commanding referent (Bart).

First let us consider how Principle A is satisfied in the sentences with a possessive noun phrase subject, like Bart's dad is touching himself. Intuitively, we know that Bart's dad is the only legitimate antecedent for himself, and that the other potential antecedent Bart is not, but let us verify this using the notion of c-command. Consider **Figure 2**. In **Figure 2**, Bart's dad is the subject of the sentence. This complex possessive phrase is represented by a Determiner Phrase (DP<sub>1</sub>). DP<sub>1</sub> is broken down into further components: DP<sub>2</sub> (Bart) and D' containing the possessive marker and the Noun Phrase, dad. Applying the definition of c-command, DP<sub>1</sub>, Bart's dad, c-commands himself, as the node immediately dominating DP<sub>1</sub>, the TP, also dominates DP<sub>3</sub>, which is the reflexive, himself. Now let us consider Bart. We see that the node that immediately dominates DP<sub>2</sub> or Bart (that is, DP<sub>1</sub>) does not dominate himself. Therefore, Bart does not c-command the reflexive and, despite being in the same clause, it is not a potential antecedent.

A control condition with the same possessive subject noun phrase and no pronoun or reflexive in the predicate, as in (3), was included in order to test whether children knew the structure of possessive noun phrases and could distinguish between the two potential antecedents Bart's dad and Bart in sentences that were not related to knowledge of pronouns or reflexives. This condition also tested the c-command relation independently of binding<sup>6</sup> in the sense that if children are able to compute the subject-predicate relations correctly then they should be sensitive to c-command.

The Perovic et al. (2013a) study tested 14 children diagnosed with autism, ranging in age from 6 to 17 years (M = 11;6). Twenty-seven TD children aged 3–9 years matched on the Kaufman Brief Intelligence Test (KBIT) (KBIT-TD) and the Test for Reception of Grammar (TROG) (TROG-TD) formed the control groups. For the autism group, the mean standard score (SS) for the matrices subtest of the KBIT, a standardized test assessing nonverbal IQ, was 65.93. The mean for the TROG, a standardized test that assesses grammatical comprehension, was 56.5. The task was the same 2-choice picture selection task. The experimental findings revealed poor performance on sentences containing a reflexive as compared with ones containing a pronoun. The autism group had a mean correct of 67% on the sentences containing a pronoun (NP), while the two control groups, KBIT-TD and TROG-TD both scored a mean of 71% correct. On the sentences containing reflexives, the children with autism performed below chance, with a mean of 43% correct. This was significantly different from the control groups; the KBIT-TD group scored a mean of 92% correct and the TROG-TD group was 83% correct. There was some individual variation, however, with 2 of the 14 participants showing the pattern of better performance on reflexives rather than pronouns. On the control items containing a name (CN), the children with autism were significantly worse than the KBIT-TD group but not the TROG-TD group. The TD matched children tended to show better performance on sentences containing reflexives as early as age 5. Their performance was in accord with the patterns established in the previous literature (e.g., Wexler and Chien, 1985; Chien and Wexler, 1990)<sup>7</sup>.

<sup>6</sup>A binds B, if A c-commands B and A and B are coindexed. Reflexives that comply with Principle A are bound pronouns as these are c-commanded by their antecedents and both the reflexive and the antecedent are coindexed and appear in the same clause (e.g., Sam in "Sam<sub>i</sub> washed himself<sub>i</sub>"). Coindexation is shown by indices below the relevant NP.

Perovic et al. (2013a) interpreted the experimental findings to suggest that children with ASD have a syntactic deficit over and above any well-established pragmatic difficulties that are part of the disorder. First, because ASD children did not do well on reflexives, the authors interpreted this to mean that Principle A was either lacking or deficient in some way, as this pattern of better performance on pronouns than reflexives is not seen in TD children. They consider the proposal that children are assigning reference using a linear strategy to determine the antecedent for the reflexive, but leave this possibility open. Reliance on a linear strategy would mean that a child with autism could assume that an antecedent for a reflexive is a preceding noun phrase that appears in the same clause as the reflexive. Such an assumption would lead to good performance on simple sentences like Mary points to herself but is expected to give way to poor performance on sentences like Bart's dad is pointing to himself. In this case, since there are two potential antecedents in the clause, a child would end up guessing and choosing either Bart's dad or Bart as the antecedent. As c-command is needed to establish the relationship between the reflexive and its antecedent, Perovic et al. (2013a) interpreted this to mean that "children with autism do not show sensitivity to c-command in establishing the complex syntactic dependency of binding, where the antecedent of a reflexive must c-command the reflexive" (Perovic et al., 2013a, p. 25).
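The contrast between the two strategies can be sketched on a toy tree for Bart's dad is touching himself. The subject structure follows the description of Figure 2 (DP1 containing DP2 and D'); the internal structure of the predicate (T', VP) and all function names are assumptions made for illustration only.

```python
class Node:
    """A phrase-structure node that records its parent and children."""

    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)
        self.parent = None
        for child in self.children:
            child.parent = self


def dominates(ancestor, node):
    """True if a path can be traced down the tree from `ancestor` to `node`."""
    current = node.parent
    while current is not None:
        if current is ancestor:
            return True
        current = current.parent
    return False


def c_commands(a, b):
    """A c-commands B iff the node immediately dominating A also dominates B."""
    return a.parent is not None and dominates(a.parent, b)


def leaves(node):
    """Terminal nodes under `node`, in left-to-right order."""
    if not node.children:
        return [node]
    return [leaf for child in node.children for leaf in leaves(child)]


def precedes(a, b, root):
    """True if every word of `a` comes before every word of `b`."""
    order = leaves(root)
    return order.index(leaves(a)[-1]) < order.index(leaves(b)[0])


# Toy tree; the predicate structure below is an assumption.
bart = Node("DP2: Bart")
d_bar = Node("D'", [Node("D: 's"), Node("NP: dad")])
dp1 = Node("DP1: Bart's dad", [bart, d_bar])
himself = Node("DP3: himself")
vp = Node("VP", [Node("V: touching"), himself])
tp = Node("TP", [dp1, Node("T'", [Node("T: is"), vp])])

candidates = [dp1, bart]
structural = [n.label for n in candidates if c_commands(n, himself)]
linear = [n.label for n in candidates if precedes(n, himself, tp)]
print(structural)
print(linear)
```

Under these assumptions the c-command filter leaves a single antecedent (DP1), whereas the linear filter leaves both DP1 and DP2, which corresponds to the guessing behavior that a linear strategy would predict.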

Not all studies have shown poor performance on reflexives. A study by Terzi et al. (2014), which compared performance on reflexives and pronouns in Greek-speaking children, found no such difficulty. Before we report the results, a little background on Greek is in order. Greek differs from English in that it has two kinds of object pronouns: strong pronouns and clitic pronouns. English is considered to have only strong pronouns. The strong pronouns in Greek differ from clitic pronouns in several ways. First, strong pronouns carry lexical stress, which is not the case for clitics. Second, clitic pronouns can attach to the verb. Both kinds of pronouns also share important features. They both inflect for gender, number and case, and they are never used to refer to an antecedent that appears within the same clause. Turning to reflexives, Greek reflexive pronouns are subject to Principle A just like English reflexives, and the antecedent for a Greek reflexive must appear in the same clause as the reflexive. Previous research has shown that Greek-speaking TD children master use of both strong and clitic pronouns at an early age (see Varlokosta, 2000). With this background in place we can turn to Terzi et al.'s (2014) study. In this study, the children with autism were classified as high functioning based on their high non-verbal IQ (>80). The control group consisted of TD children individually matched to the participants with autism based on raw scores of a vocabulary test. The experiment showed that the children with autism performed worse than TD children only on clitic pronouns (88.3% correct), and not on reflexive pronouns (97.5% correct) or strong pronouns (94.9% correct). The Greek children with autism performed better than the English-speaking children in Perovic et al.'s experiment, who were only 43% correct on sentences containing reflexives.
However, it is important to acknowledge that there are slight differences between Greek reflexive pronouns and English ones. Greek reflexives are complex forms and are inflected for case and number. Most importantly, reflexivity is not expressed through reflexive pronouns alone; it is also expressed through special verbal morphology. The results from the Greek-speaking children are more in line with recent results obtained for 26 British HFA children who showed good comprehension of reflexives on a two-choice picture selection task (Janke and Perovic, 2015).

In a later study, Perovic et al. (2013b) re-assessed their experimental findings with a larger group of children with autism, aged 6–18 years, using the same stimuli given in (1)–(4). In this study, participants were divided into two subgroups according to the presence or absence of language impairment as measured by their scores on tests of both receptive language (TROG-2; PPVT-3) and productive language (a vocabulary subtest of the KBIT). The group with language impairment, the ALI group, consisted of participants scoring below the 10th percentile on at least two of the three tests. In this experiment, only the ALI group performed worse on the sentences containing reflexives (M = 49% correct) such as (1) as compared to the sentences containing pronouns (M = 71% correct) like (2). The ALI group consistently showed chance performance on sentences containing reflexives. Thus, this group of children did not seem to distinguish between potential antecedents (Bart's dad and Bart) for the reflexive. The performance of the ALI group did not differ between the Name Pronoun (NP) condition and the other control conditions, the Control Possessive (CP) condition (Bart's dad is licking a lamp post; M = 77% correct) and the Control Name (CN) condition (Bart is pointing to Dad; M = 79% correct). The children without language impairment, the ALN group, performed better on sentences containing reflexives (M = 96% correct). Both the ALI and the ALN groups (M = 83% correct) showed delayed comprehension of pronouns consistent with a delay in their "linguistic" pragmatic knowledge.
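The subgrouping rule just described (below the 10th percentile on at least two of the three tests) can be written out as a short sketch. The function name, the dictionary layout, and the example percentile values are all hypothetical and chosen only to illustrate the rule as stated in the text; they are not data from Perovic et al. (2013b).

```python
def classify_language_status(percentiles, cutoff=10, min_low_tests=2):
    """Return "ALI" if at least `min_low_tests` scores fall below `cutoff`,
    otherwise "ALN". `percentiles` maps test name -> percentile rank."""
    low = sum(1 for value in percentiles.values() if value < cutoff)
    return "ALI" if low >= min_low_tests else "ALN"


# Invented example profiles (not real participants).
child_a = {"TROG-2": 5, "PPVT-3": 8, "KBIT vocabulary": 30}
child_b = {"TROG-2": 5, "PPVT-3": 40, "KBIT vocabulary": 55}

print(classify_language_status(child_a))
print(classify_language_status(child_b))
```

In this sketch a child with one low score out of three is still assigned to the ALN group, which follows directly from the "at least two of the three tests" criterion.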

The results from the larger group of children with autism provided support for the proposal that Principle A is either missing or incorrectly represented in the group of children designated as ALI. The authors elaborate their proposal as follows: "It is not necessarily the case that children with ALI cannot compute c-command; they might be able to use it to constrain representations in other constructions" (Perovic et al., 2013b, p. 146). The authors further propose that the "ALI version of Principle A constrains the ALI child only to having a clause-mate antecedent of the reflexive, missing the c-command part of Principle A" (Perovic et al., 2013b, p. 146).

<sup>7</sup>The delay in understanding pronouns as compared to reflexives has been explained in terms of late developing pragmatic knowledge in children (cf. Chien and Wexler, 1990). Binding principles regulate syntactic binding only. Reflexives are always bound by their antecedent. Binding is also relevant for pronouns, but in addition, the notion of coreference is relevant. One prominent proposal is that children have innate knowledge of Principle B as it regulates whether a pronoun is bound or free in its clause but they have difficulty with coreference as this invokes pragmatic knowledge which is subject to maturation (Chien and Wexler, 1990).

However, one could ask why c-command is missing only in the application of Principle A, given that it is a very general hierarchical notion. The puzzle is that the ALI group performed quite well on the Control Possessive condition (CP) that was hypothesized to test for c-command relations outside the domain of binding. On these trials, the children were 77% accurate. So, we might ask why children performed poorly on the sentences containing reflexives (49%), but well on the controls, given that c-command is necessary to identify the correct antecedent in both cases. To explain this puzzle, we could interpret Perovic et al.'s discussion in the following way. In order to pick out the correct antecedent for the reflexive, children not only need to have knowledge of c-command, but they also need to understand that the relationship between the reflexive and its antecedent is one of variable binding<sup>8</sup>. Given the added complexity introduced by variable binding, children end up guessing, hence their chance performance. In the control sentences such as Bart's dad is licking a lamppost, c-command is still a necessary prerequisite for identifying the correct referent. In this case, without the added complexity of variable binding, children are able to use a linear strategy to identify a clause-mate antecedent. This strategy means they tend to take the nearest noun as the referent, so they choose the noun dad from Bart's dad, and overall, end up with roughly 77% correct performance on the CP control items<sup>9</sup>.

Other issues arise when considering the difference in results between the ALI and ALN groups of children (Perovic et al., 2013b). The ALI children examined by Perovic et al. (2013b) had low non-verbal abilities. This observation suggests that children on the lower end of the autism spectrum have problems with advanced syntactic structures and serves as motivation for matching TD and ASD children in terms of non-verbal abilities. In our study, we chose to examine children on the higher end of the spectrum (see Terzi et al., 2016a,b). One question that arises is why the ALN children performed well in the second study by Perovic et al. (2013b). Is it the case that this group of children has c-command in place, in contrast to the ALI children? Or, is it the case that due to their high non-verbal abilities and superior language skills, they were able to adopt a linear strategy for both the variable-binding sentences containing reflexives and the control sentences? Or, did children use c-command to identify the appropriate subject noun phrase for both the Name Reflexive sentences and the Control Possessive sentences? The sentences with a possessive NP (e.g., Bart's dad) used in this study do not allow us to tease apart the difference between using c-command and using a linear strategy to correctly identify the antecedent, so we turn to a different structure to further investigate these possibilities in children with ASD.

We introduce a novel experiment that differentiates between interpretations computed on the basis of knowledge of c-command and ones based on linear order. The experiment rests on the interpretation of disjunction ("or"). There has been some debate in the literature over whether "or" in child language corresponds to "inclusive-or" as in classical logic, or "exclusive-or," so we review this briefly here. Some researchers have pointed out that the majority of input to children is consistent with "exclusive-or" because the contexts in which children hear disjunction are ones in which only one of the disjuncts is true (Braine and Rumain, 1981, 1983; Morris, 2008). For example, Morris (2008) examined 240 transcripts of parent-child interaction in the CHILDES database (MacWhinney, 2000). There were 465 spontaneous uses of "or" in a corpus of 100,626 conversational turns. In this corpus, "or" was used in situations where only one of the disjuncts was true between 75 and 95% of the time. As Crain and Khlentzos (2007, 2010) point out, however, a situation in which one disjunct is true is also consistent with "inclusive-or." If "or" is "inclusive-or," then A or B is true when A is true, when B is true, or when both A and B are true. So, the fact that children hear disjunction in a context in which one disjunct is true does not favor the proposal that children understand disjunction as "exclusive-or."
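The point made by Crain and Khlentzos can be checked by enumerating the four possible situations. The following minimal sketch uses two hypothetical helper functions, `inclusive_or` and `exclusive_or`, to build the truth table and list the situations in which the two interpretations come apart.

```python
from itertools import product


def inclusive_or(a, b):
    """Classical (inclusive) disjunction."""
    return a or b


def exclusive_or(a, b):
    """Exclusive disjunction: exactly one disjunct is true."""
    return a != b


rows = [
    (a, b, inclusive_or(a, b), exclusive_or(a, b))
    for a, b in product([True, False], repeat=2)
]

print("A      B      incl   excl")
for a, b, inc, exc in rows:
    print(f"{a!s:<6} {b!s:<6} {inc!s:<6} {exc!s:<6}")

# Situations in which the two readings assign different truth values.
differing = [(a, b) for a, b, inc, exc in rows if inc != exc]
print("Readings differ only when (A, B) =", differing)
```

The two readings agree whenever exactly one disjunct is true, so input of that kind cannot distinguish them; they differ only in the situation where both disjuncts are true.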

Consider a statement such as "Every child took a tiger or a dinosaur." This is true in circumstances in which there are three children: one takes a tiger, one takes a dinosaur, and the third child takes both a tiger and a dinosaur. Adults reject such a statement in this circumstance, however. This is argued to be due to the implicature of exclusivity. Children, on the other hand, have been found to be less sensitive to the implicature. For example, Gualmini (2003) tested children's interpretation of sentences like Every child took a tiger or a dinosaur in the story context just described, in which one child chooses a tiger, another a dinosaur, and the third child chooses both animals. Unlike adults, the child participants accepted the puppet's description 71% of the time. In certain contexts, such as contexts of betting or uncertainty, however, the implicature of exclusivity is canceled, and then the finding is that adults, too, accept sentences with "or" in all three circumstances. In a further experiment using conditional sentences such as "If a giraffe or a penguin is on the stage, then I get a coin," Gualmini et al. (2001) established a context of uncertainty, and showed that in this case, also, children assign the range of truth conditions consistent with "inclusive-or." In this experiment, certain animals, such as a giraffe, or a penguin or both, were placed on the stage, and then the stage curtains were opened to reveal the animal or animals. The puppet produced the conditional statement before the curtains were opened. This way, there was uncertainty about the outcome. Once the curtains were opened, the puppet asked if he got a coin. On some trials, disjunction "or" was replaced with conjunction "and." Children clearly distinguished the truth conditions of disjunction vs. conjunction. When conjunction was used, as in "If a giraffe and a penguin are on the stage, then I get a coin," children rejected the sentence when only one animal was on the stage, unlike when disjunction was used in the sentence. These experimental findings suggest that "or" is interpreted as "inclusive-or" in child grammar.

<sup>8</sup>Both reflexives and pronouns can serve as bound variables. To take an example, in the sentence "Every girl<sub>i</sub> thought she<sub>i/k</sub> could participate in the competition," the pronoun "she" is bound by the quantificational expression "every girl." In this example, it is easy to see that the pronoun can function as a variable. The sentence is, in fact, ambiguous. If the pronoun "she" is taken to be a referential pronoun, then it refers to some unnamed female individual. In this case, the sentence means that every girl thought that this female individual could participate in the competition. On the alternative interpretation, the pronoun "she" acts as a variable. On this interpretation of the sentence, there are multiple girls, each of whom thinks that she herself could participate in the competition. In a sentence like "Bart's dad is touching himself" the reflexive is a bound variable even though it doesn't pick out a range of individuals. This is simply because the lexical item himself is singular. If the sentence was "Bart's kids are touching themselves" then it can be seen that there are multiple individuals who could each be touching themselves.

<sup>9</sup>Whether or not variable binding is problematic for children with autism is an issue for future research. Previous research has shown that typically-developing children as young as 3 and a half years of age can produce bound variable structures. For example, Thornton (1990) elicited questions such as "Which guys said they have a blue marble?" in a situation in which each of 3 guys has a marble. In Thornton's study, children often opted for the plural form of the pronoun as the bound pronoun instead of the singular form, he.

Crain (2008) points out that the "exclusive-or" interpretation of disjunction yields different properties in negative sentences. Recall that, on this interpretation, "A or B" is true only if exactly one of the disjuncts, A or B, is true. It follows that sentences of the form "Not A or B" are false only if exactly one of the disjuncts, A or B, is true. Sentences of the form "Not A or B" are true, therefore, if both disjuncts are true, and they are also true if both are false. The fact that sentences of the form "Not A or B" are true if both disjuncts are true is a consequence of interpreting "or" as "exclusive-or" (see Crain et al., 2000). Suppose that John says the following: "Mary did not bring ice cream or cake to the party." If John's use of disjunction is interpreted as "exclusive-or," then his assertion would be true if Mary brought both ice cream and cake to the party, which is clearly not how native speakers of English interpret this sentence.

On the other hand, if disjunction is "inclusive-or," then John's statement "Mary did not bring ice cream or cake to the party" is only true in circumstances in which Mary brought neither dessert to the party. Intuitively, this is the right result for English. The meaning of such sentences containing negation and disjunction corresponds to one of the laws of propositional logic, according to which a negated disjunction "Not (A or B)" logically entails the negation of both disjuncts. It entails "Not A" and it entails "Not B." In classical logic, this law is stated in one of De Morgan's laws: ¬ (A ∨ B) => (¬A ∧ ¬B). The interpretation that is captured by De Morgan's law depends on the interpretation of disjunction as being within the scope of negation, where scope assignment corresponds to the structural notion of c-command in linguistic theory (see Crain, 2012). This can be seen in **Figure 3**, where not c-commands disjunction, or, which is contained in the object noun phrase. In other words, a "conjunctive" entailment is licensed by disjunction in the scope of negation, just as long as disjunction is analyzed as "inclusive-or" (Crain, 2008). From this point, we will simply assume that "or" is interpreted as "inclusive-or" in child grammars.
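The behavior of disjunction under negation can be verified by brute force. The sketch below is illustrative only: the function names are hypothetical, and negation is modeled as taking scope over the whole disjunction. It checks that the negated inclusive disjunction is equivalent to the conjunction of the negated disjuncts in every situation, and then evaluates John's statement in the situation where Mary brought both desserts.

```python
from itertools import product


def not_inclusive_or(a, b):
    """Negation scoping over inclusive disjunction: not (A or B)."""
    return not (a or b)


def not_exclusive_or(a, b):
    """Negation scoping over exclusive disjunction: not (A xor B)."""
    return not (a != b)


# De Morgan: not (A or B) has the same truth value as (not A) and (not B).
de_morgan_holds = all(
    not_inclusive_or(a, b) == ((not a) and (not b))
    for a, b in product([True, False], repeat=2)
)
print("De Morgan equivalence holds in all four situations:", de_morgan_holds)

# Situation: Mary brought both ice cream and cake.
ice_cream, cake = True, True
print("inclusive reading of John's statement:", not_inclusive_or(ice_cream, cake))
print("exclusive reading of John's statement:", not_exclusive_or(ice_cream, cake))
```

Only the inclusive reading makes John's statement false when Mary brought both desserts, which matches the intuition reported in the text.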

Children's interpretations of sentences containing negation and disjunction were tested in a study by Crain et al. (2002) in typically-developing children aged 3;11 to 5;9. This study is the basis for our study with a group of children with ASD, so we review it in some detail. The relevant sentences from the Crain et al. (2002) study are given in (5) and (6).


Notice that both of these sentences contain negation (either not or n't) as well as the disjunction word or. However, they yield different interpretations. In (5), negation is in the main clause and c-commands disjunction (see **Figure 4**). As noted above, when disjunction is in the scope of negation, this gives rise to the conjunctive entailment. That is, (5) means that the girl who stayed up late will not get a dime AND the girl who stayed up late will not get a jewel. This is the only available interpretation for this sentence. In the sentence in (6), shown in **Figure 5**, negation precedes disjunction. In this case, however, negation, which is part of the negative auxiliary verb didn't, does not c-command disjunction. This is because didn't is embedded inside the relative clause who didn't go to sleep, which modifies the subject noun phrase. Therefore negation does not c-command disjunction in **Figure 5**. Because negation does not c-command disjunction, the sentence does not give rise to a conjunctive entailment. Rather, it gives rise to disjunctive truth conditions. That is, the sentence means that the girl who didn't go to sleep will get a dime, or the girl who didn't go to sleep will get a jewel (or possibly both). Therefore, Crain et al.'s (2002) predictions were as follows. If children have knowledge of c-command, they will generate the conjunctive entailment for (5) and reject the sentence. On the other hand, they should accept (6), since there is no c-command relation between negation and disjunction in this sentence. If children do not have c-command, however, and rely on a linear strategy to interpret sentences containing negation and disjunction, they should treat the two sentences in the same way. In this case, it is likely that children will not enforce the conjunctive entailment but attribute disjunctive truth conditions to both sentences.

The experiment conducted by Crain et al. (2002) used the Truth Value Judgement Task (TVJT) to test sentences like (5) and (6) (Crain and Thornton, 1998). Thirty children, ranging in age between 3;11 and 5;9 (mean age of 5;0), participated in the experiment. The experiment was conducted over two sessions. The participants were also divided into two groups. One group was presented with two trials of sentences like (5) in session 1 and two trials of sentences like (6) in session 2. The other group heard the target sentences in the opposite order. The child participants watched a story acted out by one experimenter along with a puppet, played by a second experimenter. At the end of the story, the puppet described what he thought happened in the story. The child's task was to judge whether or not the puppet said "the right thing." That is, children judged the truth or falsity of the puppet's description of the story. To ensure that use of disjunction was felicitous in the experimental context, the stories were presented in "Prediction Mode." Instead of having children judge the truth of the puppet's statement only at the end of the story, the puppet made a prediction about forthcoming events at some point in the middle of a story (and it was repeated again at the end). Thus for (6), for example, half-way through the story, the puppet predicted how the events might unfold by uttering the target sentence, The girl who doesn't go to sleep will get a dime or a jewel. In such contexts of uncertainty, it is felicitous to use disjunction. If the child judged the puppet's statement to be true, then it was assumed that the child's grammar generated a structure and a meaning for the sentence that matched the events that took place in the story. If the child judged the puppet's statement to be false, then it was assumed that the child's grammar generated only structures and meanings that did not match the events in the story.
This inference is based on the assumption that, whenever possible, children (and adults) will access a meaning that makes the puppet's sentence true. This is called the Principle of Charity (Davidson, 2001). If children and adults adhere to the Principle of Charity, then we are invited to make the following inference: When children and/or adults consistently judge a sentence to be false, this indicates that they were unable to mentally compute a meaning representation that makes the sentence true.

The story that was used to test the sentences in (5) and (6) acted-out a tale of two girls who were waiting for the tooth fairy to arrive, as they had both lost a tooth. The girls knew that the tooth fairy would come during the course of the night and give them a reward in exchange for their lost tooth. One of the girls decided to go to sleep, as all good girls should do, but the other girl decided to stay awake. The tooth fairy duly arrived, bringing along two dimes and two jewels. At this juncture, the puppet made his prediction about what would happen next. This was the puppet's delivery of the test sentence. The story then resumed. The fairy gave both a jewel and a dime to the girl who was sleeping. The girl who was awake explained that she knew she should be asleep but had stayed up because she really wanted to meet the tooth fairy. The tooth fairy said that she was disappointed that the girl had not gone to sleep, but still decided to give her one reward. She gave the girl just a jewel. At the end of the story, the puppet repeated the prediction made in the middle of the story, delivering the test sentence for the children to judge.

The main finding from the experiment was that children rejected sentences like (5), in which negation c-commanded disjunction, 92% of the time. That is, children rejected the sentences because they generated the conjunctive entailment. They took (5) to mean that the girl who stayed up late didn't get a dime and she didn't get a jewel. This is false, because in the story the tooth fairy gave her a jewel. Sentences like (6) were accepted 87% of the time. There was no c-command relation between negation and disjunction in (6). The children generated disjunctive truth conditions; they accepted a sentence like (6) because it was true that the girl who didn't go to sleep got a dime or she got a jewel. The Crain et al. experiment showed that typically-developing children treat the two sentence types very differently, rejecting one and accepting the other. This suggests that children are generating hierarchical sentence representations, and that the notion of c-command guides their interpretation of these sentences. As noted, if children had been relying on linear precedence or a linear strategy, then there would be no reason to generate a conjunctive entailment for sentences like (5). Thus, it is likely that both sentences would have been interpreted in a similar manner, with disjunctive truth conditions.
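The logic of the two competing predictions can be sketched by evaluating each candidate interpretation against the story outcome, in which the girl who stayed awake received a jewel but no dime. The encoding below is an assumption made for illustration: the function names are hypothetical, and the "linear strategy" is modeled, following the reasoning in the text, as assigning disjunctive truth conditions to both sentence types.

```python
# Story outcome for the girl who stayed awake (from the tooth fairy story).
awake_girl = {"dime": False, "jewel": True}


def conjunctive_entailment(outcome):
    """not (dime or jewel): she gets neither reward."""
    return not (outcome["dime"] or outcome["jewel"])


def negative_disjunctive(outcome):
    """(not dime) or (not jewel): she misses at least one reward."""
    return (not outcome["dime"]) or (not outcome["jewel"])


def positive_disjunctive(outcome):
    """dime or jewel: she gets at least one reward."""
    return outcome["dime"] or outcome["jewel"]


# True = the sentence is true in the story (accept); False = reject.
predictions = {
    "c-command": {
        "(5)": conjunctive_entailment(awake_girl),
        "(6)": positive_disjunctive(awake_girl),
    },
    "linear strategy": {
        "(5)": negative_disjunctive(awake_girl),
        "(6)": positive_disjunctive(awake_girl),
    },
}

for hypothesis, by_sentence in predictions.items():
    for sentence, is_true in by_sentence.items():
        print(hypothesis, sentence, "accept" if is_true else "reject")
```

Only the c-command hypothesis predicts rejection of (5) together with acceptance of (6), which is the pattern the children showed (92% rejection and 87% acceptance, respectively).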

Returning to children with autism, recall that the Perovic et al. (2013b) experimental findings showed that the ALN group of children did well on Principle A, correctly identifying the appropriate referent 96% of the time. However, given that the correct referent (Bart's dad not Bart) both c-commands the reflexive, and is, from a linear perspective, the closest and perhaps most salient potential antecedent, it was difficult to know whether the children were drawing on grammatical knowledge or a linear strategy. For this reason, we replicate the Crain et al. (2002) study which distinguishes an interpretation based on c-command from one based on a linear strategy.

# EXPERIMENT 1: DISJUNCTION AND NEGATION

The goal of the experiment is to determine how children with ASD interpret sentences containing negation and disjunction. Crucially, this structure disentangles the confound present in the Principle A structure used in Perovic et al. (2013a,b). In sentences with reflexives containing possessive noun phrase subjects like Bart's dad, it was not possible to tell whether correct identification of the antecedent for the reflexive could be attributed to knowledge of c-command or a linear strategy. The comparison structure with negation and disjunction tested in the present study dispenses with this confound. In addition, our second experiment also incorporates the sentences containing reflexives as used by Perovic et al. for comparison.

The first experiment used sentences with the same structure as the study by Crain et al. (2002), ones like (7) and (8). In (7) negation c-commands disjunction, while in (8), it does not because the negative auxiliary verb is embedded inside the relative clause.


The experimental hypothesis was as follows: If children with ASD can access c-command, they should respond to such sentences in the same way as typically-developing children. That is, children should interpret (7) as the boy who is on the bridge will not get a car and the boy who is on the bridge will not get a ball. In other words, they should generate a conjunctive entailment when disjunction appears in the scope of negation. However, if children with ASD cannot access the requisite notion of c-command, they should draw no distinction in the interpretations assigned to sentences (7) and (8). In this case, they would be unlikely to enforce a conjunctive entailment for (7). Furthermore, if c-command is not guiding children's interpretations, they may adopt a linear strategy. Since negation precedes disjunction in both (7) and (8), the expectation is that children would interpret both sentences in the same way, with disjunctive truth conditions.

# Methods

### Participants

Twelve children on the autism spectrum participated in the study. Their age ranged from 5;4 to 12;7, with a mean of 9;11 years. Children with autism were recruited from a special school for children with ASD, located in Melbourne. Children in Sydney were recruited from a Special Education Centre. In addition to these schools, children diagnosed with autism were also recruited from advertisements placed on the Autism Spectrum Australia (ASPECT) website. A formal diagnosis of autism was established based on previous assessment reports as provided by the parents or children and as identified by the specialist school. The children who made up the control group (typically-developing children) were recruited from general advertisements placed on the Macquarie University campus. Only children whose first language was English were recruited for both groups. Ten adults were also recruited in the pilot phase of the experiment. They were students recruited from general advertisements across the campus. This study was carried out with the approval of the Human Research Ethics Committee at Macquarie University (Ref: 5201200880).

The children with autism all had verbal communication skills. This group of children was tested on standardized tests of language and cognition. These tests included the matrices subtest of KBIT (Kaufman and Kaufman, 2004) measuring non-verbal IQ and the Test for Reception of Grammar Second Edition (TROG-2; Bishop, 2003). Based on the scores of the KBIT, the children who formed the group with ASD can be described as high-functioning (HFA), as the majority of children had a standard score of more than 80 (Howlin, 2003; Norbury, 2005). The TD children (n = 12) were matched to the children with autism within 2 points of the KBIT raw scores. The age of the children in the matched comparison group ranged from 5;10 to 8;10, M = 7;1. **Table 1** summarizes the descriptive data.

### Procedure

Before testing, all caregivers provided informed consent for their child's participation, in accordance with ethical guidelines set out by the Human Research Ethics Committee at Macquarie University. The present experiment used the dynamic TVJT (Crain and Thornton, 1998) as in the study by Crain et al. (2002). The task of the child was to judge the truth or the falsity of the given sentence spoken by the puppet. In order to make disjunction felicitous in the context, the TVJT was adapted to use in the prediction mode (see Chierchia et al., 1998), again, as in Crain et al. (2002). Accordingly, our story was interrupted half way through and the first experimenter who acted as a dog while manipulating toys asked Kermit, the second experimenter, what he thought would happen next. Kermit replied by uttering the target sentence and the story resumed. At the end of the story, Kermit repeated his prediction to remind the children about the events that occurred in the story. In the present experiment, the stories were videotaped, and the videotaped scenarios were presented to the children on an iPad. This step ensured consistent presentation, and allowed a single experimenter to present the stimuli. The experimenter who demonstrated the iPad videos to children instructed them to judge Kermit's sentences at the end of each story. The trials in our study were not split up into different sessions as was done by Crain and colleagues. All the participants in our study heard all the test trials in the same session. See **Figure 6** for a snapshot of the experimental trial.

Each child was tested individually either in a quiet corner of a room at the school or in the Language Acquisition Lab at the University. The testing for each child, including the standardized tests, lasted for approximately 1.5 hours. If the child had difficulty paying attention, the session was split into two parts. All participants were told that they would watch short stories and hear a puppet who tries to say what would happen next. Their task would be to evaluate whether the puppet in the iPad presentation was right or wrong. If the puppet was wrong, children were asked why they thought that the puppet was wrong. All the verbal responses of children were digitally recorded. Children's judgments of the test sentences were scored as "Yes" or "No." Percentages of correct rejections or acceptances for particular items were calculated for each child.

TABLE 1 | Participants' ages and mean scores (standard deviations) on standardized tests of language and cognition.

### Stimuli

The experiment included 4 stories to test the structure in (7) in which there is a c-command relation between negation and disjunction. The c-command sentences like (7) were associated with rejections of the test sentence, in keeping with the TVJT methodology. In order to demonstrate their knowledge of c-command and the resulting constraint on interpretation, children had to overcome the Principle of Charity and reject the sentences. There were also 4 stories for the kind of structure as exemplified in (8), in which there is no c-command relation between negation and disjunction. One of the four stories was associated with rejection in order to catch a biased style of responding where children might implicitly pair c-command stories with a "no" response and non c-command stories with a "yes" response. The stories were presented/played to each child in random order. The experimenter chose the video clip of any story at random for presentation. The testing session started with two practice trials. Both the practice trials were paired with a "no" response. These practice trials both contained negation (e.g., "The cook will not let Elmo eat the cake"). See **Table 2** for a complete list of all test sentences.

### Results

The main finding was that the group of children with ASD performed in a similar manner to the typically-developing group of children, rejecting sentences like (7) and accepting ones like (8)<sup>10</sup>. The group results are summarized in **Table 3**.

When children were asked why they rejected the c-command sentences like (7), the children from both groups gave similar justifications. For example for the test sentence, The boy who is on the bridge will not get a ball or a car, the children would say that the puppet is wrong as the boy on the bridge got a car whereas he was not supposed to get anything. For another test sentence, The cat who is on foot will not get a fish or milk, the children would say that the puppet is wrong as the cat on foot got a fish. A Mann-Whitney Test showed there was no significant difference in the responses of the HFA group and the TD children for the c-command sentences (Z = 1.3568, p = 0.17384) or for the non c-command sentences (Z = 0.4907, p = 0.62414). A Mann-Whitney Test was also conducted to compare performance on the c-command and non c-command sentences within both the groups. The difference was significant for both the HFA (Z = 2.4537, p = 0.01428) and the TD groups (Z = 3.7816, p = 0.00016).
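As a rough check on how the reported Z and p values relate, the sketch below converts each Z statistic into a two-sided p-value under the standard normal approximation. This is an assumption: the text does not say how the p-values were computed, and the function name `two_sided_p` is hypothetical. Small discrepancies from the reported values are therefore to be expected.

```python
import math


def two_sided_p(z):
    """Two-sided p-value for a standard normal Z statistic.

    Uses 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    """
    return math.erfc(abs(z) / math.sqrt(2))


# (Z, reported p) pairs as given in the text.
reported = {
    "HFA vs. TD, c-command": (1.3568, 0.17384),
    "HFA vs. TD, non c-command": (0.4907, 0.62414),
    "HFA, c-command vs. non c-command": (2.4537, 0.01428),
    "TD, c-command vs. non c-command": (3.7816, 0.00016),
}

for label, (z, p_reported) in reported.items():
    p = two_sided_p(z)
    print(f"{label}: Z = {z}, computed p = {p:.5f}, reported p = {p_reported}")
```

Under this assumption, the computed values fall on the same side of the conventional 0.05 threshold as the reported ones: the between-group comparisons are not significant, and the within-group comparisons are.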

Performance thus differed significantly between the two types of sentences for both groups. This finding is comparable to that obtained by Crain et al. (2002). In their experiment, children rejected the c-command sentences like (7) 92% of the time, and accepted the non c-command trials 87% of the time. The similar pattern suggests that children with HFA are able to use the hierarchical structure of c-command to distinguish between the c-command and the non c-command sentences just like their TD peers. In the Discussion section, we return to possible reasons why children were less accurate on the non c-command sentences than on the c-command trials.

In the next experiment, we explore whether these same children are able to implement c-command to assign the correct referent for reflexives on sentences governed by Principle A. If children are able to use c-command to constrain relations between negation and disjunction, then it is conceivable that they may still show sensitivity to c-command in establishing the complex syntactic dependency of binding, unless variable binding is an issue. If children adopt a linear strategy in conditions where they face the added complexity of variable binding then their performance on the control sentences would be better than their performance on Principle A sentences.

<sup>10</sup>Recall that 3 of the 4 non c-command sentences were designed to be accepted, while one was designed to be false and therefore rejected.

TABLE 2 | List of sentences for C-command and Non C-command trials.


TABLE 3 | Percentage of correct interpretations (Group Mean) for C-command and Non C-command sentences.


# EXPERIMENT 2: PRINCIPLE A

The present study is concerned with Principle A and reflexives. Previous studies conducted by Perovic and colleagues make it difficult to conclude whether the ALN children did well in identifying the correct referent for the reflexive in sentences like Bart's dad is touching himself because their grammatical knowledge incorporates the notion of c-command or because they were simply adopting a linear strategy. As we saw, Bart's dad is the only legitimate antecedent for himself; the other candidate, Bart, is ruled out because it does not c-command himself.

# Methods

### Participants

The same child participants (n = 12) who participated in Experiment 1 also participated in Experiment 2. Their age ranged from 5;4 to 12;7, with a mean of 9;11 years. Eight adults who did not participate in Experiment 1 participated in a pilot study to ensure the viability of the tasks. They were recruited from general advertisements across Macquarie University.

### Procedure

The children were tested on sentences containing reflexives that are governed by Principle A using the dynamic version of the Truth Value Judgment Task (TVJT) (Crain and Thornton, 1998). Our methodology contrasts with that of Perovic and colleagues who used a 2-choice picture selection task. In our experiment, as in Experiment 1, the stories were pre-recorded and presented to the children on an iPad. See **Figure 7** for a snapshot of the experimental trial. The testing procedure was similar to Experiment 1 except that this experiment did not adopt the prediction mode. This experiment used the "description mode" in which the puppet simply tried to say what happened in the story on its completion. The experimental items were preceded by two practice items, one designed to be a "Yes" answer, and the other a "No" answer. The experimenter then proceeded to the main task. The first story presented to children was always a Name Reflexive (NR) story containing a reflexive like (1) followed by a Control Possessive structure story (CP) like (3). Children were presented with four stories under each category. At the completion of each story, children judged two sentence types; first they judged a Name Reflexive story, and then a Control Possessive story. Children's judgments of the test sentence were scored as "Yes" (true) or "No" (false).

### Stimuli

The target sentences for the second experiment were sentences containing reflexives like Bart's dad washed himself with soap. This was the same structure as used in the Perovic et al. experiment, with the addition of a Prepositional Phrase (PP) such as with soap sentence-finally, to make the sentence seem more natural. The correct response associated with the Name Reflexive sentences was always a rejection of the test sentence. The target sentences were designed to be false to ensure that children have to override the Principle of Charity. In order to show their knowledge of the Principle A constraint, children have to go out of their way to reject the sentence in context, and to explain why it is false. In addition to the four Name Reflexive target sentences, there were four Control Possessive (CP) sentences; 2 of these were designed to be true and 2 were false. This was done in order to balance the "yes" and "no" responses. These also had the additional PP sentence finally (e.g., Bart's dad washed the dog with shampoo). See **Table 4** for a complete list of test sentences.

### Results

The main finding was that both the children with ASD and the TD control group performed extremely well on the task. See **Table 5** for group mean results of the children.

Each "No" response for the Name Reflexive test sentences was scored as a correct rejection. Responses under the Control Possessive (CP) condition were scored depending upon whether children correctly accepted or rejected the test sentences. A Mann-Whitney Test was used to compare the patterns of responses by children with autism and the TD children. The group difference (ASD vs. TD children) was not significant for the Name Reflexive sentences (Z = 1.0681, p = 0.28462) or for the Control Possessive sentences (Z = 0.2887, p = 0.77182). Within-group analyses were also conducted across the two sentence types. The difference between NR and CP sentences was not significant for either the ASD (Z = 0.0289, p = 0.97606) or the TD control group (Z = 1.5588, p = 0.11876). When children were asked why they rejected the test items, children from both groups gave similar justifying responses. For example, for the test sentence Bart's dad washed himself with soap, children's stated reason for rejecting it was that Bart's dad washed Bart with soap.

TABLE 4 | List of sentences for Principle A.


TABLE 5 | Percentage of correct interpretations (Group Mean) for NR and CP.

In a nutshell, the performance of the HFA group does not differ across the Name Reflexive sentences and the Control Possessive structures. Children with HFA assigned the correct referent for reflexives on sentences governed by Principle A, just like their TD peer group.

# GENERAL DISCUSSION

Previous studies have shown that in contrast to high-functioning children with autism (HFA), children on the lower end of the autism spectrum and those with concomitant language impairments have difficulty in correctly interpreting sentences containing reflexives (Perovic et al., 2013a,b). Perovic et al. argued that the difficulty in interpreting reflexives when they appear in possessive structures like Bart's dad is touching himself arises because children may adopt a linear strategy, permitting both Bart's dad and Bart as potential clause-mate antecedents for the reflexive. As a result, children show chance performance. Thus, the authors argued that at least the "ALI version of Principle A constrains the ALI child only to having a clausemate antecedent of the reflexive, missing the c-command part of Principle A" (Perovic et al., 2013b, p. 146). However, this hypothesis did not make it clear as to why the ALN children perform better on the Name Reflexive sentences and the Control Possessive sentences. Using the Principle A sentences with a possessive noun phrase antecedent like Bart's dad, it was not possible to tell whether these children were using their grammatical knowledge of c-command or a linear strategy to identify the correct antecedent. For this reason, we tested a new structure in which these two possibilities are differentiated, the sentences containing negation and disjunction, which we termed the c-command and non c-command sentences as illustrated in (7) and (8) respectively.

We hypothesized that if children with HFA have knowledge of the hierarchical relationship of c-command, they would generate a conjunctive entailment for the c-command sentences like The boy who is on the bridge will not get a ball or a car as disjunction appears within the scope of negation. That is, they would (only) get the interpretation on which the boy who is on the bridge will not get a ball and he will not get a car. However, if the grammatical knowledge of this HFA group of children was compromised, we predicted that the children would interpret the c-command and non c-command sentences in a similar manner. In this case, they would not be expected to impose a conjunctive entailment on sentences like (7). Presumably, in this case they would give the sentence the range of disjunctive truth conditions that arise for (8). That is, they would allow it to mean that the boy who is on the bridge will not get a ball, or, alternatively, he will not get a car, or, possibly he won't get a ball or a car. The results obtained showed that the children with HFA tested in this study were able to use c-command in order to distinguish between sentences where negation only preceded but did not c-command disjunction vs. those where negation both preceded and c-commanded disjunction. In the latter case, the children were able to generate conjunctive entailment consistent with De Morgan's law of propositional logic. If the children with ASD were adopting a linear strategy then they would have attributed disjunctive truth conditions to both sentences.
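The contrast between the two readings can be made concrete with a truth table. The sketch below is illustrative only: the function names are hypothetical, and `ball` and `car` simply stand for whether the boy on the bridge got each item.

```python
from itertools import product


def negation_over_disjunction(ball: bool, car: bool) -> bool:
    """Reading when negation c-commands disjunction: not (ball or car)."""
    return not (ball or car)


def conjunctive_entailment(ball: bool, car: bool) -> bool:
    """De Morgan equivalent: (not ball) and (not car)."""
    return (not ball) and (not car)


def disjunctive_reading(ball: bool, car: bool) -> bool:
    """Reading without c-command: (not ball) or (not car)."""
    return (not ball) or (not car)


# Enumerate all four possible outcomes of the story.
outcomes = list(product([False, True], repeat=2))

# De Morgan's law: the first two functions agree on every outcome.
de_morgan_holds = all(
    negation_over_disjunction(b, c) == conjunctive_entailment(b, c)
    for b, c in outcomes
)

for ball, car in outcomes:
    print(
        f"ball={ball!s:5} car={car!s:5} "
        f"c-command reading={negation_over_disjunction(ball, car)!s:5} "
        f"disjunctive reading={disjunctive_reading(ball, car)}"
    )
```

In the test scenario described in the Results, where the boy got a car but not a ball, the c-command reading comes out false and the disjunctive reading comes out true. This is why rejecting the test sentence counts as evidence that a child has assigned the c-command reading.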

Notice that TD children were more accurate on the c-command sentences like (7) than on the non c-command ones like (8). This pattern has been observed in other studies too. In Crain et al.'s (2002) study conducted with 4- and 5-year-old TD children, the pattern was similar. Children rejected the c-command sentences like (7) 92% of the time and accepted the non c-command sentences somewhat less often, 87% of the time, although the difference between the two conditions was smaller than in the present experiment. So, why is it that the children are more accurate on the c-command sentences? One possibility that was explored by Gualmini and Crain (2005) was that there is more length between the negation and disjunction operators in the non c-command sentences. They manipulated the number of words between the operators, putting more length (5 words) between the operators in the c-command sentences like Winnie the Pooh will not let Eeyore eat the cookie or the cake and less length (3 words) in the non c-command sentences like The Karate Man will give the Pooh Bear he cannot lift the honey or the doughnut. The hypothesis was that if length was the relevant factor, children should perform more poorly on the c-command sentences than the non c-command ones. However, this was not the case. The performance of children was 85% correct on c-command sentences while their performance was 80% correct on the non c-command sentences. The results showed that the interpretation of sentences was not determined by the number of intervening words between negation and disjunction. Consequently, children assigned a conjunctive interpretation only to sentences where negation c-commanded the disjunction.

There is another possible account of the lower accuracy on the non c-command sentences in our experiment with HFA children. This possibility hinges upon differences in the execution of the present study and the studies conducted by Crain et al. (2002) and Gualmini and Crain (2005). Crain et al. (2002) presented the c-command and non c-command sentences in two different sessions, to avoid any carryover effects, and Gualmini and Crain (2005) used a between-subjects design. Our experimental design, on the other hand, was a within-subjects design, so the participants heard both conditions within the same session. This may have caused some confusion. In addition, some of the non c-command sentences were true and one was false, which, again, may have made the responses less accurate for the non c-command sentences. Nevertheless, the pattern is the same as in Crain et al.'s (2002) experiment, which leads us to infer that children's interpretations are based on computations of hierarchical sentence representations.

Another noteworthy finding from the present investigation is that the children with autism did not appear to have any difficulty processing relative clauses. This result contrasts with the findings of Durrleman and Zufferey (2013) who reported comprehension difficulties for both subject and object relative clauses in HFA French-speaking adults. The present set of sentences only contains subject relatives (e.g., the boy who is on the bridge will not get a ball or a car) and it is well known that object gap relative clauses are more challenging, but nevertheless, the children with autism performed well on our task. Our findings are consistent with those reported by Durrleman et al. (2015) who showed that French-speaking adults with ASD are more likely to master subject relative clauses than object relative clauses. The current results are also consistent with the finding that English-speaking teenagers diagnosed with autism made more errors on object relative clauses as opposed to subject relative clauses in a sentence repetition task (Riches et al., 2010).

If children are able to compute hierarchical relations of c-command for sentences containing logical operators such as negation and disjunction, then they should be able to use the same relations for interpreting sentences containing reflexives. If variable binding is an issue, though, it is possible that children could perform well on computing c-command with negation and disjunction, but not for Principle A. However, this latter prediction was not borne out for the HFA group of children in our experiment. It could be the case for children with language impairment, but this is yet to be verified. Our experiment found good performance on both experimental tasks, both with negation and disjunction and Principle A. Especially noteworthy is the fact that our current sample of children was younger (age 5;4–12;7) than the groups examined by Perovic et al. (2013a,b) who were between 6 and 18 years of age. Our results are consistent with the performance of Greek children on binding (Terzi et al., 2014). These authors showed that Greek-speaking children diagnosed with autism did not show deficient performance on reflexive binding. It is also reported by Geutjes (2014) that Dutch children diagnosed with autism show performance similar to typically-developing children on Dutch strong and weak reflexives. Although there are language specific differences between Greek, Dutch and English, which may introduce further variables, Principle A is nevertheless a universal principle, and so in principle, there should be no cross linguistic differences (see Thomas, 1991). A thorough crosslinguistic comparison of binding in autism will thus be a fruitful research direction in this regard.

Performance on Control Possessive (CP) sentences was also similar for both the groups of children in our study. Further analysis revealed that the performance on the CP condition was not significantly different from the performance on the Name Reflexive condition (NR) for both the groups. Taken together, our results provide evidence for intact grammar in children on the higher end of the autism spectrum. However, the HFA group performed 15% lower than their typically-developing peers only on the reflexive condition. Thus, we settle for a conservative hypothesis that children with HFA are sensitive to c-command for constraining relations between disjunction and negation. They are also likely to be sensitive to c-command for establishing the complex syntactic dependency of binding instead of relying on a linear strategy. However, in order to pick out the correct antecedent for reflexives, children not only need c-command, they also need to be able to understand that the relationship between the antecedent and the referent is one of variable binding. It may thus be possible that future studies with a larger sample size could show a deficient performance on sentences governed by Principle A for children with autism. This appears to be a plausible interpretation of our results as the performance of the HFA group (85%) was comparable to the performance of the comparison group (83%) on the control sentences (CP). The results hint at the possibility that the complexity of variable binding could pose a problem for children with autism. However, good performance on the CP sentences is not likely to be the result of relying on a linear strategy or choosing the nearest noun as the referent for the antecedent.

# Role Played by Non-verbal Abilities

Our findings are in accord with the latest findings reported by Janke and Perovic (2015). This recent study showed that 26 British HFA children (non-verbal IQ > 80 as assessed by the Matrices subtest of the KBIT) had intact comprehension of reflexives. The authors furthermore classified the children as ALI based on their performance on standardized language tests (the TROG and the British Vocabulary Scales). Only three children, classified as ALI, showed less than perfect performance on reflexives. In other words, the authors noted individual variation. We did not divide our children into a language intact or language impaired category due to the comparatively smaller size of the sample that was only tested on the TROG, although language scores for 6 children from our sample could be considered to be in the impaired range<sup>11</sup> as they scored below the 10th percentile (e.g., Whitehouse et al., 2008). Thus in the interim, we hypothesize that children with HFA form a distinct linguistic phenotype with respect to intact grammatical functioning (see Perovic et al., 2013b; Janke and Perovic, 2015). It remains to be seen whether children with low-functioning autism (LFA) or those with ALI show any improvements of performance on a different task which has not usually been used, i.e., the TVJT<sup>12</sup>. In the meantime, the general impression seems to be that LFA children will show deficits of syntax. This is because higher non-verbal IQ is an important prognostic variable for clinical populations in general and for autism in particular (Szatmari et al., 1989). Comparatively, LFA children are at a higher risk for language impairment irrespective of the degree of intellectual impairment (Kjelgaard and Tager-Flusberg, 2001).

# CONCLUSIONS

In the two experimental studies that we have reported, HFA children did not have difficulty computing the hierarchical relationship of c-command. In sentences containing negation and disjunction, children distinguish the interpretations they allow, depending on whether or not negation c-commands disjunction. In sentences in which there is a c-command relation between negation and disjunction, children successfully impose a conjunctive entailment, while attributing disjunctive truth conditions to sentences in which c-command does not hold. Furthermore, children are successful in picking out the correct antecedent for the reflexive in sentences like Bart's dad washed himself with soap, in conformity with Principle A. It will be instructive to replicate these studies in the future with a larger sample of children, while carefully controlling for HFA children with and without language impairment. This is because studies report different results for HFA children and LFA children (Boucher, 2009), or for children classified as ALI vs. ALN (Tager-Flusberg, 2006). Nevertheless, the findings from our English-speaking sample of children concur with the findings of English-speaking British children for binding (Janke and Perovic, 2015). These investigations all suggest that children at the high end of the spectrum may not have any kind of syntactic deficiency (e.g., Terzi et al., 2016a,b). Further cross-linguistic investigation with other complex syntactic structures will be important to shed light on this issue.

# ETHICS STATEMENT

This study was carried out in accordance with the recommendations of the Human Research Ethics Committee at Macquarie University (ref: 5201200880) with written informed consent from all subjects. All subjects gave written informed consent in accordance with the Declaration of Helsinki. The protocol was approved by the Human Research Ethics Committee at Macquarie University.

# AUTHOR CONTRIBUTIONS

RT and NK jointly conceptualized the study and wrote the manuscript. NK collected the data for the study and conducted the analyses.

# FUNDING

This study was part of NK's PhD thesis titled, "Grammatical Knowledge in Children with Autism." NK was supported by the International Postgraduate Research Scholarship (IPRS) for the entire tenure of her PhD (2011–2015) at Macquarie University.

# ACKNOWLEDGMENTS

We are grateful to Prof. Stephen Crain for his valuable comments, Kelly Rombough, Cory Bill, and Anna Notley for experimental assistance, and Dr. Jon Brock for his guidance and help on recruitment and ethical issues. Special thanks to all children, schools and families who willingly participated and to Ms. Deepa Richmond for helping us establish contact with the special school in Melbourne. We also acknowledge the support received from the Australian Research Council Centre of Excellence in Cognition and its Disorders (CE110001021) www.ccd.edu.au.

<sup>11</sup>Our sample of HFA children had a low mean SS (76.58) on the TROG while the HFA children examined by Janke and Perovic (2015) had a mean score of 91.73 and those examined by Perovic et al. (2013b) had a mean score of 94.50 on the TROG. However, our children were chronologically younger than those examined by Janke and Perovic (2015) and Perovic et al. (2013b).

<sup>12</sup>Our choice of methodology (a dynamic version of the TVJT) may also have been optimal (see Sanoudaki and Varlokosta, 2015, who demonstrated task effects for the interpretation of Greek strong pronouns).

# REFERENCES

American Psychiatric Association (2000). Diagnostic and Statistical Manual of Mental Disorders, 4th Edn. Washington, DC: American Psychiatric Association.

Bartolucci, G., Pierce, S. J., and Streiner, D. (1980). Cross-sectional studies of grammatical morphemes in autistic and mentally retarded children. J. Autism Dev. Disord. 10, 39–50. doi: 10.1007/BF02408431

Bishop, D. V. M. (2003). Test for Reception of Grammar Version 2. London: Psychological Corporation.

Bishop, D. V. M., Chan, J., Adams, C., Hartley, J., and Weir, F. (2000). Conversational responsiveness in specific language impairment: evidence of disproportionate pragmatic difficulties in a subset of children. Dev. Psychopathol. 12, 177–199. doi: 10.1017/S0954579400002042

Boucher, J. (2009). The Autistic Spectrum: Characteristics, Causes and Practical Issues. London: Sage Publications.

Braine, M. D. S., and Rumain, B. (1981). Development of comprehension of "or": evidence for a sequence of competencies. J. Exp. Child Psychol. 31, 46–70. doi: 10.1016/0022-0965(81)90003-5


Study," in Advances in Language Acquisition, Proceedings of GALA 2011, eds S. Stavrakaki, M. Lalioti, and P. Konstantinopoulou (Newcastle: Cambridge Scholars Publishing), 472–482.

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Khetrapal and Thornton. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Contrasting Complement Control, Temporal Adjunct Control and Controlled Verbal Gerund Subjects in ASD: The Role of Contextual Cues in Reference Assignment

Vikki Janke<sup>1</sup> \* and Alexandra Perovic<sup>2</sup>

<sup>1</sup> Department of English Language and Linguistics, Rutherford College, School of European Culture and Languages, University of Kent, Canterbury, UK, <sup>2</sup> Division of Psychology and Language Sciences, Department of Linguistics, University College London, London, UK

### Edited by:

Anna Gavarró, Autonomous University of Barcelona, Spain

### Reviewed by:

William Snyder, University of Connecticut, USA; Ana Lúcia Santos, Universidade de Lisboa, Portugal

> \*Correspondence: Vikki Janke v.janke@kent.ac.uk

### Specialty section:

This article was submitted to Language Sciences, a section of the journal Frontiers in Psychology

Received: 02 December 2016 Accepted: 09 March 2017 Published: 28 March 2017

### Citation:

Janke V and Perovic A (2017) Contrasting Complement Control, Temporal Adjunct Control and Controlled Verbal Gerund Subjects in ASD: The Role of Contextual Cues in Reference Assignment. Front. Psychol. 8:448. doi: 10.3389/fpsyg.2017.00448

This study examines two complex syntactic dependencies (complement control and sentence-final temporal adjunct control) and one pragmatic dependency (controlled verbal gerund subjects) in children with ASD. Sixteen high-functioning (HFA) children (aged 6–16) with a diagnosis of autism and no language impairment, matched on age, gender and non-verbal MA to one TD control group, and on age, gender and verbal MA to another TD control group, undertook three picture-selection tasks. Task 1 measured their base-line interpretations of the empty categories (ec). Task 2 preceded these sentence sets with a weakly established topic cueing an alternative referent and Task 3 with a strongly established topic cueing an alternative referent. In complement control (Ron persuaded Hermione ec to kick the ball) and sentence-final temporal adjunct control (Harry tapped Luna while ec feeding the owl), the reference of the ec is argued to be related obligatorily to the object and subject respectively. In controlled verbal-gerund subjects (VGS) (ec Rowing the boat clumsily made Luna seasick), the ec's reference is resolved pragmatically. Referent choices across the three tasks were compared. TD children chose the object uniformly in complement control across all tasks but in adjunct control, preferences shifted toward the object in Task 3. In controlled VGSs, they exhibited a strong preference for an internal-referent interpretation in Task 1, which shifted in the direction of the cues in Tasks 2 and 3. HFA children gave a mixed performance. They patterned with their TD counterparts on complement control and controlled VGSs but performed marginally differently on adjunct control: no TD groups were influenced by the weakly established topic in Task 2 but all groups were influenced by the strongly established topic in Task 3. HFA children were less influenced than the TD children, resulting in their making fewer object choices overall but revealing parallel patterns of performance. In this first study of three sub-types of control in ASD, we demonstrate that HFA children consult the same pragmatic cues to the same degree as TD children, in spite of the diverse pragmatic deficits reported for this population.

Keywords: autism, syntax, pragmatics, control, language development, language impairment

# INTRODUCTION

If a lay person is asked to consider which aspect of communication causes most difficulty to individuals with autism spectrum disorder (ASD), first and foremost, their thoughts will go to that aspect of language that uses context and real world knowledge to establish intended meanings, known in the linguistics field as pragmatics. Indeed, when summarizing the principal language problem in ASD, current textbooks continue to describe pragmatics as being the most pervasive, whilst slowly recognizing that varying syntactic deficits occur in this heterogeneous population, too (see Cummings, 2016). The term pragmatics, however, is used to cover an enormous range of skills, including the ability to understand non-literal meanings, such as those used in metaphor, irony and humor (see Ozonoff and Miller, 1996; Dennis et al., 2001; MacKay and Shaw, 2004; Martin and McDonald, 2004; Norbury, 2005; Rundblad and Annaz, 2010), the socially-based ability to listen and respond appropriately in conversational exchanges (see Tager-Flusberg and Anderson, 1991; Boucher, 2009) but also knowledge of how to make use of contextual information when encountering sentences that have more than one interpretation. This can include resolution of a structural ambiguity, such as in (1), where depending on the attachment site of the adjunct, either argument could be understood as being in possession of the stick. Alternatively, the choice might stem from a referential underspecification, as in (2), where the agent of the infinitival verb in the bracketed clause could be equated with the sentential argument (i.e., Luna), or someone else entirely, even if in the absence of further context, we are first drawn toward the so-called "sentence-internal referent" interpretation.


On the basis of these few examples, we can see that a very broad range of skills are covered by this umbrella term. Often, a distinction is drawn between complex pragmatic tasks that require a person to go beyond the literal meaning, such as in irony, and those that only require one to reach a literal meaning that is contextually determined, as seen in reference assignment. The former is sometimes referred to as secondary pragmatics and the latter as primary (see Recanati, 2004). It is on the latter type that this study focuses, together with syntactic competence. We compare the degree to which typically developing (TD) children and children diagnosed as HFA consult contextual cues when interpreting sentences that contain an underspecified term, whose reference depends upon another, fully specified term. This fully specified term may be in the same sentence, in which case it is a linguistic antecedent, but it may also occur outside of the sentence, in which case it is a discourse antecedent. Our aim is to establish if attendance to contextual cues differs in these populations when they engage in the task of reference assignment.

The sentences we focus on are called control constructions, which include a range of sentences whose interpretations are regulated syntactically or pragmatically. Prototypical examples of two sub-types of syntactically regulated control can be seen in (3) and (4), where in both cases the interpretation of the understood agent (represented as ec for empty category) is restricted to a unique interpretation (see Williams, 1980; Landau, 2013). In (3), which is an example of complement control (CC), the agent of the verb in the complement clause must be the matrix object (i.e., Hermione), whereas in (4), which is an example of sentence-final temporal adjunct control (AC), the agent of the verb in the adjunct clause is interpreted as the matrix subject (i.e., Harry) by most people.


The syntactic nature of the relation between the antecedent (i.e., the element which controls the ec's interpretation) and the ec in complement control becomes clear if we illustrate the sentential restrictions on the ec's interpretation. Example (5) shows that the antecedent must come from within the sentence, that it must be local, and that it needs to be higher in the structure than the ec (see Williams, 1980; Manzini, 1983; Hornstein, 2001). In (5a), for example, only Hermione can be interpreted as the agent of kick. The indices also show that a sentence-external referent is not permitted and that the subject, though sentence-internal, cannot control the ec, it not being the most local contender<sup>1</sup>. (5b) demonstrates the structural superiority requirement, where only Hermione's cousin (and not Hermione) can be the ec's antecedent, since only the whole possessive DP c-commands into the infinitival clause.

	- b. Ron persuaded Hermione's cousin<sup>i</sup> [ec<sup>i</sup> to kick the ball].

The ec in sentence-final temporal adjunct control has long been reported as similarly restricted in terms of the syntactic antecedent it can take. It does not permit external referents, as shown in (6a), and its antecedent must also c-command it, as illustrated in (6b). (6a) also suggests that an object-oriented reading of the ec is barred. The adjunct, not having been selected by the matrix verb, is free to attach high, where only the subject c-commands it (see Landau, 2013). This high attachment also makes the subject the most local.

(6) a. Harry<sup>1</sup> tapped Luna<sup>2</sup> [while [ec<sup>1/∗2/∗3</sup> reading the book]].

	- b. Harry's cousin<sup>1</sup> tapped Luna [while [ec<sup>1</sup> reading the book]].

**Abbreviations:** AC, sentence-final temporal adjunct control; ALN, autism language normal; ASD, autism spectrum disorders; BPVS, British Picture Vocabulary Scales; CA, chronological age; CC, complement control; HFA, high functioning autism; KBIT, Kaufman Brief Intelligence Test; MA, mental age; RS, raw score; SS, standard score; TD, typical development or typically developing; TROG, Test of Reception of Grammar; VGS, controlled verbal-gerund subject.

<sup>1</sup>There is a well-known exception in double-complement control, namely the double-complement subject control construction, exemplified most often by the verb "promise," where the subject controls the ec (John<sup>1</sup> promised Peter ec<sup>1</sup> to write the letter). This construction is acquired late (Chomsky, 1969) and is not fully accepted by all speakers (see Janke and Perovic, 2015, for a first study of HFA children's interpretation of this construction). In this article, we focus only on double-complement object control.

In juxtaposition to these syntactically regulated examples of control are pragmatically regulated ones, which admit variable reference. In the controlled verbal-gerund subject (VGS) below, which we met first in (2) above, the agent of the verb could be the sentential argument or someone else, although in the absence of context, the sentential argument is the preferred choice of most child and adult speakers (see Janke, 2016; Janke and Bailey, 2017 respectively). The fact that this example also permits an external-referent reading demonstrates the absence of the syntactic restrictions we saw for complement and sentence-final temporal adjunct control above.

### (7) [ec? Rowing the boat clumsily] made Luna seasick.

Typically developing children start producing complement control sentences quite early, namely from 3 years, but comprehension studies have shown that for a short while after this, their interpretation of the ec is not fixed (Eisenberg and Cairns, 1994). From about 5–6 years, however, the majority of children restrict their interpretations in complement control to the object, even in the presence of pragmatic leads that cue subject interpretations (Lust, 1987; Cohen Sherman and Lust, 1993; Janke, 2016). This is in contrast to their interpretations of overt pronouns, for example, for which they do consult leads when determining who the pronouns refer to (see Cohen Sherman and Lust, 1993). Sentence-final temporal adjunct control occurs in production later than complement control (see Broihier and Wexler, 1995) and for a few years, some children accept subject, object and external-referent interpretations of them (see Lust et al., 1986; McDaniel et al., 1990/1991; Goodluck and Behne, 1992). However, by the age of about seven, nonsubject interpretations are very rare in the absence of pragmatic leads (see Hsu et al., 1989). More recently, Janke (2016) demonstrated that children aged between 6;9 and 11;3 do in fact permit object interpretations when that object is cued by a strongly established topic. The same result was found in a comparable study on 70 adults (Janke and Bailey, 2017). They did not, however, accept external-referent interpretations under the same amount of discourse pressure. This is important as it demonstrates that the fragility of this particular sub-type of adjunct control is restricted to sentence-internal arguments and so not to be confused with a pragmatically regulated control relation, such as controlled VGSs. Controlled VGSs themselves have been studied less, and the studies that exist report mixed results.
Adler (2006) and Goodluck (1987), using a truth-value judgment task and an act-out task respectively, found a preference for the sentence-external referent in children under six. In contrast, Janke (2016), which used a picture-selection task, reported that children from six onwards demonstrated a strong preference for sentence-internal referent interpretations, a preference which could be altered when the critical sentences were cued with pragmatic leads. There is variability, however, in children's and adults' referent choices in these constructions, which is expected in a pragmatically regulated relation.

An interesting question regarding these three sub-types of control and (language in) ASD is whether or not children with ASD would converge on the same referential choices that typical populations do, or whether idiosyncrasies in the cognitive profile of individuals with ASD could influence their interpretations of linguistic constructions for which both syntactic and pragmatic proficiency is required. One type of executive function skill, namely that of cognitive flexibility, has been argued to be linked to obsessive and repetitive behaviors in ASD (e.g., South et al., 2007), and possible pragmatic deficits (Kissine, 2012). Deficiencies in cognitive flexibility, or the "ability to shift to different thoughts or actions depending on situational demands" (Geurts et al., 2009, p. 74), could certainly result in different patterns of interpretation of pragmatically regulated control constructions for children with ASD compared to TD controls, though these may not be relevant to interpretation of syntactically regulated constructions, such as complement control and sentence-final temporal adjunct control.

Complement control and temporal adjunct control are syntactically regulated relations that involve a CP-layer (their infinitival clauses are CPs, see Chierchia, 1984) and so are examples of complex syntax. There are mixed results in the literature as to whether children with ASD are fully proficient at this level of grammar. In the first two studies on this construction in autism, Janke and Perovic (2015) and Janke and Perovic (2016) showed that regular complement control caused no interpretative difficulties in two different populations of high-functioning children with autism (HFA). However, other examples of complex syntax may be compromised in this subgroup. Perovic et al. (2007), for example, reported on a group of HFA children having difficulty with raising constructions (e.g., Homer seems to Bart t to be wearing a hat), which are traditionally analyzed as instances of A-movement, where the argument is interpreted in a different (argument) position from which it originated. Constructions involving other types of movement (i.e., A-bar movement, where the argument moves to a non-argument position), such as relative clauses, have also been found to cause difficulty in some populations with ASD (Riches et al., 2010; Durrleman et al., 2016). It seems then, that syntactic relations that involve displacement can be compromised in some HFA populations, whereas those that do not are spared (see Janke and Perovic, 2015). Perovic et al. (2013a,b), for example, reported that reflexive binding caused no problems for HFA children classified as ALN (autism with normal language). Reflexive binding is a relation that does not incorporate movement and shares many other syntactic properties with obligatory control (see Manzini, 1983; Koster, 1987; Janke, 2008).

Unlike complement control, the interpretation of controlled VGSs depends heavily on the context. The examples below demonstrate this point. In (8a), as we saw above, there is a strong inclination to interpret the sentence-internal referent as the ec's antecedent. However, this is not fixed, as evidenced by the manner in which our interpretations can change in (8b and c). (8b) provides a "weakly established topic," in that the introductory sentence promises to make Ron the topic of the forthcoming sentence (see Janke and Bailey, 2017). The ec in the following sentence is a discourse-anaphoric element so it can take its reference from this weakly established topic. The example in (8c) demonstrates a stronger cue, utilizing a "strongly established topic." In this example, the first sentence is about Ron, thereby making this DP the sentence topic, and the person Ron refers to is elaborated on and continues as the topic of discourse in the following sentence. In TD children and adults, these topics are very persuasive. The weakly established topic switches the majority of participants' referent choices toward it and the strongly established topic does so nearly uniformly (Janke, 2016).

	- b. Let me tell you something about Ron. ec Rowing the boat clumsily made Luna seasick.
	- c. Ron is taking a trip onto Hogwarts lake. Ron takes hold of the wood oars. ec Rowing the boat clumsily made Luna seasick.

Janke and Perovic (2016) tested a group of HFA children on controlled VGSs and found that the children showed a similar level of susceptibility to the pragmatic leads as their control children. This is in contrast to the widely established view that all pragmatics in ASD is deficient: their results suggest that pragmatic skills relevant to the selective and appropriate use of context to decide who is being spoken about in undetermined circumstances are functioning well in this population. Importantly, both the TD and HFA children ignored topics in sentences preceding complement control, as in (9a–c), and so chose the object uniformly.

	- b. Let me tell you something about Harry. Harry told Luna to pop the balloon.
	- c. Harry is performing a new trick. Harry takes out a pin. Harry told Luna to pop the balloon.

This stable pattern is expected because as the product of a syntactically regulated relation, the ec in complement control should resist outside interference, which is exactly what this paradigm revealed. What is not yet known, however, and is a question that the current paper will address, is how HFA children respond to topics that cue the object in temporal adjunct control, as in (10) below.

	- b. Let me tell you something about Luna. Harry tapped Luna while feeding the owl.
	- c. Luna is looking after the birds for the day. Luna takes out the bird seed. Harry tapped Luna while feeding the owl.

This sub-type of control has not been examined in ASD before so by conducting this first analysis on sentence-final temporal adjunct control we can provide an important contribution to the growing portrait of complex syntactic abilities in this population. However, there is another reason for this construction being an interesting topic to examine in children with autism, which relates to work that has revealed a lenience it exhibits in terms of the interpretations its ec permits. Recent experimental work on this sub-type of adjunct control has indicated that children's and adults' interpretations of the ec are not quite as previously assumed in the literature (see Janke, 2016, for children and Janke and Bailey, 2017, for adults). Using the aforementioned pragmatic lead paradigm, participants were asked to make referent-choice decisions in different sub-types of control which were preceded by no contextual cue, a weak contextual cue or a strong contextual cue. The results revealed temporal adjunct control not to be rigidly subject-oriented. Specifically, although a weak contextual cue toward the object had little or no effect on referent choices, a strong contextual cue toward the object resulted in a significant rise in object choices in both adults and children (aged 6;9–11;3). This was in stark contrast to their choices in complement control and control sentences, which remained uniform across every condition. Importantly, the fragility that adjunct control displayed in terms of its interpretation was also markedly different from pragmatically regulated control, as shown above in (8), which was also tested. In this instance, the cue determined referent choice definitively.

On the basis of these results, Janke and Bailey (2017) presented an analysis for this type of sentence-final adjunct control which could reflect these seemingly conflicting properties: Unlike complement control, which remains resilient to pragmatic cues, pragmatic cues preceding temporal adjunct control result in a significant number of children and adults adopting object interpretations. This generally only occurs under strong discourse pressure and not all participants are persuaded by the cue<sup>2</sup>. However, unlike controlled VGSs, interpretations in this type of adjunct control are restricted to within the sentence, cause nothing like the degree of interpretative shift seen in VGSs, and are renowned for not permitting generic interpretations, one of the hallmarks of a pragmatically regulated control relation. Thus, a structural analysis was proposed, which could account for the evident interpretation shift, yet not lose sight of the syntactic properties that this sub-type of control displays, namely the requirement that the ec has a sentence-internal, structurally dominant antecedent. Before we turn to the relevant sentences, note first that sentences with adjuncts are conventionally analyzed as having multiple attachment sites for the adjunct. This flexibility accounts for them not being restricted to a single interpretation, as illustrated in (11), where either the subject or the object can be linked to the prepositional phrase.

(11) The angry man chased the boy with a big stick.

When linked to the object, the adjunct attaches inside the VP within the domain of the object (see Larson, 2004) but when linked to the subject, the adjunct attaches higher, at the VP level, which is within the subject's structural domain. If we return now to sentence-final temporal adjunct control, a similar rationale can be used to account for the interpretations this construction permits. It is well established in the literature that the most popular interpretation of temporal adjunct control is one in which the subject is equated with the ec. On this parse, the adjunct adjoins at VP-level, as in (12), and only the subject c-commands into it so only a subject-oriented reading of the ec is possible.

<sup>2</sup> Interestingly, even in the "no cue" condition, complement control and temporal adjunct control result in slightly different outcomes: whereas complement control proves to be resolutely object-oriented, responses in temporal adjunct control, although predominantly subject-oriented, do display some variation. There is also a minority of adults that demonstrates a preference for an object-oriented reading in this base-line condition.

This structure accounts for the many people that prefer the subject-oriented reading but it cannot capture the grammar of speakers who permit an object-oriented reading under the discourse pressure generated by the strongly established topic. This is because the object does not c-command into the CP. However, by utilizing an analysis proposed independently for English VP structure in Janke and Neeleman (2012), Janke and Bailey (2017) proposed that speakers who allow an object-oriented reading permit the adjunct to attach low, merging directly with the verb, as in (13) (see also Larson, 2004). A consequence of this low attachment is that a VP-shell must be generated because in English, a verb must be left-adjacent to an argument that is dependent on it for accusative case (see Janke and Neeleman, 2012, for a full account). With a VP-shell configuration, both arguments c-command into the adjunct but the object is most local. On this parse, then, only an object reading of the ec is available.

Note that it is because adjuncts allow more than one structure that this choice between two sentence-internal referents is possible: when the syntax provides more than one structural configuration, pragmatics can influence the way in which the string is parsed. In contrast, in complement control, only the VP-shell structure is available since control verbs select a CP that is obligatorily merged as the verb's complement (see Larson, 1991).
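The contrast between the two attachment sites can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions: the two trees are simplified stand-ins that we introduce here (they are not the structures given in (12) and (13)), c-command is computed as "the parent of A dominates B and A does not dominate B," and "most local" is approximated as the most deeply embedded c-commanding candidate. All function names are hypothetical.

```python
# Toy constituency trees as nested tuples: (label, child, child, ...).
# Leaves are plain strings. Both trees are simplified assumptions made
# for illustration only; they are not the trees in the original paper.

# High attachment: the adjunct adjoins to VP, above the object.
HIGH = ("TP",
        "Harry",
        ("VP",
         ("VP", "tapped", "Luna"),
         ("AdjunctCP", "while", "ec feeding the owl")))

# Low attachment: the adjunct merges with the verb inside a VP-shell.
LOW = ("vP",
       "Harry",
       ("v'",
        "tapped",
        ("VP",
         "Luna",
         ("V'", "t_V", ("AdjunctCP", "while", "ec feeding the owl")))))


def find_path(tree, target, path=()):
    """Return the child-index path to the first node labelled `target`."""
    label = tree if isinstance(tree, str) else tree[0]
    if label == target:
        return path
    if isinstance(tree, str):
        return None
    for i, child in enumerate(tree[1:]):
        found = find_path(child, target, path + (i,))
        if found is not None:
            return found
    return None


def c_commands(tree, a, b):
    """True if the parent of `a` dominates `b` and `a` does not dominate `b`."""
    pa, pb = find_path(tree, a), find_path(tree, b)
    if pa is None or pb is None or not pa:
        return False  # missing node, or `a` is the root
    parent = pa[:-1]
    parent_dominates_b = pb[:len(parent)] == parent and len(pb) > len(parent)
    a_dominates_b = pb[:len(pa)] == pa
    return parent_dominates_b and not a_dominates_b


def closest_controller(tree, candidates, target):
    """Pick the most deeply embedded candidate that c-commands `target`."""
    commanders = [c for c in candidates if c_commands(tree, c, target)]
    if not commanders:
        return None
    return max(commanders, key=lambda c: len(find_path(tree, c)))


for name, tree in (("high", HIGH), ("low", LOW)):
    print(name, closest_controller(tree, ["Harry", "Luna"], "AdjunctCP"))
```

Under these assumptions the high-attachment tree yields the subject as the only available controller, while the low-attachment tree yields the object as the most local one, mirroring the two parses described in the text.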

One important dimension to the temporal adjunct control data pattern is that these two interpretations are not equally favored. It therefore remains to account for why the subject parse is so much preferred over the object one. Janke and Neeleman (2012) argued that VP-shell formation is subject to a principle of economy, where a structure with no movement is more economical than one with movement:

(14) Economy


(Janke and Neeleman, 2012: exx. (6))

The relevance of this analysis for current purposes is that a tree with no movement (i.e., no VP-shell) is more economical than a tree with movement (i.e., with a VP-shell) so the former structure should be highly preferred over the latter. Applying this to temporal adjunct control provides a means of modeling the data pattern observed, namely the overwhelmingly strong preference for subject interpretations. For full motivation of this account, the reader is referred to the original text. The important point for the current purposes is that it predicts subject-oriented adjunct control to be the highly preferred structure yet allows interpretations to change to the object under severe discourse pressure. In contrast, since complement control has an unambiguous structure, interpretations should not budge at all, and this is the precise pattern found in TD children and adults. It remains now for us to explore how HFA children perform on this construction, namely whether or not they will show the same initial preference for a subject interpretation, and whether preceding sentences that establish the matrix object as the topic of discourse will lead them to adopt the alternative, less economical, parse.

With the interpretative patterns of these three sub-types of control in place, we now return to what the current study will test and the outcomes that might be predicted from our children with ASD. With respect to complement control, we can formulate the following hypothesis:

### Hypothesis 1

If the syntax underlying complement control is unimpaired in HFA, all three groups' interpretations of the ec should pattern together, remaining uniformly object-oriented across the three conditions: the condition in which there is no cue, the condition in which the subject is cued by a weakly established topic and the condition in which the subject is cued by a strongly established topic.

Such a result would serve to further corroborate the previous studies' findings by replicating them. But of further importance is that it can contextualize our assessment of the children's attention to pragmatic cues in examples of control that are pragmatically regulated in adults, namely the controlled VGSs. If the children were persuaded by the topics in infelicitous circumstances (i.e., in complement control), then their liberal use of them in pragmatically regulated constructions would be less informative. If, however, they are ignoring topics when they are irrelevant, we have a clearer window through which to examine their pragmatic development.

Our predictions with regard to the controlled VGSs in the relevant conditions are as follows:

### Hypothesis 2


Our predictions with regard to temporal adjunct control are more tentative. Using work on complex syntax in ASD and sentence-final temporal adjunct control in TD as a gauge, we can form the following predictions:

### Hypothesis 3


# METHOD

# Participants

This study was carried out in accordance with the recommendations of the University of Kent's Research Ethics Committee, which approved the study (ID: 20101584). All participants gave written informed consent in accordance with the Declaration of Helsinki.

Sixteen children (4 girls) aged 6-16, with a confirmed clinical diagnosis of ASD, attending primary and secondary schools in Kent and greater London were recruited for the study. Four children with ASD were excluded for not being able to complete the testing battery, while for one participant an incomplete experimental battery is available. No participants had any hearing impairments, neurological or genetic deficits and they were monolingual native English speakers. Two groups of children from Kent acted as control participants to the group with ASD, reported as typically developing by their respective schools' head teachers. One group was matched to the ASD group on the raw score of Matrices subtest of Kaufman Brief Intelligence Test (KBIT), TD KBIT group, and the other on the raw score of British Picture Vocabulary Scales 2 (BPVS-2), TD BPVS group. Details of each group's scores on standardized measures are given in **Table 1**.

# Materials

A two-choice picture-selection task from Janke (2016) and Janke and Perovic (2016) was employed. Four examples of control were included in the test battery but this report focuses on three: complement control, temporal adjunct control and controlled verbal gerund subjects<sup>3</sup>. For each trial, children were presented with two pictures and needed to choose the one that best matched the accompanying sentence. The sentence appeared at the bottom of the screen and was simultaneously presented auditorily through headphones. The sentences were recorded in a sound-proof booth, using the voice of a native-speaking female researcher not involved with the project, who maintained a nuclear stress throughout. Item presentation was randomized automatically for each participant, and location of the correct picture was balanced throughout (left or right) as were the figures in the pictures. Four characters from the Harry Potter books (Harry, Ron, Hermione, and Luna) were used. In addition to the three critical sentence types, six control sentence sets were included. The first was a simple SVO sentence set, which checked that children could follow the reasoning of the task and the second was an SVO embedded sentence. The third tested knowledge of "while." The fourth cued an incorrect interpretation of an SVO sentence with a weakly established topic, which tested whether children ignored a contextual cue for a sentence whose set interpretation is uncontroversial. The fifth cued an incorrect interpretation of an SVO sentence with a strongly established topic, which tested the same phenomenon but under still stronger pressure. Finally, the sixth tested understanding of cause relevant to the VGS condition.
There were six trials in each condition, with three critical sentence types (complement control CC, adjunct control AC, and controlled verbal gerund subjects VGS) in three different conditions (no cue, weak cue, strong cue<sup>4</sup>) together with six control conditions (SVO, SVO\_emb, while, SVO\_WC, SVO\_SC, cause), culminating in 102 sentences for each child. These sentences were distributed across three tasks, where they were divided according to the presence or absence of a cue: Task 1 presented the constructions with no cue, Task 2 preceded the constructions from Task 1 with a weakly established topic (weak cue), and Task 3 preceded them with a strongly established topic (strong cue). The order of the task presentation was pseudo-randomized (more details in Section Procedure below).

Measures in bold are those on which the groups were matched. SS, Standard Scores; KBIT, Kaufman Brief Intelligence Test, Matrices subtest; BPVS, British Picture Vocabulary Scales 2.

<sup>3</sup>Another type of non-obligatory control tested in the same battery is reported on in separate work.

<sup>4</sup>Note that the VGS construction was cued in two directions in both cued conditions, namely toward the internal referent and toward the external referent. This means there are 12 trials for this construction in each of these conditions, not 6.
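The total of 102 sentences can be checked directly from the design described above. The following Python sketch simply recounts the trials; the variable and function names are ours, and the doubling of VGS trials in the cued conditions follows the footnote on the VGS construction.

```python
# Recount the number of sentences per child from the design in the text.
TRIALS_PER_CELL = 6

critical = ["CC", "AC", "VGS"]
conditions = ["no cue", "weak cue", "strong cue"]
controls = ["SVO", "SVO_emb", "while", "SVO_WC", "SVO_SC", "cause"]


def trials(construction, condition):
    """Trials in one construction-by-condition cell."""
    # VGS is cued in two directions in each cued condition,
    # so those two cells hold 12 trials rather than 6.
    if construction == "VGS" and condition != "no cue":
        return 2 * TRIALS_PER_CELL
    return TRIALS_PER_CELL


critical_total = sum(trials(c, k) for c in critical for k in conditions)
control_total = TRIALS_PER_CELL * len(controls)
total = critical_total + control_total

print(critical_total, control_total, total)  # 66 36 102
```

The critical sentences contribute 66 trials (54 plus the 12 additional VGS trials) and the control conditions 36, giving 102.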

# Sentence Types

In this section, we illustrate examples of each construction tested, namely complement control, temporal adjunct control, controlled verbal gerund subjects, and the six control conditions. The complete set can be found in the Appendix (Supplementary Material).

For complement control, the matrix verbs were persuade, order and tell and the verbs in the controlled clauses were kick, mix and wave respectively. The picture corresponding to the correct interpretation depicted the character represented by the matrix object engaged in an action, while the character represented by the matrix subject stood by. The foil showed the matrix subject engaging in the action. For the examples below, the corresponding picture showed Ron kicking the ball, with Hermione standing next to him, and the foil showed Hermione kicking the ball, with Ron standing next to her.

	- a. Hermione persuaded Ron ec to kick the ball.
	- b. Let me tell you something about Hermione. Hermione persuaded Ron ec to kick the ball.
	- c. Hermione is learning a new game. Hermione practises the rules. Hermione persuaded Ron ec to kick the ball.

For temporal adjunct control, the matrix verbs were tap, kiss and lift and the verbs in the controlled clause were feed, fly and drink. The picture corresponding to a subject interpretation of the ec depicted the character represented by the matrix subject engaged in an action, while the character represented by the matrix object stood by. In the alternative picture, the matrix object engaged in the action. For the sentences below, the picture aligned with a subject interpretation had Harry tapping Luna with Harry feeding the owl, and the picture aligned with an object reading had Harry tapping Luna with Luna feeding the owl.

	- a. Harry tapped Luna while ec feeding the owl.
	- b. Let me tell you something about Luna. Harry tapped Luna while ec feeding the owl.
	- c. Luna is looking after the birds. Luna takes out the food. Harry tapped Luna while ec feeding the owl.

For controlled VGSs, the main verbs used were pour, read and row.

	- b. Let me tell you something about Ron. ec Reading the book slowly made Hermione sleepy.
	- c. Ron is looking up a spell. Ron says each word carefully. ec Reading the book slowly made Hermione sleepy.

For the first control condition, which was an SVO sentence in the progressive, the corresponding picture showed the subject engaged in the activity, whereas the foil depicted an unmentioned character as the agent. In the example below, the correct picture showed Harry mixing the flour with Hermione standing next to him and the foil showed the reverse.

(17) SVO Control Sentence Example Harry is mixing the flour.

In the "while" control condition, as illustrated in (18), the corresponding picture showed both characters engaging in the actions described. In the foils, only one of the characters is engaged in the relevant activity while the other stands by passively. For half the trials, the character not meeting the description was in the main clause and for the other half this was the character in the embedded clause.

(18) While Control Sentence Example Hermione is feeding the owl while Harry is waving the wand.

The control condition for the weakly established topic consisted of an embedded SVO sentence preceded by a weakly established topic. In the correct picture for (19), Ron is drinking the potion and Hermione is standing next to him. In the foil, Hermione is drinking the potion.

(19) Weakly Established Topic SVO Control Sentence Example Let me tell you something about Hermione. Hermione said that Ron is drinking the potion.

The control condition for the strongly established topic preceded an SVO sentence with a strongly established topic. For (20), in the correct picture, Harry is waving the wand with Luna standing nearby and in the foil, the reverse occurs.

(20) Strongly Established Topic SVO Control Sentence Example Luna is learning a difficult spell for a class test. Luna says the magic words slowly. Harry is waving the wand.

The fifth control sentence was applicable to VGS in that it tested understanding of causation such as in (21). In the correct picture, Hermione was pouring water and spilling it over herself with Ron standing by, whereas in the foil, Ron was pouring the water and spilling it on Hermione.

(21) The water made Hermione wet.

Finally, an embedded SVO control sentence was included. In the correct picture the subject of the embedded clause was engaged in the action (Ron in the example below) and in the foil, the matrix subject (Hermione) was the agent of the activity.

(22) Hermione said that Ron is drinking the potion.

# Procedure

Administration of the three tasks and the standardized assessments (BPVS II; KBIT; TROG-2) occurred over three testing sessions, each lasting between 30 and 40 min. BPVS II, KBIT and the first experimental task were administered in the first session, whereas in the second and third session, the order of the second and third experimental task was randomized for each child. TROG was administered either in the second or third session. For participants with ASD, if the child showed poor performance on control conditions (e.g., SVO, SVO\_embedded) in the first experimental task, the remaining experimental tasks were not administered and the child's data were not included in the analysis; this was the case for four children.

Experimental stimuli were presented on a laptop and randomized by computer software. Prior to the trial, children were shown pictures of the characters engaged in various activities and told their names. They were asked to point to each of the characters the experimenter named and to identify various activities occurring in the pictures, for example, "Show me Luna is popping the balloon" and "Show me Ron is reading the book." All the children succeeded with this phase. They were then told that they would be shown two pictures and see and hear a sentence describing the pictures. After the sentence had finished playing, they needed to choose the picture they thought went best with the sentence. The children made their choice by clicking on one of the large tabs by each picture, which appeared once the sentence had played, preventing them from making a premature choice. The children received a book voucher as a 'thank you' for taking part.

# RESULTS

# Results on the Control Conditions

All children performed at ceiling on the control conditions (see **Table 2**). These scores were not analyzed further due to ceiling effects.

# Results on Complement Control (CC) and Temporal Adjunct Control (AC)

A generalized linear mixed model (GLMM) executed in SPSS 22 was used to analyse the data for the CC and AC constructions. Fixed effects entered into the model were Group (ASD, TD\_KBIT, TD\_BPVS), Construction (CC and AC), Condition (no cue, weak cue and strong cue), and the Group∗Condition∗Construction interaction.

The model showed significant main effects of Construction F(1, 1338) = 422.45, p < 0.001, Condition F(2, 1338) = 6.066, p = 0.002, and a significant Group∗Condition∗Construction interaction: F(12, 1338) = 3.841, p < 0.001. The main effect of Group was not significant: F(2, 1338) = 0.275, p = 0.759.

Estimated mean object responses for each group on the two constructions, across three conditions, are given in **Figures 1**, **2**, revealing strikingly different patterns on CC vs. AC for each of the three groups.

TABLE 2 | Mean correct responses in the control conditions.

On the CC construction, Sidak-corrected pairwise comparisons included in the model revealed no statistically significant group differences in any of the three conditions (no cue, weak cue or strong cue). In contrast, the groups showed different performance on the AC construction on some of the conditions (estimated mean responses for an easy comparison are given in **Table 3**). On the no cue condition, no difference between groups was observed: ASD vs. TD\_KBIT t(1338) = 0.522, p = 0.602; ASD vs. TD\_BPVS t(1338) = 1.5221, p = 0.240. On the weak cue condition, the difference between the ASD group and both control groups was significant: TD\_KBIT t(1338) = 2.526, p = 0.023 and TD\_BPVS t(1338) = 2.973, p = 0.009. On the strong cue condition, the differences almost reached statistical significance: ASD vs. TD\_KBIT t(1338) = 2.099, p = 0.071 and ASD vs. TD\_BPVS t(1338) = 2.381, p = 0.051. Comparisons of the performance of the two control groups, TD\_KBIT vs. TD\_BPVS, revealed no differences on any of the cues: no cue: t(1338) = 2.035, p = 0.121; weak cue: t(1338) = 0.449, p = 0.654; strong cue: t(1338) = 0.321, p = 0.748.

TABLE 3 | Estimated mean object responses on AC and CC across all conditions.

To get a better picture of within-group effects, within-group Sidak-corrected comparisons were carried out for each of the two constructions and across the three conditions. On the CC construction, none of the groups revealed differences between their performance on the three conditions (no cue, weak cue and strong cue). However, on the AC construction, there were significant within-group differences on individual conditions. All three groups showed a difference between no cue vs. strong cue on AC (p = 0.006 for ASD, and p < 0.001 for the two control groups). The ASD group showed no other differences: their performance on no cue vs. weak cue was not significantly different, and neither was weak cue vs. strong cue. The TD\_KBIT group showed a difference between weak vs. strong cue (p < 0.001), but not between no cue vs. weak cue. The TD\_BPVS group showed a difference between all cue levels: no cue vs. weak cue (p = 0.009) and weak vs. strong cue (p < 0.001).
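The Sidak correction applied to these pairwise comparisons can be sketched as follows. This is a minimal illustration of the standard formula, not the SPSS procedure used in the study; the function names and the p-values in the example are hypothetical, and the p-values reported above are already adjusted.

```python
def sidak_adjust(p_values):
    """Return Sidak-adjusted p-values for a family of m comparisons.

    Each unadjusted p is mapped to 1 - (1 - p)**m, where m is the
    number of comparisons in the family.
    """
    m = len(p_values)
    return [1.0 - (1.0 - p) ** m for p in p_values]


def sidak_alpha(alpha, m):
    """Per-comparison threshold that keeps the family-wise error rate at alpha."""
    return 1.0 - (1.0 - alpha) ** (1.0 / m)


# Hypothetical unadjusted p-values for three pairwise group comparisons
# (illustrative only, not taken from the study).
raw = [0.008, 0.020, 0.300]
adjusted = sidak_adjust(raw)

for p_raw, p_adj in zip(raw, adjusted):
    print(f"raw p = {p_raw:.3f} -> Sidak-adjusted p = {p_adj:.3f}")
```

Adjusting the p-values and comparing them to the nominal alpha is equivalent to comparing the unadjusted p-values to the stricter threshold returned by `sidak_alpha`.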

# Results on Verbal Gerund Subjects

The second GLMM analysis was run to examine the groups' performance on the VGS constructions, in 5 conditions, with Group (ASD, TD\_KBIT and TD\_BPVS) and Condition (No cue, Weak Cue Internal Referent, Weak Cue External Referent, Strong Cue Internal Referent, Strong Cue External Referent), and the Group∗Condition interaction included in the model.

The model showed no statistically significant effect of Group F(2, 40) = 0.209, p = 0.813, a highly significant effect of Condition F(4, 195) = 47.176, p < 0.001 and a significant Group∗Condition interaction, F(8, 242) = 2.732, p = 0.007. The estimated mean internal referent responses for each group across the five conditions are shown in **Figure 3**.

Pairwise Sidak-corrected analyses included in the model showed no significant differences between the performances of any of the three groups on any of the five conditions.

However, within-group comparisons of participants' performance on the five conditions revealed more significant comparisons in the ASD group than in the two control groups, which is what drove the significant Group∗Condition interaction. In the ASD group, except for the non-significant comparison of weak cue External Referent vs. strong cue External Referent [t(86) = 1.340, p = 0.184], children's performance on each condition differed significantly from their performance on every other condition (see **Table 4**), with p-values ranging from p = 0.024 to p < 0.001.

In the TD\_KBIT group, children's performance on no cue sentences was not significantly different when compared to weak or strong cue sentences involving Internal Referent, but was highly significantly different when compared to sentences involving External Referent, under both the weak cue [t(532) = 9.784, p < 0.001] and strong cue [t(197) = 7.344, p < 0.001].

TABLE 4 | Estimated mean internal referent responses on VGS across all conditions.

Int, Internal Referent; Ext, External Referent.

The weak cue Internal Referent condition did not differ from the strong cue Internal Referent condition [t(849) = 0.505, p = 0.851], but performance on sentences involving Internal Referent was highly significantly different from performance on all sentences involving External Referent: weak cue Internal Referent vs. weak cue External Referent [t(1,131) = 12.397, p < 0.001], weak cue Internal Referent vs. strong cue External Referent [t(250) = 9.176, p < 0.001]; strong cue Internal Referent vs. weak cue External Referent [t(673) = 11.587, p < 0.001], and strong cue Internal Referent vs. strong cue External Referent [t(185) = 10.198, p < 0.001]. The two External Referent conditions did not differ when compared to each other: weak cue External Referent vs. strong cue External Referent: [t(308) = 0.193, p = 0.851].

In the TD\_BPVS group, again, performance on the no cue sentences did not differ from performance on sentences involving Internal Referent, but was highly significantly different when compared to sentences involving External Referent, under both the weak cue [t(73) = 4.596, p < 0.001] and strong cue [t(68) = 4.420, p < 0.001].

Performance on the two conditions involving Internal Referent did not differ: weak cue Internal Referent vs. strong cue Internal Referent [t(228) = 1.245, p = 0.383], but performance on sentences involving Internal Referent was highly significantly different from performance on all sentences involving External Referent: weak cue Internal Referent vs. weak cue External Referent [t(596) = 16.036, p < 0.001], weak cue Internal Referent vs. strong cue External Referent [t(291) = 12.926, p < 0.001]; strong cue Internal Referent vs. weak cue External Referent [t(139) = 11.036, p < 0.001] and strong cue Internal Referent vs. strong cue External Referent [t(124) = 10.908, p < 0.001]. The group's performance on the two External Referent conditions did not differ: weak cue External Referent vs. strong cue External Referent: [t(1,000) = 1.041, p = 0.383].

# DISCUSSION

The aim of this study was to establish whether high-functioning children with autism respond differently from non-verbal and verbal MA-matched TD children when presented with contextual cues of different strengths on three sub-types of control: complement control, controlled verbal-gerund subjects and sentence-final temporal adjunct control. There were several main findings. First, children in all three groups demonstrated the same resilience to weakly and strongly established discourse topics in complement control. That is, they opted for the object interpretation consistently across all three conditions. Second, the HFA children's attendance to the topics in controlled VGSs was very similar to that of the TD groups. All three groups' referent choices were influenced by the topics. In the no cue condition, all groups showed a preference for the sentence-internal referent; however, this preference was stronger in the TD groups, which meant the HFA group had significantly fewer internal-referent responses than the typical groups in this condition. The weakly established topics generally had a very strong effect on all children's interpretations, whose referent choices were largely determined by the cue. The decisive influence of this weak cue meant that the effect of the strongly established topic was masked. The result was that in most cases, there was no further shift toward the cued referent in this condition. Third, the results for sentence-final temporal adjunct control showed the groups to be behaving very similarly in one respect yet slightly differently in another. In the no cue condition, the three groups performed on a par with each other, all illustrating overwhelming consensus for a subject-oriented interpretation of the ec.
In the condition that used a weakly established topic to cue the object, the TD groups' object choices remained stable relative to the choices made in the no cue condition, whilst the HFA group showed a small increase in accepting the object choices. In the condition that used a strongly established topic to cue the object, all groups' object choices increased significantly, yet the increase in the HFA group was smaller, resulting in the TD groups' number of object choices being somewhat greater than the HFA group's number of object choices, as illustrated in **Figure 1**.

We begin our discussion with the control items, before progressing to complement control and controlled VGSs, where we will indicate how these results relate to earlier literature on HFA children's performances on these constructions. After this, we turn to temporal adjunct control, where a number of possible explanations for these results are discussed.

Firstly, all children's performances on the control conditions were at ceiling. This meant that they could understand the task, comprehend embedded sentences, understand that "while" entailed that two people engaged in an action simultaneously, and grasp the basic cause-effect relation described in the VGS sentences, all over the course of three testing sessions lasting at least 25 min each. In addition, the conditions including pragmatic leads demonstrated that children could ignore infelicitous cues for sentences whose references are set.

Turning to complement control, we saw that there were no differences between the clinical and typical groups. All three groups, therefore, recognized the obligatory syntactic relation between the ec in the controlled complement and the object in the matrix clause. These results support the two earlier aforementioned studies on two different groups of HFA children (Janke and Perovic, 2015, 2016), both of whom gave object choices uniformly, too. Three studies culminating in the same pattern of results strongly support our argument that this example of syntax is unimpaired in HFA. The contribution of these results, namely that complement control has proven resilient to infelicitous cues, enables us to probe children's proficiency of pragmatically regulated constructions, confident that children are able to discern between terms whose references are regulated syntactically and terms whose references require attention to the context for their resolution.

Our next question was whether the HFA children's attention to contextual cues differs from that of TD children when assigning reference to the ecs in controlled VGSs. Firstly, in the no cue condition, although all groups demonstrated a preference for the internal referent, this preference was less pronounced in the HFA group, particularly in comparison to the TD-KBIT group. This result is important as it might account for the subtle differences between the populations in the subsequent conditions. Turning to the cueing of the internal referent first, when this was cued by a weakly established topic, the HFA group's internal referent choices rose significantly. When cued by a strongly established topic in this same direction, the HFA group's internal referent choices increased significantly once again. In the other two groups, however, although interpretations could be seen to shift (recall **Table 4**), the topics did not significantly raise internal-referent choices from the baseline. This difference in the cues' effects could be seen as a product of the HFA children's initial lower number of internal-referent choices, which allowed the cues to come into effect. With respect to the conditions which cued the external referent, all three groups showed the same pattern. In the condition which cued the external referent with a weakly established topic, all groups' internal referent choices decreased dramatically, so much so that the effect of this cue was strong enough to mask any influence of the strongly established topic. In this latter condition, internal referent choices did not decrease further for any of the groups. On this basis, we can conclude that the populations are responding in a remarkably similar way to the pragmatic leads. The results are also in line with those reported in Janke and Perovic (2016), where the HFA population also showed no difference in performance on this construction from their matched TD controls.

At this point, we have distinguished between HFA children's responses in two types of control, one of which is syntactically regulated, the other of which is pragmatically regulated. In both cases, children performed on a par with the TD children. The one difference between the TD and HFA children can be traced to the HFA children's slightly lower level of consensus for an internal referent in the no cue condition, so the hypothesis that HFA children would attend to the cue in a way that is not markedly different from TD children on either of these constructions can be upheld.

The final construction we turn to is sentence-final temporal adjunct control. Recall that this construction is not a prototypical example of obligatory control but neither does it have the signature properties of a pragmatically regulated type. Unlike in complement control, where the complement is sister to the verb, the adjunct is not selected by the verb. However, the ec in sentence-final temporal adjunct control does not permit external referents, unlike pragmatically regulated control relations. Let us first consider why all three groups of children's interpretations might have shifted from the baseline at all. Sentence-final temporal adjunct control has long been analyzed as strictly subject-oriented (see Landau, 2013), so the current results might not have been predicted. However, the introduction discussed recent experimental work on this sub-type of adjunct control which revealed that children's and adults' interpretations of the ec are not in fact uniformly subject-oriented. To recap, that work, using the same paradigm, showed that a strong pragmatic cue toward the object resulted in a significant consensus for object choices in adults and children aged from 6 to 11. Importantly, this pattern of results was very different from complement control (which remained unaffected by the cue) and also from VGSs (which were affected uniformly by the cues), thereby motivating an alternative account for this type of adjunct control. Specifically, it was shown how an independently motivated analysis of English VP structure (Larson, 2004; Janke and Neeleman, 2012) could be employed to account for the interesting data pattern that had emerged from the TD children and adults: The most economical structure was one where no VP-shell had been generated. This should, therefore, be the highly preferred structure when all else is equal.
When the tree is parsed in this way, only the subject interpretation is possible, as in (23), and this is indeed the highly preferred interpretation.

Under severe discourse pressure, however, such as that generated by a strongly established topic, an alternative parse is licit on this account. This less economical parse gives rise to a VP-shell, which leaves the object as the most local c-commanding antecedent of the ec, as repeated in (24). On this parse, only an object interpretation is syntactically licit, representing the TD and adult participants' switch to the object in this strongly cued condition.

If we return now to the current children's preferences in the adjunct control sentences which contained no cue, we can note that all three groups displayed the above pattern: they all showed a strong preference for the subject in the no cue condition, thereby adopting the most economical, and so highly preferred, parse. In the second condition that employed the weakly established topic, children with ASD already started to pay some attention to the cue, whereas TD groups still ignored it. In the third condition, all groups showed a significant shift toward the object, indicating that all three groups were consulting the cue, though the degree to which the cue was effective was slightly different: the HFA group's object choices were somewhat lower than those of the two TD control groups. The pattern of a gradual rather than a sudden increase in the HFA group across different cue strengths is a result which now needs to be replicated in a further study, but, crucially, indicates that children with ASD do consider these contextual cues in their interpretation of sentence-final adjunct control.

To conclude, given the widely reported pragmatic and syntactic deficits in populations with ASD, the relatively straightforward patterns observed in our sample of children point to similarities, rather than differences, in the linguistic profiles of high-functioning children with ASD and their matched TD controls. In regard to this last construction in particular, it is important to note that there are a number of typical adults and children who are reluctant to abandon their initial subject interpretations under the same level of discourse pressure. The subtle difference found in children's interpretations of this construction, therefore, does not in itself warrant an appeal to the hypothesized reduced cognitive flexibility reported in the literature, in line with Geurts et al. (2009). Our study, the first to compare the three sub-types of control in ASD in the literature, reveals that, in relevant contexts, HFA children consult the pragmatic cues similarly to TD children, despite diverse pragmatic deficits reported for this population, suggesting that (at least certain aspects of) primary pragmatics are functioning well in this ASD sub-group.

# AUTHOR CONTRIBUTIONS

VJ and AP conceived the study. VJ and AP shared the data collection. VJ collated and transcribed the data. AP analyzed the data and wrote the results section. VJ wrote the introduction, the method and the discussion. Both authors edited the final version of the manuscript and have agreed to be accountable for the content of the manuscript.

# FUNDING

The study was piloted and experimental materials were developed with funds from a British Academy Small Research Grant (SG112896).

# ACKNOWLEDGMENTS

Warmest thanks to our participants, their families and teachers at St Edwards Catholic Primary School, Sheerness, the Foresters Primary School, Glebe Secondary School, Grange Primary School, Green Wrythe Primary School, The Quest School, Paddock Wood, St Peters Primary School, Canterbury and Wittersham Primary School, East Sussex. We are also grateful to Paulien Eijckeler, Christina Papapolyviou, Donna Mulhall, and Abigail Songhurst for their assistance with data collection, and to Gordon Craig for statistical advice.

# SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpsyg.2017.00448/full#supplementary-material

# REFERENCES

Cummings, L. (2016). Case Studies in Communication Disorders. Cambridge: CUP.

Dennis, M., Lazenby, A. L., and Lockyer, L. (2001). Inferential language in HFA children. JADD 31, 47–54. doi: 10.1023/A:1005661613288

Janke, V., and Perovic, A. (2015). Intact Grammar in HFA? Evidence from control and binding. Lingua 164, 68–86. doi: 10.1016/j.lingua.2015.06.009

binding in Williams syndrome and autism with/without language impairment. Lang. Acquisit. 20, 133–154. doi: 10.1080/10489223.2013.766742


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Janke and Perovic. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Investigating the Grammatical and Pragmatic Origins of Wh-Questions in Children with Autism Spectrum Disorders

### Manya Jyotishi\*, Deborah Fein and Letitia Naigles

Department of Psychological Sciences, University of Connecticut, Storrs, CT, USA

Compared to typically developing children, children with autism (ASD) show delayed production of wh-questions. The degree to which such deficits derive from social-pragmatic requirements and/or from the grammatical complexity of these structures is currently controversial. The current study employed the intermodal preferential looking (IPL) paradigm, which reduces social-pragmatic demands. The IPL paradigm can help distinguish these proposals, as successful comprehension supports the "pragmatics-origins" argument whereas comprehension difficulties would implicate a "grammatical-origins" argument. Additionally, we tested both the linguistic and social explanations by assessing the contributions of children's early grammatical knowledge (i.e., SVO word order) and their social-pragmatic scores on the Vineland to their later wh-question comprehension. Fourteen children with ASD and 17 TD children, matched on language level, were visited in their homes at 4-month intervals. Comprehension of wh-questions and SVO word order were tested via IPL: the wh-question video showed a costumed horse and bird serving as agents or patients of familiar transitive actions. During the test trials, they were displayed side by side with directing audios (e.g., "What did the horse tickle?", "What hugged the bird?", "Where is the horse/bird?"). Children's eye movements were coded offline; the DV was their percent looking to the named item during test. To show comprehension, children should look longer at the named item during a where-question than during a subject-wh or object-wh question. Results indicated that TD children comprehended both subject and object wh-questions at 32 months of age. Comprehension of object-wh questions emerged chronologically later in children with ASD compared to their TD peers, but at similar levels of language. Moreover, performance on word order and social-pragmatic scores independently predicted both groups' later performance on wh-question comprehension. Our findings indicate that both grammar and social-pragmatics are implicated in the comprehension of wh-questions. The "grammatical-origins" argument is supported because the ASD group did not reveal earlier and stable comprehension of wh-questions; furthermore, their performance on SVO word order predicted their later success in linguistic processing of wh-questions. The "pragmatic-origins" argument is also supported because children's earlier socialization and communication scores strongly predicted their successful performance on wh-question comprehension.

### Edited by:

Stephanie Durrleman, University of Geneva, Switzerland

### Reviewed by:

Laurence B. Leonard, Purdue University, USA

Schnell Zsuzsanna, University of Pécs, Hungary

\*Correspondence: Manya Jyotishi manya.jyotishi@uconn.edu

### Specialty section:

This article was submitted to Cognitive Science, a section of the journal Frontiers in Psychology

Received: 18 November 2016 Accepted: 20 February 2017 Published: 10 March 2017

### Citation:

Jyotishi M, Fein D and Naigles L (2017) Investigating the Grammatical and Pragmatic Origins of Wh-Questions in Children with Autism Spectrum Disorders. Front. Psychol. 8:319. doi: 10.3389/fpsyg.2017.00319

Keywords: wh-questions, language, comprehension, grammar, social-pragmatics, SVO word order

# INTRODUCTION

According to the DSM-V (American Psychiatric Association, 2013), autism spectrum disorder (ASD) is characterized as a developmental disorder with persistent deficits in social interaction and social communication, and with restricted and repetitive patterns of behaviors. Researchers have also proposed that some aspects of language development are different and/or delayed in this population compared to typically developing (TD) children (Rutter, 1978; Charman et al., 2003; Mitchell et al., 2006). It is generally acknowledged that children with ASD have underlying pragmatic deficits attributable to their social-communicative impairment (Kjelgaard and Tager-Flusberg, 2001; Tager-Flusberg et al., 2005; Naigles and Chin, 2015); however, the extent to which a grammatical deficit is also involved continues to be a matter of controversy (Tager-Flusberg, 1994; Eigsti et al., 2007; Eigsti and Bennetto, 2009; Naigles and Chin, 2015; Naigles and Fein, 2017). One way to investigate the extent of social-pragmatic difficulties and grammatical deficits in ASD is to examine their acquisition of wh-questions.

The acquisition of wh-questions seems challenging for children with ASD, as prior research has shown delays in both production and comprehension (Tager-Flusberg, 1994; Goodwin et al., 2012). Some researchers have argued that children with ASD have particular difficulties with wh-questions because these are complex grammatical structures (Eigsti et al., 2007) while others have proposed that their impairments are more related to pragmatics (Tager-Flusberg, 1994). However, most studies that have tested wh-questions in this population have involved spontaneous production, which relies heavily on social-pragmatic knowledge; e.g., knowing how to use these questions in the appropriate contexts. We examine whether there is also a grammatical deficit by investigating whether children with ASD comprehend subject-wh and object-wh questions during the same developmental period as their TD peers, using a paradigm that minimizes social-pragmatic demands. If wh-question difficulties have grammatical origins in these children, then these would also be implicated in their understanding of wh-questions. Moreover, to further explore the grammatical-origins argument, we examined the relationships between earlier grammatical and social competences and later wh-question comprehension.

Wh-question acquisition is interesting because these constructions require both grammatical and pragmatic knowledge. A wh-question is a question that contains a wh-word (what, where, when, why, how), usually occurring in the beginning of the sentence (in English). Syntactically, these wh-words stand for information that is missing in the sentence. Wh-questions probe for missing arguments (e.g., "What did Mary buy?") or adjuncts (e.g., "Why did she buy that?"). Furthermore, argument wh-questions can ask for the grammatical subject of a sentence (e.g., (1) Who \_\_ likes Mary?) or the grammatical object of the sentence (e.g., (2) Who does Mary like \_\_?). Notice that both subject and object wh-questions involve wh-movement from the original argument location; however, the movement for subject wh-questions does not change the canonical word order of English sentences (SVO; see (1) above), whereas the movement for object wh-questions changes the word order of the sentence to OSV [see (2) above; Radford, 1988; Ambridge and Lieven, 2011].

Pragmatically, wh-questions serve several communicative functions. Wh-questions ask for information, which is unknown but desired by the speaker and is assumed to be known by the addressee. Moreover, the speaker needs to have knowledge about when such questions are proper to use in a discourse/conversational setting (Searle, 1969). Specifically, children can ask questions to seek new factual information from the listener about social or public information or elaborate about shared information between the speaker and listener; their questions can ask for clarifications or repetitions about the conversation, and they can reflect the speaker's knowledge, such as, rhetorical questions, or didactic questions (Sinclair and Van Gessel, 1990; Freed, 1994). Some wh-questions can ask for information about motives, intentions, or mental states of others (Gauvain et al., 2013; e.g., Where do you think the ball went?), whereas other types of wh-questions target purely physical objects, locations and events, such as, "Where's the bear?" or "What are you cooking?" These latter questions do not require mentalization to interpret the correct answer but nonetheless have underlying pragmatic functions like information seeking about objects and events, probing about shared events and experiences, and providing a conversational focus during play.

Wh-questions are acquired by TD children during the preschool years, with comprehension of subject and object wh-questions attested between 1 and 2 years of age (Seidl et al., 2003; Goodwin et al., 2012; Gagliardi et al., 2016), and production of the same forms observed by 24–30 months (Tyack and Ingram, 1977; Bloom et al., 1982; Stromswold, 1995). Two- to three-year-old children first use these questions for information-seeking purposes, such as, "Where is the washcloth?" or "What are they drinking?" (Tyack and Ingram, 1977; Bloom et al., 1982; Goodwin et al., 2015), and soon also use the questions for conversational purposes like initiating or maintaining conversations, such as, "How are you?" or "What's that?" Some questions also serve a directive function, such as, "Why don't we read this one?" (James and Seebach, 1982).

Production of wh-questions also emerges during the preschool years for verbal children with ASD, but seems to be both delayed and sparse. For example, during structured and free play sessions, verbal children with ASD were observed to request less information compared to their TD peers and used fewer wh-questions during naturalistic (i.e., unprompted) interactions (Wetherby and Prutting, 1984; Tager-Flusberg, 1994; Eigsti et al., 2007; Goodwin et al., 2012). Early hypotheses concerning the origins of this "wh-question deficit" have focused on the social/pragmatic impairments of children with ASD, arguing that the children were less interested in soliciting information from others, and so had fewer reasons to ask the questions (Rutter, 1978; Tager-Flusberg, 1994). Children with ASD might also ask fewer wh-questions because of their impaired understanding that others can have knowledge that would inform the purpose of their questions. Tager-Flusberg's (1994) analysis of the spontaneous speech of six boys with ASD supported this hypothesis, because while the boys increased in their production of well-formed wh-questions over time (especially in using auxiliary verbs and inversion) at rates similar to language-matched peers, their overall frequency of wh-question usage remained sparse (9.3 per 1000 utterances in the ASD group vs. 28.2 per 1000 utterances in the language-matched peers). More qualitatively, the usage of wh-questions in conversations by the children with ASD was more restricted, i.e., they produced fewer information-seeking questions about objects, events or psychological states, and did not seem to manifest the conversational functions of agreement and clarification to regulate verbal interactions. Children with ASD also rarely asked conversational openers or social routine questions like, "How are you?"
Thus, children with ASD did not seem impaired in their syntactic acquisition of wh-question forms, as shown by their growth in well-formed questions, but their usage of these questions was clearly impoverished.

The pragmatic-origins hypothesis has also been supported by Goodwin et al. (2012), who examined wh-question comprehension in English-speaking children with ASD using intermodal preferential looking (IPL). IPL has the potential to provide a more accurate assessment of linguistic knowledge in very young children, because it involves little to no social, motor or speech demands: children simply watch two videos while hearing a central audio that matches only one of the videos. The children's eye movements are recorded; the assumption is that if they understand the audio, they will look longer at the matching video (Golinkoff et al., 1987, 2013). IPL thus reduces the social-pragmatic constraints for the use of wh-questions; children are not asked to answer any questions, nor are they expected to produce any. Goodwin et al. (2012) showed a wh-question video to TD children and children with ASD at four visits during a longitudinal study. The video presented familiar items—an apple, a flower, keys, and a book—engaged in hitting events (i.e., an apple hitting a flower, keys hitting a book; adapted from Seidl et al., 2003). Following these familiarization trials, the children saw three test trials that asked object-wh, subject-wh, and "where" questions while the pairs of items were displayed simultaneously, side by side. The TD children demonstrated reliable understanding of wh-questions at 28 months of age, at the first visit when they were shown the videos. The children with ASD showed reliable comprehension only at 54 months of age, at the 4th visit when they had seen the videos. While their comprehension was delayed relative to the TD group in terms of their chronological age, the overall language level of the ASD group at 54 months was not different from the language level of the TD children at 28 months; therefore, Goodwin et al. (2012) suggested that comprehension of wh-questions was achieved at similar language levels in both groups. 
Minimizing the pragmatic demands of wh-question use via IPL yielded positive findings of wh-question knowledge, thus supporting the claim that sparse wh-question usage in children with ASD is a result of their social/pragmatic impairments. The findings of Durrleman et al. (2016) are also consistent with this hypothesis. These researchers tested school-age French children with ASD on their comprehension of both simple and complex wh-questions, and reported that, while the children performed above chance, their scores were significantly lower than those of TD children matched on non-verbal abilities.

However, not all research is consistent with the pragmatic-origins hypothesis. Two recent studies of the spontaneous speech of children with ASD have indicated that their wh-question development was tightly linked to their overall grammatical development. Eigsti et al. (2007) compared five-year-old children with ASD to TD children matched on nonverbal IQ and receptive vocabulary. Not surprisingly, the children with ASD used fewer and less complex wh-questions than the TD children; however, they also had shorter mean lengths of utterance (MLUs), indicating that their syntactic development was delayed relative to their vocabulary levels. Moreover, their wh-question complexity patterned with their MLU rather than their vocabulary. Tek et al. (2014) followed two subgroups of children with ASD across 2 years, and found that the high-verbal children with ASD, who were matched on MLU with TD children, showed increases in their complexity of wh-question use (i.e., progressing from routine questions to wh-questions with verbs, and then to wh-questions with both a main and auxiliary verb, etc.) that paralleled the increases in their MLU and in the wh-question use of the TD group. In contrast, the low-verbal children with ASD showed flatter slopes in their individual growth curves. In sum, these researchers have found wh-question use in children with ASD to be commensurate with their overall grammatical levels, suggesting that observed deficits are due to grammatical difficulties rather than pragmatic ones.

In the current study, we revisit this debate concerning the grammatical vs. pragmatic origins of the wh-question deficit in two ways. First, we conducted a replication and extension of Goodwin et al.'s (2012) study, altering the stimuli with the goal of making them easier. Second, we investigated possible precursors to wh-question comprehension, under the hypothesis that if the wh-question deficit has a grammatical origin, then early grammatical competence will predict later wh-question comprehension; in contrast, if the wh-question deficit has a pragmatic origin, then early social competence will predict later wh-question comprehension. We motivate each of these innovations below.

Goodwin et al. (2012) reported that the children with ASD achieved wh-question comprehension at the visit when their general language levels were on a par with those of the TD children, at the first visit when they (the TD children) demonstrated wh-question comprehension. Following Seidl et al. (2003) and Gagliardi et al. (2016), who reported successful wh-question comprehension in TD children as young as 20 months of age, it is possible that the TD children in Goodwin et al. (2012) would have shown comprehension at lower language levels; however, they were not shown this video at earlier visits. The children with ASD in Goodwin et al. were tested on wh-question comprehension when their language levels were at age-equivalents of 20 months, but they did not show comprehension at this earlier point. We conjecture, though, that several aspects of Goodwin et al.'s (2012) stimuli were less than ideal. First, both events involved the verb hit, which we have found is not common for children with ASD. That is, even by 54 months of age, only 53% of the children with ASD had produced the verb "hit," according to parental report. If "hit," and hitting events, are unfamiliar to young children with ASD, they might not have been able to process the wh-questions efficiently during the 4-s test trials. In contrast, all TD children in the study had produced this verb at 32 months of age, and most showed successful wh-question comprehension as well. Furthermore, the hitting events themselves were non-prototypical transitive events; that is, they involved the action of an inanimate agent on an inanimate patient. Prototypical transitive events involve animate agents (Slobin, 1982), as do prototypical wh-questions (Tyack and Ingram, 1977), and the wh-questions produced by children with ASD generally follow this pattern as well (Tager-Flusberg, 1994; Tek et al., 2014). The presentation of inanimate agents might have caused additional confusion. 
In sum, it is possible that earlier comprehension of wh-questions in these children with ASD was not demonstrated due to these challenging stimuli, and the current study introduces several changes which were hypothesized to facilitate the interpretation of the events and so the comprehension of wh-questions referring to those events. Evidence of earlier comprehension would support the "pragmatic origins" hypothesis.

A second way to examine the origins of wh-question acquisition, and of the deficit observed in the productions of children with ASD, is to investigate the extent to which earlier grammatical and/or pragmatic factors are precursors or predictors of successful wh-question comprehension. Grammatically, a pre-requisite to understanding subject- and object-questions might lie in children's understanding of basic declarative sentences consisting of a subject, a verb, and an object, known as canonical English SVO word order. For example, in order to engage in wh-movement, children should have systematically understood the SVO sentence structure (3) and mapped the structure of that frame one-to-one onto the wh-question (4), which guides them to the correct referent (either the subject or the object) of the action.

In the above example, if children have understood the subject-verb-object sentence structure from hearing the sentence "John likes Mary," then when they hear a subject-wh-question like, "Who \_\_ likes Mary?" children should be able to structurally map this transitive construction to the gap in the subject position of the question, "Who \_\_\_ likes Mary?" Moreover, if children understand that the SVO sentence structure is a transitive frame with a subject (a "liker") and a verb ("like") that requires a direct object (a "likee"), this knowledge can enable them to map the wh-word movement back to its gap in the object position. Therefore, we propose to investigate how children's prior grammatical knowledge of SVO word order contributes to their later wh-question comprehension. Research with TD children has begun to demonstrate that early sentence processing skills predict later syntactic performance (Newman et al., 2006; Kidd and Arciuli, 2016); in addition, one recent study has found predictive relations between children with ASD's processing of sentences and their later sentence comprehension (Naigles et al., 2011). In that study, children with ASD were taught novel verbs in transitive sentences via the IPL paradigm and then asked whether the verbs mapped onto causative or non-causative actions; i.e., syntactic bootstrapping (Naigles, 1990). The children were generally successful; moreover, after controlling for their vocabulary size, those who were faster processors of SVO word order (i.e., showing a shorter latency to look at the match scene) 8 months earlier were better able to use the SVO frames to make predictions about new verb meaning (children's longer looking time toward the matching scene during the test trials compared to baseline trials). In the current study, we investigate the extent to which children's comprehension of wh-questions is predicted by their earlier comprehension of declarative SVO sentences.

Pragmatic prerequisites to children's acquisition of wh-questions per se are less well-defined; however, pragmatic and social precursors to language development in general are well-attested, and include such factors as joint attention, gesture, and turn-taking (Clark, 2015; Tomasello, 2015). These behaviors are known to be consistently impaired in children with ASD (Tager-Flusberg et al., 2005), and variability in early manifestations of these pragmatic abilities has been found to predict variability in later measures of language, both general (Mundy et al., 1990; Luyster et al., 2008) and specifically grammatical (Rollins and Snow, 1998; Naigles et al., 2016). In the current study, we directly investigate the contribution of social and pragmatic factors to wh-question development and understanding, and hypothesize that children who are more attuned to their social and communicative milieu might acquire wh-questions earlier, because by attending well to their functions (e.g., asking for information), they may also become focused sufficiently on their forms.

In the current study, we used IPL to assess wh-question comprehension in TD preschoolers and preschoolers with ASD. We created new videos that included animate characters, i.e., a costumed horse and a bird, as well as new actions and verbs, such as tickle, wash, hug, and ride, which have been reported to be understood by children with ASD at 2.5 years of age (Swensen et al., 2007). Our first hypothesis was that finding earlier or equivalent comprehension with these videos, compared to those of Goodwin et al. (2012), would support a pragmatic origin for the "wh-question deficit" in children with ASD. That is, minimizing pragmatic demands, coupled with more familiar stimuli, should illuminate intact grammatical knowledge. In contrast, later or weaker wh-question comprehension with the new videos would be consistent with a grammatical origin.

We also examined the relationships between children's early standardized test measures, socialization measures, and word order comprehension, and their later wh-question comprehension to investigate the degree to which earlier general language measures or social measures are related to later comprehension. In terms of grammatical competence, early grammatical knowledge of word order may serve in either general or specific ways as a foundation for later acquisition of wh-questions. For example, in general, if a child has difficulties acquiring word order at an early age then these same difficulties could influence their ability to learn grammar in later years. Specific links between early acquisition of word order and wh-question comprehension might involve the fact that without understanding that SVO is the canonical word order in English, the function of the wh-word, i.e., that it stands for a missing NP, might be opaque. Our study was not designed to distinguish between these possibilities; instead, we investigate whether early grammatical competence is associated with later performance on wh-questions, which would strengthen the argument of a grammatical deficit in wh-questions in children with ASD. We also investigate whether early (rather than concurrent) social competence is related to subsequent wh-question comprehension, on the rationale that children need to be socially aware to understand the point of wh-questions and the reasons for asking them. For example, one Vineland question asks, "Answers when familiar adults make small talk (for example, if asked, "How are you?" says, "I'm fine"; if told, "You look nice," says, "Thank you"; etc.)." Thus, if early socialization measures are associated with later wh-question comprehension, then this will support the pragmatic-origins argument.

# MATERIALS AND METHODS

# Participants

Fourteen children with ASD and 17 TD children participated in this longitudinal study. All were monolingual English learners. One child with ASD participated in the overall project, but was not included in the final analyses of this study because he did not provide sufficient data during the wh-question task for more than half of the visits. One child in the TD group was omitted from the IPL analyses at visit 6 because she had missing data at this visit. We recruited participants in the ASD group by contacting facilities that offer Applied Behavioral Analysis (ABA; Lovaas, 1987); we restricted the sample to children receiving ABA to ensure some consistency in the interventions being received. Moreover, ABA is the most common intervention offered in our geographic area (northeastern U.S.). These service providers distributed information about the study to parents of children who had been diagnosed within the last 6 months and had just begun ABA training. Interested parents then contacted us and were interviewed via telephone to verify their child's diagnosis and eligibility for the study. All parents signed consent forms prior to participating.

The participants in the ASD group included seven White males, two Asian males, and one African American male. There were two White females, one Asian female, and one African American female. This sample of children somewhat reflects the prevalence of ASD in the general population; we made significant efforts to recruit non-Caucasian families. All children were from lower- to upper-middle-class families living in the Northeastern United States. At the first visit, the children with ASD ranged in age from 18 to 42 months (M = 32.93, SD = 7.28) and their MLU, a measure of sentence complexity, ranged from 0 to 3.13 (M = 1.26, SD = 0.67). To be included in the study, the children with ASD had to be receiving at least 20 hours of ABA intervention weekly. Because it is difficult to distinguish between ASD and pervasive developmental disorder not otherwise specified (PDD-NOS), we accepted participants with either diagnosis, which was then verified by the Autism Diagnostic Observation Schedule (ADOS; Lord et al., 2000). The ADOS and other test scores are provided in **Table 1**.

The TD group was recruited via birth announcements from local newspapers. The TD group included 13 White males, three White females and one Asian female from middle- to upper-middle-class families living in Connecticut. These demographics closely resembled those of the ASD group. Rather than matching the TD group to the ASD group on age, we chose to match them on level of language development. Therefore, we began testing TD children at ∼20 months of age (M = 19.74, SD = 1.25) with MLU ranging from 1.02 to 1.86 (M = 1.36, SD = 0.25) at visit 1, when their language abilities were most similar to those of the ASD group at visit 1 (see **Table 1**).

# Materials

### Standardized Tests

The ADOS (Lord et al., 2000) was administered to assess ASD status. We also administered the Vineland Adaptive Behavior Scales, 2nd Edition (Vineland II; Sparrow et al., 2005) to evaluate children's communication, socialization, daily living skills, and motor skills, which yielded standard scores based on mothers' reports. The communication domain of the Vineland consisted of some items related to language competence, such as, "Uses present tense verbs ending in ing (for example, "Is singing"; "Is playing"; etc.)," and other items that were more related to pragmatics, such as, "Understands sayings that are not meant to be taken word for word (for example, "Button your lip"; "Hit the road"; etc.)" or "Asks questions by changing inflection of words or simple phrases (for example, "Mine?"; "Me go?"; etc.)"; grammar is not important. The socialization domain consisted of items like, "Makes or tries to make social contact (for example, smiles, makes noises, etc.)" or "Answers when familiar adults make small talk (for example, if asked, "How are you?" says, "I'm fine"; if told, "You look nice," says, "Thank you"; etc.)". In the ASD literature, the Vineland scale has been found to be strongly correlated with joint attention skills (Toth et al., 2006; Poon et al., 2012) and ADOS scores (Klin et al., 2007; Paul et al., 2014); it is frequently used as a measure of social competence in special populations like ADHD and ASD (Oswald and Ollendick, 1989; Charman et al., 2001, respectively). In our study, an average of the communication and socialization scores was used as a measure of social competence.

The Mullen Scales of Early Learning (1994) were administered to measure development in the areas of visual perception, fine motor skills, receptive language, expressive language, and gross motor skills (Mullen, 1994). Finally, the MacArthur Communicative Development Inventory (CDI; Fenson et al., 1994) provided a measure of the child's production vocabulary, via parental report. The infant version of the CDI was used at visit 1. The Receptive One-Word Picture Vocabulary Test, 4th edition (ROWPVT-4; Martin and Brownell, 2010b) and Expressive One-Word Picture Vocabulary Test, 4th edition (EOWPVT-4; Martin and Brownell, 2010a) were administered at all visits to evaluate the children's receptive and expressive vocabulary skills, respectively.



<sup>\*</sup>p < 0.05.

<sup>a</sup>Autism spectrum = 7+; autism = 12+.

<sup>b</sup>CARS range = 15–60; Autism spectrum = 30+; autism = 36+.

<sup>c</sup>Number of words produced out of 396.

ADOS, Autism Diagnostic Observation Schedule; CARS, Childhood Autism Rating Scale; CDI, Communicative Development Inventory; ROWPVT, Receptive One-Word Picture Vocabulary Test; EOWPVT, Expressive One-Word Picture Vocabulary Test.

### IPL Setup

The IPL paradigm (Golinkoff et al., 1987; Naigles and Tovar, 2012) involves showing children two videos side by side, while playing child–directed speech from a central speaker that corresponds to only one of the videos. The child's direction and duration of gaze are recorded and coded for indications of his/her understanding. An Apple Powerbook was used to project the stimuli onto a portable 63" × 84" screen, via an LCD projector. The computer was connected to an external speaker, which was placed out of sight behind the screen. A digital camcorder for filming the child's face was placed on a small tripod in front of the screen, just below the center.

# IPL Stimuli

### Wh-Question

The wh-question video was adapted from Goodwin et al. (2012), with two major changes. First, the animate characters of a costumed horse and bird served as agents and patients. Second, these characters engaged in four familiar live-action transitive events: washing, tickling, riding and hugging. The verbs describing these events were all attested in the vocabularies (i.e., CDIs) of both groups by visit 4. The horse appeared as the agent for the tickle and ride events, and the bird appeared as the agent for the wash and hug events. After each transitive event, the horse and bird appeared side by side and the audio asked a wh-object or wh-subject question. In total, each child was asked four object-wh-questions, four subject-wh-questions, and at the end of the video, two where-questions. In the videos, the side of the matching scene was counterbalanced both within (i.e., the matching side varied from left to right in an XYYXXY pattern) and between (i.e., for half of the children the first match was on the left and for the other half, the first match was on the right) participants (see **Table 2** for the layout).
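The counterbalancing scheme described above can be illustrated with a minimal Python sketch. Only the XYYXXY pattern and the left/right split across children come from the text; the function names and the rule of alternating the starting side by participant index are hypothetical choices made for illustration.

```python
from typing import List

PATTERN = "XYYXXY"  # within-participant pattern quoted in the text


def match_sides(first_side: str, pattern: str = PATTERN) -> List[str]:
    """Map the abstract X/Y pattern onto screen sides for one participant.

    X is bound to the side of the first matching scene, Y to the other side.
    """
    if first_side not in ("left", "right"):
        raise ValueError("first_side must be 'left' or 'right'")
    other = "right" if first_side == "left" else "left"
    return [first_side if symbol == "X" else other for symbol in pattern]


def assign_first_side(participant_index: int) -> str:
    """Hypothetical rule: alternate the starting side across participants,
    so that half of the children start with a left-side match."""
    return "left" if participant_index % 2 == 0 else "right"


if __name__ == "__main__":
    for child in range(2):
        print(child, match_sides(assign_first_side(child)))
```

How the six-item pattern was extended over the full set of test trials is not stated in the text, so the sketch does not attempt it.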

### Word Order (Candan et al., 2012)

The layout for the word order video is presented in **Table 3**. The pretest trials (labeled "P" in the table) introduced and labeled the costumed horse and bird. Trials 1–2 presented a familiar action with agent A and patient B on one side (e.g., the bird pushing the horse), and then with agent B and patient A on the other side (e.g., the horse pushing the bird). During these trials, the action was labeled in a neutral frame (e.g., "Pushing!"). In Trial 3 (the control-for-salience trial), both renditions of the action were presented simultaneously and the audio was the same as in trials 1 and 2; this provided a baseline measure of stimulus salience. Trial 4 was the test trial, in which the verb was placed in a sentence such that only one of the two renditions matched. This trial thus examined whether the child understood the difference between "A verbs B" (e.g., "the bird is pushing the horse") and "B verbs A" (e.g., "the horse is pushing the bird"). A total of six familiar verbs and actions were introduced and then tested for word order understanding. These were push, tickle, pull, wash, hug, and ride. The same characters were used for each action; the horse was the agent for half of the matching actions and the bird was the agent for the others.


### TABLE 2 | Sample layout of the Wh-question video.


<sup>a</sup>Object-wh-questions = What did the horse tickle?; What did the bird wash?; What did the bird hug?; What did the horse ride?

<sup>b</sup>Subject-wh-questions = What hugged the horse?; What rode the bird?; What tickled the bird?; What washed the horse?

<sup>c</sup>Where is the horse?; Where is the bird?

# Procedure

The children were visited in their homes, at 4-month intervals for a total of six visits. The visits began with one experimenter administering standardized tests, while another experimenter prepared the IPL setup. Next, the child sat ∼3 ft in front of the screen and camcorder and watched three IPL videos. The word order video was shown at visits 1 and 2; the wh-question video was shown at visits 3 through 6, and was always the second or third video in the series. Breaks were allowed as needed between videos. After viewing the videos, the mother and child participated in a 30-min play session. Finally, the mother completed any remaining surveys or forms.

## Coding

The films of the child's gaze during the IPL task were captured and digitized in the lab. Looking times were coded offline by watching these films frame by frame, using a custom coding program. The test audio was removed, so the coders did not know which direction of looking was correct. Looking during each frame was coded as to the left, right, center, or away. If a child did not look at both screens for more than 1 s total for a given trial, his/her data were not included for that trial. For the wh-question video, this occurred in 1.4% of test and control trials for the TD group and 4.6% of test and control trials for the ASD group. For the word order video, the percent of excluded trials for the TD group was 2.7%, and it was 2.9% for the ASD group. This level of data loss is similar to that in other IPL studies (Naigles et al., 2005; Swensen et al., 2007; Goodwin et al., 2012). All participants were coded by at least two coders to ensure reliability. The correlation between coders averaged 0.99, p < 0.001.
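The trial-exclusion rule can be sketched in Python as follows. This is an illustration under stated assumptions rather than the coding program used in the study: the criterion is read as requiring more than 1 s of looking at the left and right screens combined, and the 30 frames-per-second rate and the label strings are assumed values that the text does not give.

```python
from collections import Counter
from typing import Sequence

FRAME_RATE_HZ = 30.0     # assumed frame rate; not reported in the text
MIN_LOOK_SECONDS = 1.0   # threshold stated in the text


def seconds_on_screens(frames: Sequence[str],
                       frame_rate: float = FRAME_RATE_HZ) -> float:
    """Total time coded as looking at either screen (left or right)."""
    counts = Counter(frames)
    return (counts["left"] + counts["right"]) / frame_rate


def keep_trial(frames: Sequence[str],
               frame_rate: float = FRAME_RATE_HZ) -> bool:
    """A trial is retained only if on-screen looking exceeds 1 s in total."""
    return seconds_on_screens(frames, frame_rate) > MIN_LOOK_SECONDS


if __name__ == "__main__":
    mostly_away = ["left"] * 20 + ["away"] * 100
    attentive = ["left"] * 20 + ["right"] * 20 + ["center"] * 80
    print(keep_trial(mostly_away), keep_trial(attentive))
```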

### Wh-Question Comprehension

The dependent variable was the mean proportion of time that the child looked at the named item during each trial type (i.e., subject-, object-, and where-questions). This was the metric employed by Seidl et al. (2003; see also Goodwin et al., 2012) to demonstrate what-question comprehension; namely, the child needed to look at the named item significantly less during a subject- or object-wh-question trial than during the where-question trial. For example, to assess comprehension of "What tickled the bird?", we compared children's looking time to the bird during this trial vs. during the "Where is the bird?" trial. During the "where" trial, they should look consistently at the bird whereas during the "what" trial, they should look consistently away from the bird. Such within-subject comparisons are common with the IPL paradigm, as children's eye movements during baseline trials serve as their own controls for performance during test trials (Brandone et al., 2007; Swingley, 2011; Piotroski and Naigles, 2012). To succeed at this task, then, children need not manifest a completely adult-like understanding of the grammar; they need only to allow the "what" questions to pull their attention away from the named item, indicating that they are aware that grammatical wh-movement has occurred (and that for object questions, SVO is no longer the correct word order). There is evidence that adults, too, initially look at the named item before switching to the correct referent, during online processing of what-questions (Sussman and Sedivy, 2003; Kukona and Tabor, 2011).

### TABLE 3 | Sample layout of the word order video.

<sup>a</sup>P indicates the pretest trials.

### Word-Order Comprehension

The dependent variable was the difference score between the children's proportion of looking to the match during the test trial and baseline trials. This is a common way to assess comprehension via IPL (Piotroski and Naigles, 2012); the test-baseline comparison demonstrates the degree to which the test audio guided the children's looking at the matching scene, relative to their initial preference for that scene based solely on stimulus salience. Data from visits 1 and 2 were combined (as in Tovar et al., 2015).
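As a rough illustration of this difference score, the Python sketch below subtracts the mean baseline proportion from the mean test proportion. The function names are hypothetical, and combining visits 1 and 2 by simple averaging is an assumption, since the text does not say how the two visits were combined.

```python
from statistics import mean
from typing import Sequence


def word_order_difference(test_props: Sequence[float],
                          baseline_props: Sequence[float]) -> float:
    """Mean proportion of looking to the match on test trials minus the
    mean proportion on baseline (control-for-salience) trials."""
    return mean(test_props) - mean(baseline_props)


def combine_visits(visit_scores: Sequence[float]) -> float:
    """Assumed combination rule: average the per-visit difference scores."""
    return mean(visit_scores)


if __name__ == "__main__":
    visit1 = word_order_difference([0.7, 0.6], [0.5, 0.4])
    visit2 = word_order_difference([0.6, 0.6], [0.5, 0.5])
    print(combine_visits([visit1, visit2]))
```

A positive score indicates that the test audio drew looking toward the matching scene beyond the child's baseline preference.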

# Data Analysis Plan

In our first set of analyses, we assessed wh-question comprehension via repeated-measures ANOVAs to compare children's percentage of looking at the named item for the "where" question to looking at the named item for the "what" questions in each group. Next, we conducted pairwise correlations between the wh-question comprehension measures (using the difference score of percent looking to the named item during "where" questions minus percent looking to the named item during "what" questions) and standardized test language measures to discover relationships between children's general language and their wh-question comprehension. Finally, we conducted regression analyses to investigate the extent to which the children's performance on the earlier word order IPL measure (i.e., the grammatical measure) and their earlier Vineland communication and socialization scores (i.e., the social-pragmatic measures) uniquely predicted their performance on the later wh-question comprehension measure. These Vineland scores were entered both separately and as an average score.

# RESULTS

# When Do Children with ASD and TD Children Comprehend wh-Questions?

A repeated-measures analysis of variance (2 × 4 × 2) was conducted with group (ASD or TD) as the between-subjects variable, and visit (3, 4, 5, or 6) and trial type (combined subject- and object-what questions, and where questions) as within-subjects variables. The results showed a main effect of trial [F(1, 24) = 45.97, p < 0.001, partial eta squared = 0.657], indicating that children's proportion of looking to the named object was different for the "what" and "where" questions. There was no main effect of visit [F(3, 72) = 0.988, p = 0.404, partial eta squared = 0.040], nor a significant group × trial interaction [F(1, 24) = 1.15, p = 0.294, partial eta squared = 0.046]. A significant group effect emerged [F(1, 24) = 8.92, p = 0.006, partial eta squared = 0.271], with greater overall looking to the named object by the TD group than by the ASD group. Given these significant trial and group effects, the next set of analyses investigated each group's looking patterns separately for subject- and object-what questions.
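Partial eta squared can be recovered from a reported F ratio and its degrees of freedom through the standard identity partial eta squared = (F × df_effect) / (F × df_effect + df_error). The short Python sketch below (the function name is ours, for illustration) applies this identity to the F values reported above; it is a consistency check on the reported statistics, not a reanalysis of the data.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Effect size implied by an F ratio and its degrees of freedom."""
    numerator = f_value * df_effect
    return numerator / (numerator + df_error)


if __name__ == "__main__":
    # Main effect of trial: F(1, 24) = 45.97
    print(round(partial_eta_squared(45.97, 1, 24), 3))  # 0.657
    # Main effect of group: F(1, 24) = 8.92
    print(round(partial_eta_squared(8.92, 1, 24), 3))   # 0.271
```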

For the TD group, the first repeated-measures analysis of variance (4 × 2) was conducted with visits (3, 4, 5, 6) and trial type (subject-what questions and where questions) as within-subject variables. There was a main effect of trial [F(1, 15) = 45.31, p < 0.001, partial eta squared = 0.751] but no main effect of visit [F(3, 45) = 0.328, p > 0.05, partial eta squared = 0.021]. The visit × trial interaction trended toward significance [F(3, 45) = 2.56, p = 0.068, partial eta squared = 0.145]. The second ANOVA compared the object wh-questions and "where" questions, and revealed a main effect of trial [F(1, 15) = 30.63, p < 0.001, partial eta squared = 0.671], and no main effect of visit [F(3, 45) = 1.13, p > 0.05, partial eta squared = 0.070]. This visit × trial interaction also trended toward significance [F(3, 45) = 2.55, p = 0.068, partial eta squared = 0.145].

For the ASD group, the first repeated-measures analysis of variance (4 × 2), conducted by visit (3, 4, 5, 6) and trial type (subject-what questions and where questions), revealed a main effect of trial [F(1, 9) = 6.24, p < 0.05, partial eta squared = 0.409] but no main effect of visit [F(3, 27) = 1.13, p > 0.05, partial eta squared = 0.112] and no visit × trial interaction [F(3, 27) = 0.224, p > 0.05, partial eta squared = 0.024]. Similarly, the repeated-measures analysis of variance (4 × 2) for the object wh-questions with visit (3, 4, 5, 6) and trial type (object-what questions and where questions) revealed a main effect of trial [F(1, 9) = 24.24, p < 0.005, partial eta squared = 0.729] but no main effect of visit [F(3, 27) = 0.743, p > 0.05, partial eta squared = 0.076] nor a significant visit × trial interaction [F(3, 27) = 0.374, p > 0.05, partial eta squared = 0.040].

Overall, then, both groups demonstrated wh-question comprehension—they correctly looked less at the named item during the what-question trials than during the "where" trials. Because we were interested in when wh-question understanding was first achieved, and because of the marginal visit by trial interactions in the TD group, we next investigated each group's looking patterns for the subject and object wh-questions at each visit. For the purpose of these analyses, one-tailed significance testing was used as we expected an effect in a specific direction, i.e., less looking to the named item during the what-test trials. In the TD group, children looked significantly less to the named item during the object-what-trials vs. where-trials at all visits [visit 3: t(16) = 1.90, p = 0.038; visit 4: t(16) = 3.68, p = 0.001; visit 5: t(16) = 4.09, p < 0.001; visit 6: t(15) = 6.26, p < 0.001; see **Figure 1A**]; they also looked significantly less at the named item during subject what-questions compared to where-questions starting at visit 4 [visit 3: t(16) = 1.27, p = 0.111; visit 4: t(16) = 3.75, p < 0.001; visit 5: t(16) = 3.57, p = 0.001; visit 6: t(15) = 8.52, p < 0.001; see **Figure 1B**].

The ASD group's performance was less consistent for object-what questions: while they appeared to show comprehension at visit 3, t(13) = 3.39, p = 0.002, this effect disappeared at visit 4, t(13) = 0.998, p = 0.168 and visit 5, t(11) = 1.05, p = 0.157, then re-emerged at visit 6, t(11) = 2.07, p = 0.031; see **Figure 2A**. Similarly, the ASD group's performance with subject-what questions varied across visits, reaching significance at visit 3 but then trending toward significance only at visit 5 [visit 3: t(13) = 2.30, p = 0.019; visit 4: t(13) = 0.807, p = 0.217; visit 5: t(11) = 1.58, p = 0.07; visit 6: t(10) = 0.857, p = 0.206; see **Figure 2B**].
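The per-visit comparisons reported above are within-subject contrasts between looking during "where" trials and looking during "what" trials. A minimal Python sketch of the paired t statistic is given below; the data are invented for illustration, the function name is hypothetical, and the one-tailed p-value (which would come from a t distribution with n − 1 degrees of freedom, typically via a statistics package) is deliberately left out.

```python
from math import sqrt
from statistics import mean, stdev
from typing import Sequence, Tuple


def paired_t(where_props: Sequence[float],
             what_props: Sequence[float]) -> Tuple[float, int]:
    """Paired t statistic and degrees of freedom for where-minus-what
    looking proportions, one pair of values per child."""
    if len(where_props) != len(what_props):
        raise ValueError("each child needs one value per trial type")
    diffs = [w - q for w, q in zip(where_props, what_props)]
    n = len(diffs)
    t_value = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t_value, n - 1


if __name__ == "__main__":
    # Invented proportions of looking to the named item for three children.
    t_value, df = paired_t([0.6, 0.7, 0.8], [0.4, 0.5, 0.5])
    print(f"t({df}) = {t_value:.2f}")
```

A positive t in this orientation corresponds to the predicted direction, that is, less looking to the named item during "what" trials.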

In sum, TD children displayed evidence of wh-question comprehension by 32 months of age (i.e., visit 4, if both subject and object questions are considered). The ASD group demonstrated significant comprehension at visit 3; however, the ASD group was unable to maintain this level of comprehension consistently for the rest of the visits (with re-emerging significant comprehension for object wh-questions at visit 6). When the two groups are compared by age and/or visit, there is a discrepancy in the point of wh-question comprehension attainment; however, it is important to compare the groups by language level as well. As **Table 1** shows, the two groups performed at equivalent language levels at visit 1, but by visit 3 they had diverged and the TD children were more advanced. We thus compared the language levels of the TD children at visit 4 and the children with ASD at visit 6; this comparison yielded no significant differences in receptive (ROWPVT) vocabulary [TDvisit4: M = 43.31, SD = 11.95; ASDvisit6: M = 48.28, SD = 19.35; t(28) = −0.859, p > 0.05] or their expressive (EOWPVT) vocabulary [TDvisit4: M = 31.17, SD = 9.89; ASDvisit6: M = 30.00, SD = 24.55; t(29) = −0.168, p > 0.05]. Thus, it appears that the TD and ASD groups achieved comprehension of wh-questions at similar language levels.

We next consider the number of children in each group at each visit who demonstrated wh-question comprehension. Difference scores were created by subtracting percent looking to the named item during "what" questions (combined across subject and object trials) from the same measure during "where" questions. Positive scores indicated better understanding of wh-questions because they show that children looked longer at the named item during the "where" questions than during the "what" questions; these children were designated "Comprehenders." All children who showed a difference in the wrong direction (i.e., less than zero) were designated "Non-comprehenders." A series of chi-square goodness-of-fit tests {visit 3: [χ²(1, n = 17) = 3.76, p = 0.05]; visit 4: [χ²(1, n = 17) = 5.88, p < 0.05]; visit 5: [χ²(1, n = 17) = 8.48, p < 0.005]; and visit 6: [χ²(1, n = 16) = 14.06, p < 0.001]} indicated that at all visits there were more Comprehenders than Non-comprehenders in the TD group. Within the ASD group, there were more Comprehenders than Non-comprehenders at visit 3 [χ²(1, n = 14) = 5.78, p < 0.05; see **Table 4**].
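As a rough illustration of the classification and test just described, the sketch below computes difference scores, labels each child, and runs a goodness-of-fit chi-square. Several details are assumptions rather than facts from the study: the looking percentages and the 13/4 split are invented, the expected counts are taken to be an equal split, a score of exactly zero is counted as non-comprehension, and the function names are hypothetical.

```python
import math

def classify(where_pct, what_pct):
    """Label each child from the difference score ('where' minus 'what')."""
    labels = []
    for where, what in zip(where_pct, what_pct):
        # Assumption: a difference of exactly zero is not counted as comprehension.
        labels.append("Comprehender" if where - what > 0 else "Non-comprehender")
    return labels

def chi_square_gof(n_comp, n_non):
    """Goodness-of-fit chi-square (df = 1) against an assumed equal split."""
    expected = (n_comp + n_non) / 2
    chi2 = ((n_comp - expected) ** 2 + (n_non - expected) ** 2) / expected
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

# Hypothetical percent looking to the named item (not study data).
where_pct = [62, 55, 70, 48, 66]
what_pct = [40, 58, 51, 30, 44]
labels = classify(where_pct, what_pct)

# Hypothetical split of 17 children: 13 Comprehenders, 4 Non-comprehenders.
chi2, p = chi_square_gof(13, 4)
```

With one degree of freedom the p-value has a closed form through the complementary error function, so no statistics library is needed for this case.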

To further investigate individual differences, Pearson's correlations were conducted between early language measures and concurrent or later wh-question comprehension scores (i.e., the difference scores). The five sets of language measures included the Vineland, Mullen, CDI, ROWPVT (receptive vocabulary) and EOWPVT (expressive vocabulary); a Bonferroni-corrected alpha of p = 0.005 was used as the threshold of statistical significance. As **Table 5** shows, in the TD group, children with higher wh-question comprehension scores at visit 6 had had larger vocabulary scores (CDI) at visits 2 and 3 (rs > 0.700, ps < 0.005). Children with greater expressive vocabulary (EOWPVT) at visits 5 and 6 also had higher wh-comprehension scores at visit 6 (rs > 0.700, ps < 0.005; see **Table 5**). Under this stricter significance level (p = 0.005), correlations between language measures and wh-question comprehension scores in the ASD group did not reach significance.
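The correlational approach can be sketched as follows. This is an illustration only: the scores are invented, the function names are hypothetical, and the number of comparisons behind the corrected alpha is not stated in the text. Dividing a family-wise alpha of 0.05 across 10 tests is one combination that yields 0.005 and is used here purely as an assumption.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def r_to_t(r, n):
    """t statistic (df = n - 2) for testing whether a correlation differs from zero."""
    return r * math.sqrt((n - 2) / (1 - r * r))

def bonferroni_alpha(family_alpha, n_tests):
    """Per-test alpha after a Bonferroni correction."""
    return family_alpha / n_tests

# Hypothetical early vocabulary scores and later wh-question difference scores.
cdi = [120, 200, 310, 150, 400, 260]
wh_diff = [2.0, 5.5, 9.0, 1.5, 12.0, 8.0]
r = pearson_r(cdi, wh_diff)

# t statistic implied by r = 0.70 in a group of 17 children.
t_for_r = r_to_t(0.70, 17)

# Assumed family of 10 tests at a family-wise alpha of 0.05.
alpha = bonferroni_alpha(0.05, 10)
```

The conversion from r to t shows why a stricter alpha is demanding in small samples: the same correlation must clear a higher critical t value when the per-test threshold drops.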

# Do Children's Early Comprehension of SVO Word Order and Social Competence Predict Their Later Comprehension of Wh-Questions?

We next analyzed the degree to which children's early understanding of canonical SVO word order, and their social competence, each independently predicted later wh-question comprehension. This kind of analysis is potentially perilous because of the small number of participants in each group (n = 15); moreover, eight children in this wh-question dataset were excluded from these regressions because their word order data were missing (e.g., because they did not look long enough at the video). Therefore, we increased our power by creating a larger dataset, which combined our participants and those of Goodwin et al. (2012; we also used the word order data first reported in Naigles et al., 2011). Combining the datasets is not automatically justified, because while the participant selection and procedures were identical, both the wh-question videos and the word order videos differed to some extent. However, our justifications for combining the datasets were as follows: First, as shown in **Table 6**, the language levels of the TD children in both datasets were equivalent at visits 1 and 6, and the language levels of the children with ASD in both datasets were also equivalent at visits 1 and 6. Second, whereas the characters for the two word order videos were different (girl and boy vs. horse and bird), the layouts themselves were almost identical, involving two animate characters and the five common transitive verbs and actions push, tickle, wash, hug, and ride. Third, whereas the wh-question stimuli were different across the videos (i.e., inanimate agents and patients engaged in hitting actions in Goodwin et al., 2012, vs. animate agents and patients engaged in five reversible actions in the current study), these layouts were also almost identical (i.e., transitive actions followed by wh-object questions, transitive actions followed by wh-subject questions, then the where-questions). Fourth, the pattern of findings from the wh-question videos was similar in both datasets, with the TD children in both groups displaying stable comprehension of wh-questions by 32 months of age, and the children with ASD in both groups demonstrating comprehension by 53–54 months of age (Goodwin et al., 2012). We believe these to be sufficient reasons for combining the datasets; however, we acknowledge that predictors of wh-question acquisition might vary according to animacy of the arguments (Tyack and Ingram, 1977; Philip et al., 2001). We defer further consideration of this point to the discussion section; for now, we consider the goal of discovering such predictors to warrant this exploratory analysis. Thus, the combined dataset for the word order-wh-question comparison now included 35 participants in the TD group and 31 in the ASD group.

TABLE 4 | Number of children showing comprehension or no comprehension of Wh-questions (subject- and object-questions combined).

ASD, autism spectrum disorder; TD, typically developing.

We conducted bivariate correlations between the word order measure, Vineland socialization and communication scores (separately and averaged), and subject and object wh-question comprehension scores at the relevant visits. In the TD group, subject-wh-question comprehension at visit 5 was positively correlated with early word order comprehension (r = 0.359, p < 0.05) while subject wh-question comprehension at visit 6 was positively correlated with the averaged Vineland communication and socialization scores at visits 1 and 2 (r = 0.373, p < 0.05). In addition, object wh-question comprehension at visit 5 was positively correlated with visit 2 Vineland communication scores (r = 0.352, p < 0.05) while object wh-question comprehension at visit 6 was positively correlated with visit 1 and visit 2 Vineland communication scores (r = 0.370, p < 0.05; r = 0.373, p < 0.05) as well as the averaged Vineland communication and socialization score (r = 0.372, p < 0.05).

In the ASD group, visit 3 subject-wh question comprehension was significantly correlated with visit 2 Vineland communication (r = 0.438, p < 0.05) and the averaged Vineland socialization and communication score (r = 0.394, p < 0.05); furthermore, object wh-question comprehension at visit 6 was positively correlated with early word order comprehension (r = 0.381, p < 0.05).

We then conducted two stepwise multiple regressions, with each group separately, to assess the degree to which early word order understanding and early social/pragmatic performance uniquely contributed to later wh-question comprehension. Thus, the models included the children's word order scores, their visit 1 Mullen visual reception scores, their visit 2 CDI (language) scores, their visit 1 and visit 2 Vineland communication scores, and the average of the Vineland communication and socialization scores. A measure of visual reception was included because this taps into children's non-verbal IQ, which is an important indicator of the children's ability to attend to and learn from their world. CDI scores from visit 2 were included to examine how an early vocabulary measure contributed to their later language processing ability, and the word order and Vineland communication and combined communication/socialization scores were early indicators of the children's grammatical and pragmatic abilities, respectively.
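To make the logic of stepwise selection concrete, the following is a minimal sketch of a forward-only variant. It should not be read as the procedure used in this study: the text does not report the software or the entry and removal criteria, so the F-to-enter threshold of 4.0, the function names, and all data values below are assumptions chosen for illustration. At each step the sketch adds the predictor that produces the largest F for the change in R², and it stops when no remaining predictor clears the threshold.

```python
def ols_r2(y, columns):
    """R-squared of an ordinary least squares fit of y on columns plus an intercept."""
    n = len(y)
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(rows[0])
    # Augmented normal equations: (X'X | X'y).
    aug = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
           + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(aug[r][c]))
        aug[c], aug[pivot] = aug[pivot], aug[c]
        for r in range(k):
            if r != c:
                factor = aug[r][c] / aug[c][c]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[c])]
    beta = [aug[i][k] / aug[i][i] for i in range(k)]
    mean_y = sum(y) / n
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    ss_resid = sum((yi - sum(b * x for b, x in zip(beta, r))) ** 2
                   for r, yi in zip(rows, y))
    return 1.0 - ss_resid / ss_total

def forward_stepwise(y, predictors, f_to_enter=4.0):
    """Add predictors one at a time while the F for the R-squared change is large enough."""
    n = len(y)
    selected, r2_old = [], 0.0
    remaining = dict(predictors)
    while remaining:
        best = None
        for name, col in remaining.items():
            cols = [predictors[s] for s in selected] + [col]
            r2_new = ols_r2(y, cols)
            df_resid = n - len(cols) - 1
            # Assumes the fit is not perfect (r2_new < 1), as with noisy data.
            f_change = (r2_new - r2_old) / ((1.0 - r2_new) / df_resid)
            if best is None or f_change > best[1]:
                best = (name, f_change, r2_new)
        if best[1] < f_to_enter:
            break
        selected.append(best[0])
        r2_old = best[2]
        del remaining[best[0]]
    return selected

# Hypothetical scores for eight children (not study data).
predictors = {
    "word_order": [1, 2, 3, 4, 5, 6, 7, 8],
    "visual_reception": [51, 49, 51, 49, 51, 49, 51, 49],
}
wh_score = [2.1, 3.8, 6.05, 8.1, 9.9, 12.2, 13.95, 15.9]
selected = forward_stepwise(wh_score, predictors)
```

In this invented example only the word order score enters the model, because the second predictor adds almost nothing once word order is accounted for. Full stepwise procedures also re-test entered predictors for removal, which this sketch omits.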

TABLE 5 | Cross-lagged and concurrent Pearson correlations between language measures and Wh-question comprehension for TD children across all visits (N = 17).

MSEL, Mullen Scales of Early Learning Composite; VABS, Vineland Adaptive Behavior Scales Composite; CDI, Communicative Development Inventories; ROWPVT, Receptive One-Word Picture Vocabulary Test; EOWPVT, Expressive One-Word Picture Vocabulary Test. \*p < 0.005, two-tailed; +p < 0.01.

In the TD group, the first regression model used visit 5 object-wh-question comprehension score as the outcome variable, yielding a significant model in which visit 2 communication scores were the only significant predictor, F(1, 30) = 4.97, p = 0.033 (see **Table 7**). The second regression model used visit 6 object-wh-question comprehension score as the outcome variable, yielding a significant model in which visit 1 communication scores were the only significant predictor, F(1, 30) = 6.94, p = 0.013 (see **Table 8**). The third regression model used visit 6 subject-wh-question comprehension score as the outcome variable, yielding two significant models. In the first model, the average of the Vineland communication and socialization scores was the significant predictor, F(1, 30) = 5.57, p = 0.025, whereas in the second model, the average of the Vineland communication and socialization scores and the word order scores each contributed significantly to the model, F(2, 29) = 5.66, p = 0.008 (see **Table 9**).

In the ASD group, the first regression model used visit 6 object-wh-question comprehension as the outcome variable, yielding a significant model in which children's word order scores were the only significant predictor, F(1, 27) = 4.40, p = 0.045 (see **Table 10**). The second regression model used visit 3 subject-wh-question comprehension as the outcome variable, yielding a significant model in which visit 2 Vineland communication scores were the only significant predictor, F(1, 25) = 6.86, p = 0.015 (see **Table 11**).

TABLE 6 | Comparison of TD and ASD participants from both cohorts at visits 1 and 6 on the MSEL and CDI.


MSEL, Mullen Scales of Early Learning; CDI, Communicative Development Inventories. \*p < 0.05.

TABLE 7 | Stepwise regression analysis for variables predicting visit 5 object-what question comprehension in TD children (N = 31).

TABLE 8 | Stepwise regression analysis for variables predicting visit 6 object-what question comprehension in TD children (N = 33).


V1, Visit 1.

Vineland composite, average of Vineland socialization and communication scores at visits 1 and 2.

TABLE 10 | Stepwise regression analysis for variables predicting visit 6 object-what question comprehension in children with ASD (N = 29).

TABLE 11 | Stepwise regression analysis for variables predicting visit 3 subject-what question comprehension in ASD children (N = 27).

V2, Visit 2.

# DISCUSSION

In this study, we addressed two main questions: (a) Viewing these new wh-question videos, which included animate agents and familiar actions and verbs, did children with ASD demonstrate comprehension of subject- and object-wh-questions at the same visit or language level as the TD children? (b) Did children's earlier grammatical knowledge (indexed by comprehension of SVO word order) and their social competence (indexed by their Vineland communication and socialization scores) predict their later comprehension of wh-questions? Addressing our first question, with these new videos, we found overall significant comprehension of wh-questions by both groups (i.e., a main effect of trial, with the children understanding that "where" questions asked them to look at the named item whereas subject and object "what" questions asked them to look away from the named item). More detailed scrutiny of performance at each visit, though, revealed that TD children demonstrated robust comprehension of both subject- and object-questions by 32 months of age (i.e., at visit 4) whereas children with ASD showed what looked like comprehension at visit 3, which disappeared for visits 4 and 5 and then re-emerged at visit 6 (i.e., at 53 months of age), most strongly for the object wh-questions. Because their performance was not consistent across the first three visits when they viewed the wh-question video, we are cautious about claiming wh-question comprehension in the ASD group before visit 6. The two groups thus achieved wh-question comprehension at different ages and visits; however, the language level of the ASD group at visit 6, when they showed comprehension of object-wh-questions, was quite similar to that of TD children at visit 4, the earliest visit when these children showed stable comprehension of both object-wh and subject-wh-questions.

Addressing our second question, we found that wh-question comprehension was related to both grammatical and social-communication abilities. That is, for both TD children and children with ASD, their comprehension of SVO word order as well as their Vineland social-pragmatic scores at earlier visits predicted their later performance on wh-question comprehension.

Our new wh-question videos were designed with the goal of making wh-question processing easier, because we included animate subjects (who are the typical agents in prototypical transitive actions) and verbs that were more familiar to both TD children and children with ASD. Therefore, we expected to find robust subject- and object-wh-comprehension performance in our TD group at visit 3 (the first time they saw the video), replicating Goodwin et al. (2012), and earlier subject and object wh-question comprehension in the ASD group than had been found by Goodwin et al. (2012). However, our results were, somewhat surprisingly, quite parallel to those of Goodwin et al. (2012), with the TD group showing marginal comprehension at visit 3 and robust comprehension at visit 4, and the ASD group still showing inconsistent comprehension across visits. Thus, the new videos did not elicit earlier evidence of comprehension from the ASD group. Replicating Goodwin et al. (2012), we found that the groups appeared to achieve good "what" question comprehension when their language levels were on par; that is, at visit 4 for the TD group and visit 6 for the ASD group. Interestingly, though, we did not replicate the correlations that Goodwin et al. (2012) observed, in the ASD group, between vocabulary levels and wh-question comprehension; possibly, this discrepancy indicates that the children who achieved good wh-question performance with the Goodwin et al. (2012) video were the ones who knew the verb "hit," whereas no such association was observed with the current videos because all verbs were familiar. Taken together, these findings suggest that using familiar verbs and animate agents did not change the basic findings of Goodwin et al. (2012); namely, that wh-questions are difficult for children with ASD. Even though children were only required to look at the correct answer, they still demonstrated impairments in their understanding. We suggest that these findings support the argument that these children's difficulties with wh-questions have a grammatical origin.

We also investigated the degree to which children's variance in their early grammatical and/or social-pragmatic performance might predict their later variance in subject- and object-wh-question comprehension. Indeed, the regressions suggested that wh-question comprehension is related to both grammatical and social-pragmatic factors. The "grammatical-origins" argument is supported because the children's performance on the earlier word order task strongly predicted performance on later wh-question comprehension, for both the TD and ASD groups (albeit at different visits and for different wh-questions). These relationships held even when non-verbal cognition and general vocabulary level were controlled; therefore, they are not indicators of general ability to perform well in cognitively or linguistically demanding tasks. We suggest, instead, that the children's competence at understanding the canonical English SVO word order helped them become more efficient in subsequently processing wh-questions, in that having stable representations of SVO helped them understand that the moved wh-word in a subject-wh or object-wh-question maps onto the grammatical subject or object of the verb, respectively. These findings provide evidence for the continuity of grammatical knowledge in both young TD children and children with ASD, such that they might use early-developing syntactic knowledge to process the grammatical role of wh-words.

These findings extend those of Naigles et al. (2011), who demonstrated that children with ASD who were faster at understanding SVO sentences were also better at using such transitive frames to conjecture that the novel verbs in them were causative; i.e., doing syntactic bootstrapping. That correspondence was thus between understanding SVO sentences with familiar verbs and learning verbs in SVO sentences with novel verbs; i.e., both tasks involved essentially the same sentence forms. Our current findings extend Naigles et al. (2011) because we have demonstrated correspondences between understanding canonical SVO frames at early visits and understanding non-canonical SVO frames at later visits. That is, the children in the current study needed to understand that the fronted wh-word "stood for" an NP, and to know that the NP-trace was either in subject or object position. Moreover, when the NP-trace was in object position, the surface word order was OVS; thus, the correspondence we observed in the ASD group between SVO comprehension at visits 1–2 and object-wh-question comprehension at visit 6 suggests that the children with ASD are not perseverating on one specific word order and had some knowledge of the abstract relationship between sentences that had different surface orders. This observed correspondence thus supports the argument that the wh-question deficit in children with ASD has a grammatical origin.

However, our findings also support the argument that wh-question impairments in children with ASD derive from pragmatic impairments. That is, the TD group's and the ASD group's comprehension of wh-questions at the later visits was predicted by their social-pragmatic abilities at the earlier visits, in that children with better performance on wh-question comprehension were reported by their parents to have better communication and socialization skills on the Vineland. Social-pragmatic abilities might play a role in the development of wh-question understanding in both general and specific ways. In general terms, children who are more attuned to their social environment might simply pay more attention to the language their parents use, which would include wh-questions (see also Goodwin et al., 2015). In specific terms, children who are more aware of the social conventions about when and how to ask wh-questions, and who pay attention to their parents' pointing to objects when they (the parents) ask questions, would be expected to better understand the referents of wh-questions. When children are more attuned to their social environment, they can better understand the focus and interpretation of how questions are used and formulated by their family members. Better social-pragmatic abilities would enable children to understand the different functions of wh-questions and the particular contexts within which they are used, which can strengthen their knowledge and understanding of wh-questions.

Limitations of this study include participant characteristics, our choice of social-pragmatic measures and a lack of a joint attention measure, and the wh-question video itself. First, the children with ASD in this study were receiving ABA as their primary intervention, so the generalizability of these findings to the ASD population as a whole is limited. Second, we are limited in our ability to further distinguish syntactic challenges from pragmatic challenges, as this study did not analyze children's production data of wh-questions or their joint attention skills; that is, we are limited in our knowledge about whether children in our study also showed deficits in their wh-question production, indicating a pragmatic challenge (however, note that Goodwin et al., 2012, found delays in both production and comprehension of wh-questions). Also, joint attention would be a key predictor to investigate in future studies because it taps into pragmatic skills in children, and it would therefore be important to examine whether joint attention skills are related to later syntactic development. Perhaps, if their joint attention is impaired, then we might also see pragmatic aspects of their wh-question production being impaired. Third, it is possible that we made the wh-question task harder for children with ASD by using two animate characters engaged in causative actions. As has been shown in prior research, a prototypical action is an animate object performing an action on an inanimate object (Slobin, 1982). Perhaps our inclusion of animate patients in the current wh-question video made wh-question processing more challenging, possibly even for both groups (but see Gagliardi et al., 2016, who found good wh-question comprehension in TD toddlers who viewed videos with animate patients). In line with this, another limitation is that we combined the wh-question video with animate characters with the wh-question video with inanimate characters in our prediction analyses, and it is possible that there are different predictors for animate characters and inanimate characters. For example, Tyack and Ingram (1977) and Philip et al. (2001) found that typical children's acquisition of "who" and "what" questions emerged at different ages. It is important to point out that our study controlled for that by asking "what" questions throughout. It is possible that TD children in our study did not show early stable comprehension of wh-questions as their peers did in Goodwin et al. (2012) because we used animate characters with "what" questions. We believe that this would not be an issue for children with ASD because of their pragmatic impairment; however, this remains an open question.

In future work, it would be interesting to discover the extent of the impairment in wh-questions in other languages, and to investigate whether the deficits in understanding such wh-questions also hold for languages that do not require wh-movement. Members of our group have used Goodwin et al.'s (2012) video to examine wh-question comprehension in South Korean children with ASD, with the preliminary finding that, even though Korean wh-words remain in situ, Korean 4-year-olds with ASD nonetheless show poorer wh-question comprehension than their language-matched TD peers (Park et al., 2016). This is an important step toward determining which grammatical components of wh-questions are most challenging for children with ASD. Additionally, we concluded that the children with ASD showed comprehension at visit 6 rather than at visit 3 because they did not show comprehension at visits 4 and 5; however, this U-shaped curve is puzzling and future studies are needed to replicate this effect.

In conclusion, the IPL paradigm has elicited comprehension of wh-questions in 2-year-old TD children; in contrast, children with ASD demonstrated delayed and somewhat inconsistent understanding of these same wh-questions. Changing the actions to more familiar ones did not help children with ASD demonstrate earlier comprehension compared to previous results (Goodwin et al., 2012). Our findings suggest that wh-questions present linguistic challenges to children with ASD that go beyond issues of stimuli. They lend support to both "grammatical-origins" and "pragmatic-origins" hypotheses concerning the wh-question deficit in children with ASD: The "grammatical-origins" argument is supported because performance on an early grammatical competence task was strongly associated with performance on later wh-question comprehension for both groups. The "pragmatic-origins" argument is also supported because wh-question comprehension was associated with children's earlier social-communication scores, i.e., children with better social abilities were later more able to consistently comprehend wh-questions. Thus, the current study shows that wh-question challenges seem to be related to both grammatical and pragmatic challenges in children with ASD.

Finally, our finding that both linguistic and social-pragmatic factors are implicated in wh-question acquisition in children with ASD is consistent with the recent report of Naigles et al. (2016), who found that children with ASD's vocabulary and joint attention skills each independently predicted their propensity to reverse personal pronouns. These studies provide the first demonstrations that both specifically linguistic and generally social factors are influential in the language challenges of children with ASD, and we encourage more researchers to include measures that tap into multiple domains when they are investigating the language of these individuals. We suggest that attributing the language challenges of children with ASD to "only" linguistic or social bases masks the intricate coordination that children, even children with ASD, perform among multiple domains of knowledge during language development.

# ETHICS STATEMENT

This study was carried out in accordance with the recommendations of the University of Connecticut, Institutional Review Board with written informed consent from all subjects. All subjects gave written informed consent in accordance with the Declaration of Helsinki. The protocol was approved by the UConn IRB.

# AUTHOR CONTRIBUTIONS

LN and DF designed the original data collection. MJ and LN worked together on the questions, design, coding, analyses, and write up of the current study, with some input from DF.

# FUNDING

This research was funded by the National Institute on Deafness and Other Communication Disorders (NIH-DCD) [grant number R01 2DC007428].

# ACKNOWLEDGMENTS

We are grateful to Anthony Goodwin for helping to launch this project, to Rose Jaffery, Janina Piotroski, and Andrea Tovar for their assistance in data collection, and to undergraduates at the UConn Child Language Lab for their assistance in coding. Portions of this research were presented at the International Meetings for Autism Research and the Biennial Meeting of the Society for Research in Child Development, and we are grateful for comments and suggestions received at those venues. Finally, we are extremely grateful to the families who have participated in this research.

# REFERENCES

disorder: joint attention, imitation, and toy play. J. Autism Dev. Disord. 36, 993–1005. doi: 10.1007/s10803-006-0137-7


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Jyotishi, Fein and Naigles. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Syntactic and Story Structure Complexity in the Narratives of High- and Low-Language Ability Children with Autism Spectrum Disorder

Eleni Peristeri <sup>1</sup> \*, Maria Andreou<sup>2</sup> and Ianthi M. Tsimpli <sup>3</sup>

<sup>1</sup> Language Development Lab, Department of English Studies, Aristotle University of Thessaloniki, Thessaloniki, Greece, <sup>2</sup> Department of English, School of Arts and Humanities, University of Cologne, Cologne, Germany, <sup>3</sup> Department of Theoretical and Applied Linguistics, University of Cambridge, Cambridge, United Kingdom

Although language impairment is commonly associated with autism spectrum disorder (ASD), the Diagnostic and Statistical Manual no longer includes language impairment as a necessary component of an ASD diagnosis (American Psychiatric Association, 2013). However, children with ASD and no comorbid intellectual disability struggle with some aspects of language whose precise nature remains unclear. Narratives have been extensively used as a tool to examine lexical and syntactic abilities, as well as pragmatic skills in children with ASD. This study contributes to this literature by investigating the narrative skills of 30 Greek-speaking children with ASD and normal non-verbal IQ, 16 with language skills in the upper end of the normal range (ASD-HL), and 14 in the lower end of the normal range (ASD-LL). The control group consisted of 15 age-matched typically-developing (TD) children. Narrative performance was measured in terms of both microstructural and macrostructural properties. Microstructural properties included lexical and syntactic measures of complexity such as subordinate vs. coordinate clauses and types of subordinate clauses. Macrostructure was measured in terms of the diversity in the use of internal state terms (ISTs) and story structure complexity, i.e., children's ability to produce important units of information that involve the setting, characters, events, and outcomes of the story, as well as the characters' thoughts and feelings. The findings demonstrate that high language ability and syntactic complexity pattern together in ASD children's narrative performance and that language ability compensates for autistic children's pragmatic deficit associated with the production of Theory of Mind-related ISTs. Nevertheless, both groups of children with ASD (high and low language ability) scored lower than the TD controls in the production of Theory of Mind-unrelated ISTs, modifier clauses and story structure complexity.

Keywords: autism, language ability, narratives, sentence complexity, microstructure, macrostructure

### Edited by:

Stephanie Durrleman, Université de Genève, Switzerland

### Reviewed by:

Arhonto Terzi, Technological Educational Institute of Patras, Greece; Vikki Janke, University of Kent, United Kingdom

\*Correspondence: Eleni Peristeri eperiste@enl.auth.gr

### Specialty section:

This article was submitted to Cognitive Science, a section of the journal Frontiers in Psychology

Received: 28 November 2016 Accepted: 06 November 2017 Published: 20 November 2017

### Citation:

Peristeri E, Andreou M and Tsimpli IM (2017) Syntactic and Story Structure Complexity in the Narratives of High- and Low-Language Ability Children with Autism Spectrum Disorder. Front. Psychol. 8:2027. doi: 10.3389/fpsyg.2017.02027

# INTRODUCTION

Although the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5) has de-emphasized language ability in the diagnosis of Autism Spectrum Disorders (ASD) by removing the criteria of "age-of-onset" and "no history of language delay" for Asperger's syndrome, language impairment is commonly associated with ASD. In fact, language delay is the most frequent cause of initial referral to specialist services for children with ASD (McMahon et al., 2007). Language ability varies considerably among diagnosed individuals, with 30% lacking minimal spoken language despite access to intervention (minimally verbal children; Kjelgaard and Tager-Flusberg, 2001; Anderson et al., 2007; Tager-Flusberg and Kasari, 2013), while highly-verbal children with ASD tend to exhibit considerable heterogeneity in their language abilities (e.g., Charman, 2004; Kjellmer et al., 2012). Receptive language is usually lower than expressive language in highly-verbal children with ASD (e.g., Hudry et al., 2010), though it is sometimes anecdotally reported that some school-aged children demonstrate relatively good receptive skills, despite their low expressive skills (Kasari et al., 2013). Within this framework, a distinction is often made between children with ASD who have age-appropriate language skills and those who have a language impairment similar to that found in children with Specific Language Impairment (SLI), but whose vocabulary levels and non-verbal cognition are intact (e.g., Rapin and Dunn, 2003; Tager-Flusberg and Joseph, 2003; Tager-Flusberg, 2006). For instance, Tek et al.'s (2014) longitudinal study revealed similar language growth patterns of children with ASD and good language skills with their typically-developing (TD) age-matched peers in a variety of language measures, including grammatical morphemes, vocabulary and sentence complexity, in contrast to an age-matched group with ASD and low verbal skills that exhibited developmental delays across the same language areas.

The long-recognized variation in language ability in ASD suggests that the autistic language phenotype can be partly, yet consistently dissected on the basis of the children's verbal skills. Recently, this assumption has been formalized in the study by Wittke et al. (2017) that designated a specific, grammatically-impaired subgroup of SLI in a large ASD sample. This group performed in the normal range on non-verbal IQ and vocabulary while still showing a specific deficit in grammatical skills, in contrast with a group of language-impaired children with ASD who had significantly below average non-verbal IQ and overall deficits in vocabulary and grammar. Moreover, some grammatical errors were more frequent in the grammatically-impaired group than the group with global language impairment. Consequently, Wittke et al.'s (2017) study has highlighted a structural deficit in a subgroup of children with ASD that was due neither to low non-verbal skills nor to the severity of ASD symptoms, as previously suggested (Harper-Hill et al., 2013). Wittke et al.'s (2017) study focused exclusively on children's grammatical impairment at the single-morpheme level [by reference to Brown's (1973) 14 grammatical morphemes]. To the best of our knowledge, no study has so far investigated the effect of language ability on different aspects of narrative production in children with ASD. Thus, which narrative functions are affected in children with ASD who vary in their language ability but have average non-verbal and verbal IQ, remains unexplored. In the present study, we recruited two groups of children with ASD and normal non-verbal IQ whose language ability, reflected in verbal IQ and expressive vocabulary, was within the normal range, yet, on extreme ends of the scales (high vs. low). We focus on the question of whether such disparity affects narrative production at the microstructural and macrostructural levels of analysis.

The field has now recognized that the variation in the language abilities of children with ASD is partly related to syntactic skills. Narratives have been successfully used in studies with children with ASD to elucidate differences which are not apparent or clearly defined using standardized tools alone. The narratives of children within the spectrum have been shown to include syntactically less complex sentences, omitted morphemes and increased rates of coordination (e.g., Roberts et al., 2004; Eigsti et al., 2007; Marinis et al., 2013; Norbury et al., 2014). However, whether language ability in ASD can affect the production of specific types of subordinate clauses in narratives remains largely unknown. Moreover, most previous studies focus on English-speaking children with ASD. Other languages, in which morphosyntactic features of subordination are richer than in English, have barely been studied. Furthermore, the contribution of pragmatics in the production of some types of subordinate clauses, such as adverbial or relative clauses, has not been specifically addressed in the examination of narratives produced by individuals with ASD. For example, adverbial clauses provide cues to establish coherence relations between the events of a story. As such, these clauses may be particularly challenging for children with ASD due to the pragmatic deficit that defines autism (e.g., Naigles and Chin, 2015; de Marchena and Eigsti, 2016). One of the aims of the present study is to examine the extent to which variation in language ability in children with ASD affects the use of modifier clauses, i.e., adverbial and relative clauses (Hughes et al., 1997). In this respect, the compensatory role of good language ability in planning and producing a coherent and complete plotline for the story is also examined. 

Children with ASD have been shown to encode the characters' emotions and thoughts, referred to as +Theory of Mind-related Internal State Terms (ISTs) (henceforth, +ToM-related ISTs) less often than TD children (e.g., Siller et al., 2014). The link between subordination and performance in ToM tasks has been commonly found in children with ASD. For instance, Tager-Flusberg (2000) reported that children with ASD experienced greater difficulty than age-matched intellectually-impaired children in extracting the embedded clause of communication verbs in wh-questions (e.g., Why did Bobby say Dad put the cake away?). Moreover, performance on these questions was a strong predictor of children's false belief reasoning abilities. Nevertheless, the effect of high- vs. low-verbal skills in children with ASD on the use of ±ToM-related ISTs and on story structure complexity has not been examined as yet.

The present study aims to fill this gap to contribute to the question of whether good language abilities can compensate for the pragmatic deficit in ASD and its effects on microstructural and macrostructural aspects of narratives. Specifically, all the children with ASD that participated in the present study had non-verbal IQ scores within the normal range of the Wechsler Intelligence Scale for Children (Wechsler, 1992; WISC-III; Greek adaptation and standardization by Georgas et al., 2003). Variation in the group was defined through language ability, which was measured by a standardized expressive vocabulary test and the verbal IQ tests of WISC-III (Wechsler, 1992). We measured syntactic complexity by calculating the number of complex sentences in each child's narrative. Complex sentences are those which include more than one clause which can be coordinated or subordinated to the matrix clause. The language of the experimental and the control group is Greek, a language with richer morpho-syntactic distinctions in subordinate clauses than English. Two types of subordinate clauses are examined in narrative microstructure: (i) verb-complement clauses, i.e., clauses selected as complements of a verb in the higher clause, and (ii) modifier clauses, including temporal or causal adverbial clauses and relative clauses modifying subject or object noun phrases in the sentence. One crucial difference between complement and modifier clauses is that complement clauses are selected by the verb, hence their use presupposes lexical and syntactic knowledge (Grimshaw, 1979; Noonan, 1985; Haegeman, 2006). Modifier clauses, on the other hand, are not selected. Instead, they are used for semantic and pragmatic purposes, such as to establish cohesion between events in a narrative by providing causal or temporal information, or by elaborating on the referentiality of the noun phrase (Fox and Thompson, 1990; Vieu et al., 2005). In essence, the use of modifier clauses presupposes morpho-syntactic but also pragmatic skills which guide the conceptual structure and planning of the propositions encoded in the narrative. By distinguishing between complement and modifier subordinate clauses in the narratives of children with ASD, we can examine the relative contribution of language as opposed to pragmatics in ASD and the compensatory role of language on pragmatics. The examination of narrative macrostructure, as instantiated in the use of ±ToM-related ISTs and in the complexity of story structure, contributes to the same question: higher use of ±ToM-related ISTs and/or story structure complexity for the group of children with ASD and high language skills as opposed to those with low language skills would speak in favor of the compensatory role for language in pragmatics.
In contrast, if the two groups of children perform similarly in those macrostructure measures as well as in the use of subordination, it would be concluded that the pragmatic deficit cannot be masked or overcome by good language skills.

# Grammar in ASD: Evidence from Narrative Production

Research on the use of grammar in autism has often relied upon the analysis of children's narratives. Narrative production shows differences among participants with ASD, which depend on language ability: less verbally-able participants tend to produce shorter and syntactically simpler utterances than language-matched controls (Tager-Flusberg, 1995; Tager-Flusberg and Sullivan, 1995; Capps et al., 2000). Findings are less consistent when children with ASD are compared to age-matched TD participants. A number of studies (Losh and Capps, 2003; Diehl et al., 2006; Novogrodsky, 2013; Norbury et al., 2014) found no differences between individuals with ASD and their TD peers with similar language abilities; however, Stirling et al. (2017) did report that children with ASD lagged behind TD children in syntactic complexity in their written narratives. Bishop (2003) also reports that children with ASD aged between 6 and 10 years who had typical non-verbal abilities but low scores on expressive and/or receptive language measures produced fewer complex sentences than their TD peers. Although language ability operationalized in terms of expressive and/or receptive language abilities has been used as a qualifier of ASD children's narrative performance (e.g., Norbury et al., 2014; Suh et al., 2014), not much is known about the different types of subordination which may be particularly challenging for children with autism. The current study examines whether different levels of language skills in children with ASD differentially affect the use of complement vs. modifier clauses.

Apart from investigating the microstructure of narratives, i.e., lexical diversity and morpho-syntax, research on narratives and autism has focused on the analysis of macrostructure. This level includes the linguistic encoding of the characters' affective and cognitive states, as well as the encoding of reference which requires appropriate pronominal form-function mappings as the discourse unfolds. An effective narrator not only has to structure the story in an intelligible way so that the listener understands the setting, characters, events, and outcomes of the story (Rumpf et al., 2012), but also needs to identify the motivations and reactions of the characters that embed socially-oriented goals (Stein and Glenn, 1979). The specific domain has been key to systematically revealing the pragmatic deficit in ASD. Pragmatic difficulties have been considered the hallmark of ASD and a domain that all children within the spectrum, even those with age-appropriate structural language abilities and intelligence, struggle to master to various degrees (Rapin and Dunn, 1997; Landa, 2000; Bishop and Baird, 2001; Kjelgaard and Tager-Flusberg, 2001; Tager-Flusberg et al., 2005; Stefanatos and Baron, 2011). Children with ASD have been shown to produce narratives with more ambiguous referencing (Loveland et al., 1990; Tager-Flusberg, 1995; Manolitsi and Botting, 2011; Novogrodsky, 2013; Norbury et al., 2014; Suh et al., 2014), fewer dialogic interactions among the story characters (Stirling et al., 2017), fewer ISTs referring to the story characters' emotions (e.g., Siller et al., 2014), and inappropriate use of language within context (e.g., Losh and Capps, 2003; Collet-Klingenberg and Franzone, 2008) compared to TD children.

Prior work on language ability in ASD has often focused on its compensatory role in children's ToM deficit. Much of the relevant work has tested prosody and scalar implicatures in sentential contexts (McCann et al., 2007; Pijnacker et al., 2009). Moreover, certain language skills but also abstract thinking have been proposed as factors contributing to success in ToM tasks (e.g., Eisenmajer and Prior, 1991; Happé, 1994; Steele et al., 2003; Milligan et al., 2007; Durrleman et al., 2016). Crucially, ASD children's use of affective terms in narratives has been found to correlate significantly with their performance on ToM tasks, i.e., tasks that tapped into the children's ability to represent their own or other people's mental states, like pretend-play (Blanc et al., 2005), and false belief (Baron-Cohen et al., 1985). Although the causality of the link between ToM and language in autism (or typical language development) is not defined as yet, findings suggest that ASD children's failure to consider the perspectives of others has consequences on the use of language expressing mental states. Thus, affective terms are used considerably less frequently by children with ASD during storytelling compared to TD controls despite ASD children's intact meta-representational skills (Baron-Cohen et al., 1985; Baron-Cohen, 1989). The present study aims to investigate the extent to which the language ability of children with ASD, whose verbal skills are within the normal range but fall at either its high or its low end, may compensate for the ToM deficit in narratives.

# Subordinate Clauses in Greek

As already mentioned, one of the aims of the present study is to investigate whether microstructure in ASD, and in particular syntactic complexity in narrative production differs in children whose language abilities lie at the higher and lower end of the normal range. Subordination, and in particular, complement and modifier subordinate clauses are also examined.

Complement clauses are further distinguished in terms of the complementizers which introduce them. Non-interrogative complement clauses in Greek can be introduced by the complementizers na, oti/pos, and pu [examples (1)–(3) below]. Whereas in English the distinction in complementation is between finite and non-finite complement clauses which can be introduced with an overt or zero complementizer (e.g., "I know that/Ø he left," "I want Ø to leave"), in Greek the use of a complementizer is obligatory. Morphological distinctions in complementizers are based on Mood (subjunctive na vs. indicative oti/pos), and factivity (factive pu vs. non-factive oti/pos). The complementizer na is a Mood marker introducing subjunctive clauses (Holton et al., 1997), which are the closest translational equivalents to infinitival clauses in English (Agouraki, 1990; Tsimpli, 1990; Roussou, 1994; Giannakidou, 2009). The complementizer oti introduces indicative clauses, while the complementizer pu is used to introduce complements of psychological predicates like lipame "be-sad," metanjono "regret," herome "be-glad" (Christidis, 1986; Varlokosta, 1994). In terms of differences in the feature complexity of these three complementizers, the subjunctive one is the least complex with respect to finiteness features as it is the only complementizer in Greek which can introduce clauses with underspecified Tense features. Notably, na-clauses are the earliest to develop in typically-developing Greek children; the verb forms found in na-clauses are among the earliest forms used by Greek-speaking children even in matrix contexts (Varlokosta, 1994; Tsimpli, 2005). The present study examines the frequency of use of indicative oti/pos and factive pu vs. the subjunctive na complementizers in the children's narratives.


In addition to complement clauses, we examine adverbial and relative clauses which modify an event and a noun phrase, respectively. Adverbial and relative clauses are not selected and therefore optional (Haegeman, 1994). The function and the positioning of adverbial clauses have been argued to be mainly motivated by their pragmatic function. Adverbial clauses usually provide temporal or causal information which modifies the event of the main clause (Haegeman, 2006, 2010) [examples (4)–(5) below]. In connected speech, adverbial clauses establish cohesive links between the events of a story, thus contributing to the complexity of narration (Shapiro and Hudson, 1991; Andreou, 2015). Moreover, they enrich the propositional content of the complex sentence in which they occur, and assume organizational functions at the pragmatics level (Bestgen and Vonk, 2000; Vieu et al., 2005). Relative clauses [exemplified in (6) below] are embedded within nominal phrases and are thought to function like predicates or modifiers of a head noun (e.g., in "The man that I saw," the relative clause "that I saw" modifies the noun phrase "the man").


According to corpus-studies (Fox and Thompson, 1990; Biber et al., 1998), the discourse function of relative clauses is to ground the head entity with respect to discourse information and to elaborate further on its referential properties. The modifier status of both adverbial and relative clauses allows us to group them together under the modifier clause category. Although all subordinate clauses presuppose lexical and morphosyntactic skills (microstructural properties), modifier clauses also build on pragmatic organization skills. The distinction drawn between complement and modifier clauses receives support from studies on monolingual typical development, showing that complement clauses emerge earlier and the timing difference between complement and modifier clauses is considerable (Diessel and Tomasello, 2001; Diessel, 2009).

# THE CURRENT STUDY

This study builds on previous research examining the narrative performance of children with ASD (e.g., Perner et al., 1987; Wellman and Woolley, 1990; Lewis et al., 1994; Sullivan et al., 1994; Tager-Flusberg, 1995; Novogrodsky, 2013; Norbury et al., 2014; Siller et al., 2014; Stirling et al., 2017) and addresses the role of language ability on performance in narrative microstructure and macrostructure. In order to capture variability in the language profiles of the children with ASD, we used standardized measures of both verbal IQ (VIQ) based on the children's performance in the verbal scales of the Greek version of WISC-III (Wechsler, 1992), and expressive vocabulary (Vogindroukas et al., 2009; adaptation from Renfrew, 1997). This targeted recruitment allowed us to identify two subgroups of children in the spectrum, those with expressive vocabulary and verbal IQ scores in the higher end of the normal scale (henceforth, ASD-HL), and those in the lower end of the normal scale (henceforth, ASD-LL).

The study's research questions and hypotheses are the following:

**Question 1.** Will the difference in the language ability of the two groups of children with ASD affect frequency of use of complex (i.e., coordinate and subordinate) clauses?

**Hypothesis 1.** Based on previous research (e.g., Tager-Flusberg, 1995; Tager-Flusberg and Sullivan, 1995; Capps et al., 2000; Kjelgaard and Tager-Flusberg, 2001; Bishop, 2003; Eigsti et al., 2007) showing that children with ASD and concomitant language impairment produce fewer syntactically complex sentences compared to TD children, we expect that the ASD-LL group will exhibit fewer syntactically complex sentences than ASD-HL and TD children.

**Question 2.** Will the difference in the language ability of the two groups of children with ASD affect frequency of use of the different types of complementizers?

**Hypothesis 2.** We hypothesize that differences in language ability will affect the diversity of complementizers. Specifically, we expect oti-("that") and pu-(factive "that") complements to be particularly compromised for the ASD-LL group only, due to the fact that these complementizers select verb forms which are fully-specified for Tense and Aspect in contrast to na-(subjunctive) complements which are temporally underspecified (Agouraki, 1990; Tsimpli, 1990; Roussou, 1994). We do not expect differences to emerge between ASD-HL and TD children in the use of oti-("that") and pu-(factive "that") clauses due to the fact that the former group of children is predicted to compensate for the computational demands of complementation by means of their high language ability.

**Question 3.** Will children with ASD and higher language skills perform better than their ASD-LL peers in the use of adverbial and relative clauses?

**Hypothesis 3.** Assuming that the production of adverbial and relative clauses draws more heavily on discourse and pragmatics relative to verb-complement clauses (Grimshaw, 1979; Fox and Thompson, 1990; Vieu et al., 2005), we predict lower frequency of use of adverbial and relative clauses for both groups of children with ASD relative to TD children. If, on the other hand, ASD-HL children recruit high language skills as a compensatory mechanism for their pragmatic deficit during retelling, we expect this group to produce higher rates of modifier clauses than their ASD-LL peers and similar rates to their TD peers.

**Question 4.** Will children with ASD and higher language skills perform better than ASD-LL children in the use of ±ToM-related ISTs?

**Hypothesis 4.** Based on previous research (De Villiers, 2000; Tager-Flusberg and Joseph, 2005; Schick et al., 2007; Lind and Bowler, 2009; De Villiers and De Villiers, 2014; Durrleman and Franck, 2015; Durrleman et al., 2016) showing that children with autism tend to over-rely on the structural representation of complement clauses to compensate for their mentalizing deficit, we expect ASD-HL children to produce higher rates of +ToM-related ISTs compared to the ASD-LL group and similar rates to their TD peers supported by their good language skills. On the other hand, since this compensatory process is not essential to non-mentalizing expressions, such as −ToM-related ISTs, both groups of children with ASD are expected to score lower than TD children in this category.

**Question 5.** Will children with ASD and higher language skills perform better than their ASD-LL peers in the macrostructure measure of story structure complexity?

**Hypothesis 5.** Previous research shows that children with ASD have difficulty structuring narratives in a coherent manner irrespective of age and non-verbal abilities (Loveland et al., 1990; Losh and Capps, 2003; Diehl et al., 2006; Collet-Klingenberg and Franzone, 2008; Stirling et al., 2017). If encoding story structure draws on the child's language skills, then the ASD-HL is expected to outperform the ASD-LL group and perform similarly to the TD group. On the other hand, if retelling a story is more likely to tap into discourse management skills and world knowledge instead of formal language skills alone, we predict that high language ability will have a minimal effect on story structure complexity across the two groups of children with ASD. If this holds true, both groups of children with ASD are expected to perform lower than TD children in this measure.

Finally, in order to explore possible interactions between microstructural and macrostructural measures in the children's narratives, partial correlation analyses between the specific variables are carried out, after controlling for the children's verbal IQ and expressive vocabulary scores.

# MATERIALS AND METHODS

# Participants

A sample of 30 monolingual Greek-speaking children with ASD [mean age: 9.2 yrs. (SD: 1.9), age range: 6.1–12.4, all male] was tested. The children were recruited from mainstream state primary schools' inclusion classrooms. They all met criteria for ASD based on expert clinical judgment of the child's social-adaptive functioning conducted by a child psychiatrist, which was confirmed by the Autism Diagnostic Interview-Revised (Lord et al., 1994). Children were assessed with the VIQ and performance IQ (PIQ) scales of the Greek version of WISC-III (Wechsler, 1992; Greek adaptation and standardization by Georgas et al., 2003). All the children had a PIQ score of 83 or above [mean: 108.9 (SD: 15.3), range: 83–142]. Language ability was additionally tested with an expressive vocabulary test (Vogindroukas et al., 2009) standardized for 3–10 year old Greek-speaking monolingual children. This word-finding task includes 50 pictures depicting commonplace objects which each child was required to name. Testing was discontinued if the child failed to respond correctly in five consecutive trials. Each correct naming was given one point, so that the maximum score was 50.

The VIQ score of the Wechsler Intelligence Scale for Children has been used in prior studies as an indicator of the level of impairment in ASD children's language functioning (Lincoln et al., 1988, 1995; Happé, 1994; Bavin et al., 2014), but it has been insufficient for addressing variability on its own (e.g., Dawson et al., 2007; Nader et al., 2016). As such, expressive vocabulary was also used as a screening measure to characterize ASD children's language profile.

In line with a number of previous studies (e.g., Semel et al., 1987; Marton and Schwartz, 2003; Reilly et al., 2004; Falcaro et al., 2007; Norbury et al., 2014, among many others), children with language scores 1.5 or more standard deviations (SD) below the mean were considered as having low language ability. The VIQ score of the rest of the children with ASD was very close to that of the TD children (see **Table 1**). By using a cut-off of 81 in VIQ (i.e., 1.5 SD below the VIQ mean of all the children with ASD) and a cut-off of 33 in the expressive vocabulary task (i.e., at least 1.5 SD below the mean expressive vocabulary score of all the children with ASD), the children with ASD formed a high- and a low-language ability group: 16 high-language ability children with ASD (ASD-HL; mean age: 9.2 (SD: 1.8), age range: 6.7–12.4) and 14 low-language ability children with ASD (ASD-LL; mean age: 9.1 (SD: 2.1), age range: 6.1–12.0). The children with ASD were age-matched with 15 TD monolingual Greek-speaking children (TD; mean age: 9.3 yrs. (SD: 1.7), age range: 7.3–12.0). The TD children were selected so that they had normal hearing, no speech, emotional, or behavior problems, as well as no observed neurological, articulation, and phonological deficits.
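The cut-off logic described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function names are hypothetical, and the rule that a child counts as low-language only when both scores fall at or below their cut-offs is an assumption, since the text does not say how a child with one low and one high score would be handled.

```python
from statistics import mean, stdev

def cutoff(scores, k=1.5):
    """Return mean - k * SD (sample SD) for a list of scores."""
    return mean(scores) - k * stdev(scores)

def assign_group(viq, vocab, viq_cut=81, vocab_cut=33):
    """Label a child ASD-LL or ASD-HL from VIQ and expressive vocabulary.

    The default cut-offs (81 and 33) are the values reported in the text.
    ASSUMPTION: both scores must be at or below their cut-off for ASD-LL.
    """
    if viq <= viq_cut and vocab <= vocab_cut:
        return "ASD-LL"
    return "ASD-HL"
```

For example, `assign_group(75, 30)` yields `"ASD-LL"` under this assumed rule, while `assign_group(105, 42)` yields `"ASD-HL"`.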

The parents of the children gave written informed consent in accordance with the Declaration of Helsinki, and anonymity of the children and their families was protected. The study with the TD children was carried out in accordance with the recommendations in the Guide for Research Study Approval of the Greek Institute for Educational Policy. The parents of the children with ASD gave written consent on the administration of the tasks and on the dissemination of the results for research purposes in strict accordance with the recommendations in the Guide for the Differential Diagnosis and Intervention for Children with Special Educational Needs of the Greek Ministry of Education.

TABLE 1 | Number of children, age, expressive vocabulary, verbal IQ, and performance IQ by Group.

ASD-HL, high-language children with Autism Spectrum Disorder; ASD-LL, low-language children with Autism Spectrum Disorder; TD, typically-developing children; M, mean; SD, standard deviation.

A one-way ANOVA analysis with age as the dependent variable indicated that there were no significant differences across groups in age, F (2, 44) = 0.75, p = 0.928. The three groups differed significantly in both their expressive vocabulary ability, F (2, 44) = 25.559, p < 0.001, η<sup>2</sup> = 0.238, and VIQ scores, F (2, 44) = 206.728, p < 0.001, η<sup>2</sup> = 0.855. Subsequent post-hoc Bonferroni tests showed that the ASD-LL group scored significantly lower in expressive vocabulary and in VIQ than both the ASD-HL and TD children (p < 0.001 for all differences). There was no significant difference between the ASD-HL and the TD group in either expressive vocabulary (p = 0.919) or VIQ scores (p = 0.371) (see **Table 1**). Furthermore, the three groups did not differ in their performance IQ scores [F (2, 44) = 0.428, p = 0.654, η<sup>2</sup> = 0.141].
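For readers less familiar with the statistics reported here, the following minimal sketch shows how an F statistic and η<sup>2</sup> are obtained in a textbook one-way ANOVA. The function name and the toy scores are hypothetical; this is not the authors' analysis script, and it does not attempt to reproduce the values above.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within, eta_squared) for a list of score lists."""
    all_scores = [x for g in groups for x in g]
    grand_mean = sum(all_scores) / len(all_scores)
    group_means = [sum(g) / len(g) for g in groups]

    # Between-groups sum of squares: group size times squared distance
    # of each group mean from the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Within-groups sum of squares: squared distance of each score
    # from its own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    eta_squared = ss_between / (ss_between + ss_within)
    return f_stat, df_between, df_within, eta_squared

# Toy data (illustrative only, not from the study).
f_stat, df_b, df_w, eta_sq = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
```

In this sketch η<sup>2</sup> is the proportion of total variance attributable to group membership, which is one common definition; statistical packages may report partial η<sup>2</sup> instead.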

# Narrative Production Task

### Materials

Children's oral retellings were elicited by using a single picture story from the Edmonton Narrative Norms Instrument (ENNI; Schneider et al., 2005) that has been designed to collect narrative data from children aged 4–9 through storytelling. The story used in the present study was the A3 Giraffe/Elephant story, which includes eight pictures and consists of three complete episodes (see Appendix A for the pictures that provided the prompts for the narrative, and Appendix B for the original story that the children had to retell).

A minimum of 15 verb clauses was a prerequisite for including a child's narrative in our sample. Moreover, to see if the three groups were comparable in terms of the length of their narratives we ran a one-way ANOVA, with the results showing that there is no group effect in narrative length, F (2, 44) = 1.001, p = 0.376, η<sup>2</sup> = 0.213, which was measured in verb clauses (ASD-HL: 25 (SD: 4.8); ASD-LL: 22.9 (SD: 5.1); TD: 25.5 (SD: 5.6)). Furthermore, following relevant literature in the field (Tweedie and Baayen, 1998; McCarthy, 2005) we used a square-root transformation in order to ensure that the narratives produced by the children could be compared.

### Procedure

Each child was tested individually at a location most convenient for the child's parents (i.e., either at the child's home or at a private diagnostic center). The child listened to the story through headphones while viewing the pictures on a computer screen, two pictures per slide (and, on one slide, a single picture). While the child listened to the story, a female adult unfamiliar with the purposes of the study was present in the room. Once the story finished, the child viewed all 13 pictures on a single slide on the computer screen and was asked to retell the story to the examiner, who entered the room only after the child had listened to the whole story and sat opposite the child, unable to see the pictures on the screen.

### Transcription and Scoring of Narratives

Children's retellings were audiotaped and transcribed by the first author. One fourth of this sample (25%) was randomly selected and re-transcribed by the second author. Transcripts were then compared word-for-word, with the comparison reaching 99% agreement. Examples of the transcripts of the narratives of a TD, an ASD-HL, and an ASD-LL child are cited in Appendix C.

Both microstructural and macrostructural properties of each child's narrative transcript were scored manually. For microstructure, the scores for the following linguistic categories were calculated: (1) lexical diversity, i.e., number of different types of content words divided by the total number of content-word tokens; (2) syntactic complexity, i.e., number of complex (i.e., coordinate and subordinate) sentences divided by the total number of simple and complex sentences; (3) subordination index, i.e., number of subordinate clauses divided by the total number of complex sentences; and, (4) types of subordination, which include total counts of verb-complement clauses, and modifier, i.e., adverbial and relative clauses in each child's narrative. Complement clauses were further split into two different categories based on the type of the complementizer used to introduce them, i.e., subjunctive na complement and indicative oti-(that)/pu-(that-factive) complement clauses (Mastropavlou and Tsimpli, 2011). Due to the fact that the ASD-LL group produced no pu-(that-factive) complement clauses, while ASD-HL children produced very few instances of pu-clauses (Mean: 0.8) in their narratives, we opted to merge oti-(that) and pu-(that-factive) complement clauses and analyze them as a single category due to their shared requirement for Tense and Aspect specification. Examples of types of complement clauses (see examples 7–9) produced by the children with ASD in their narratives are cited below:

Complement clauses:

(7) **na**-complement.

Theli **na** vutisi mesa stin pisina.

wantNONPAST.3SG. to diveSUBJ.PAST.3SG. in the swimming pool

"(She) wants to dive in the swimming pool."

(8) **oti**-complement.

Idhe **oti** to aeroplanaki epese stin pisina.

sawPAST.3SG that the aeroplane fellPAST.3SG. in the swimming pool.

"(She) saw that the aeroplane fell in the swimming pool."

(9) **pu**-complement.

Harike **pu** pire to aeroplanaki.

was-happyPAST.3SG. that tookPAST.3SG. the aeroplane

"(She) was happy that she took back the aeroplane."
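The three ratio measures defined for microstructure (lexical diversity, syntactic complexity, and the subordination index) can be sketched as below. This is an illustrative sketch only: the function and argument names are hypothetical, the counts are assumed to come from manual coding as in the study, and returning 0.0 when a denominator is zero is an assumption that the text does not address.

```python
def microstructure_scores(content_words, n_simple, n_complex, n_subordinate_clauses):
    """Compute the three ratio measures from manually coded counts.

    content_words: list of content-word tokens in one narrative
    n_simple: number of simple sentences
    n_complex: number of complex (coordinate and subordinate) sentences
    n_subordinate_clauses: number of subordinate clauses
    """
    def ratio(numerator, denominator):
        # ASSUMPTION: an empty denominator yields 0.0 rather than an error.
        return numerator / denominator if denominator else 0.0

    return {
        # Different content-word types divided by content-word tokens.
        "lexical_diversity": ratio(len(set(content_words)), len(content_words)),
        # Complex sentences divided by all (simple + complex) sentences.
        "syntactic_complexity": ratio(n_complex, n_simple + n_complex),
        # Subordinate clauses divided by complex sentences.
        "subordination_index": ratio(n_subordinate_clauses, n_complex),
    }

# Toy narrative (illustrative only).
scores = microstructure_scores(
    ["giraffe", "elephant", "plane", "giraffe"],
    n_simple=6, n_complex=4, n_subordinate_clauses=2,
)
```

With these toy counts the sketch gives a lexical diversity of 0.75, a syntactic complexity of 0.4, and a subordination index of 0.5.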

For macrostructure, the following scores were calculated: (1) diversity of +ToM-related ISTs, i.e., number of unique lexical items expressing positive or negative emotion (e.g., sad, angry, happy) and mental verbs (such as think, wonder) divided by the total count of +ToM-related tokens; (2) diversity of −ToM-related ISTs, i.e., number of unique perceptual (such as see, hear), physiological (such as thirsty, hungry), and communication (such as shout, say) terms divided by the total count of −ToM-related tokens (Gagarina et al., 2012; Tsimpli et al., 2016). The third macrostructural measure included in the analyses was that of story structure complexity (Story Grammar Model; Stein and Glenn, 1979). Each of the three episodes of the story consisted of a Goal of a main character (MC), an Attempt that the MC makes to reach the goal, and the Outcome of the MC's Attempt. The child was awarded three points in each episode for the correct production of Goal, Attempt and Outcome, two points for producing two elements, the Outcome being required in combination with the Goal or the Attempt, one point for producing Goal and Attempt only, and zero points for expressing only one element. Finally, two points were also awarded for the correct reproduction of the place and the time (i.e., the Setting), and one point for introducing the four characters of the story. The maximum score for story structure complexity was 15.
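The story structure rubric can be sketched as a small scoring function. The function names are hypothetical, and two readings are assumptions rather than statements from the text: the Setting is entered as a score from 0 to 2, and one point is counted per character introduced (0 to 4), which is the reading under which three episodes (9 points), the Setting (2 points) and the characters (4 points) add up to the stated maximum of 15.

```python
def episode_points(goal, attempt, outcome):
    """Score one episode from three booleans following the rubric in the text."""
    if goal and attempt and outcome:
        return 3  # complete episode
    if outcome and (goal or attempt):
        return 2  # Outcome combined with Goal or Attempt
    if goal and attempt:
        return 1  # Goal and Attempt only
    return 0      # a single element, or none

def story_structure_score(episodes, setting_points, characters_introduced):
    """Total story structure complexity score.

    episodes: list of (goal, attempt, outcome) boolean triples, one per episode
    setting_points: 0-2 (ASSUMPTION: place and time scored together up to 2)
    characters_introduced: 0-4 (ASSUMPTION: one point per character)
    """
    episode_total = sum(episode_points(*episode) for episode in episodes)
    return episode_total + setting_points + characters_introduced

# A child who fully reproduces all three episodes, the setting, and all characters.
full_score = story_structure_score([(True, True, True)] * 3, 2, 4)
```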

The analysis of both microstructural and macrostructural variables was conducted by the first author, and inter-rater reliability checks were conducted by the second author on 15 (33%) of the 45 coded transcripts, selected randomly with equal representation of diagnostic (ASD vs. TD) and language ability level (ASD-HL vs. ASD-LL) criteria. Inter-rater reliability was 95.8%, and all discrepancies were resolved through discussion.

# RESULTS

# Group Comparisons: Microstructural Variables

**Table 2** presents the raw data (i.e., total counts) for the microstructural variables. Specifically, we present lexical diversity, the numbers of simple and complex (coordinate, subordinate) clauses, syntactic complexity and the subordination index, the number of na-(subjunctive) and oti-(that)/pu-(that-factive) complementizers, and the numbers of verb-complement and modifier (i.e., adverbial and relative) clauses for each of the three groups. Comparisons among the three groups were analyzed using the chi-squared test. In addition, the data were examined by estimating a Bayes factor (BF) using the Bayesian Information Criterion (Wagenmakers, 2007). The BF compares the fit of the data under the null hypothesis with the fit under the alternative hypothesis, so that BF < 1 implies substantial evidence for the null hypothesis, according to which there are no group differences in the dependent variable tested, and BF > 1 implies substantial evidence for the alternative hypothesis, which states that there are differences in group performance.
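The text does not give the formula used for the Bayes factor. A minimal sketch of the BIC approximation described by Wagenmakers (2007) is shown below, oriented so that values below 1 favor the null hypothesis, as in the reading given above. The function name is hypothetical, and the BIC values are assumed to come from null and alternative models fitted elsewhere.

```python
import math


def bf10_from_bic(bic_null: float, bic_alt: float) -> float:
    """Approximate the Bayes factor for H1 over H0 from two BIC values.

    Uses BF10 ~ exp((BIC_null - BIC_alt) / 2): a lower BIC for the
    alternative model pushes the result above 1, and a lower BIC for
    the null model pushes it below 1.
    """
    return math.exp((bic_null - bic_alt) / 2.0)
```

With equal BIC values the function returns exactly 1, meaning the data do not discriminate between the two hypotheses.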

In addition to the presentation of the data from the three experimental groups, all the Tables provide information on the microstructure and macrostructure of the ENNI (A3) story that the children listened to and were asked to retell. To explore the question whether the narrative output of the children differed from the original story on microstructure and macrostructure, we undertook a series of qualitative comparisons targeting the specific narrative properties; to this end, and for each dependent variable, we used proportions by dividing each child's raw scores by the total number of verb clauses. This approach allowed us to detect specific micro- and macrostructural domains in which ASD-HL or/and ASD-LL, as well as TD children's performance deviated substantially from the original story which was used as the baseline. The data is reported in section Comparisons between Group Scores and the Original Story, **Table 4** below.

TABLE 2 | Group means (and SDs) of total counts and proportions (%) for microstructural measures.

ASD-HL, high-language children with Autism Spectrum Disorder; ASD-LL, low-language children with Autism Spectrum Disorder; TD, typically-developing children; ENNI, Edmonton Narrative Norms Instrument (Schneider et al., 2005).

χ<sup>2</sup> analyses showed that the groups differed significantly on the number of coordinate and subordinate clauses (χ<sup>2</sup> > 4.32, ps < 0.05), as well as on the syntactic complexity and subordination indices (χ<sup>2</sup> > 9.38, ps < 0.005). Further chi-square analyses revealed that the ASD-HL group produced more coordinate clauses than the rest of the groups (ps < 0.05), while both groups with ASD tended to produce fewer subordinate clauses than the TD group (ps < 0.05). The syntactic complexity of the narratives of ASD-LL children was lower than that of TD children (p < 0.05), while the subordination index of both groups with ASD was lower than that of the TD group (ps < 0.05).

Regarding the types of complementizers used across the three groups, there was a significant group effect for the oti-(that)/pu-(that-factive) complementizer (χ<sup>2</sup> = 7.32, p < 0.005), stemming from the lower production of these complementizers by the ASD-LL group compared to the other two groups (ps < 0.001). Finally, chi-square analyses showed that the groups differed significantly on both verb-complement and modifier clauses (χ<sup>2</sup> > 25.0, ps < 0.05). Further chi-square analyses revealed that the ASD-LL group produced significantly fewer verb-complement clauses than both ASD-HL and TD children (ps < 0.05). Both groups with ASD produced fewer modifier (i.e., adverbial and relative) clauses than TD children (ps < 0.05).

# Group Comparisons: Macrostructural Variables

**Table 3** displays the total counts for the macrostructural variables, i.e., +ToM-related ISTs, -ToM-related ISTs, and story structure complexity.

χ<sup>2</sup> analyses showed that the groups differed significantly on the number of both +ToM-related and −ToM-related ISTs (χ<sup>2</sup> > 8.87, ps < 0.05), as well as on story structure complexity (χ<sup>2</sup> = 18.79, p < 0.001). Further chi-square analyses revealed that the ASD-LL children produced fewer +ToM-related and −ToM-related terms than the rest of the groups (ps < 0.05), and that ASD-HL children produced fewer −ToM-related terms than TD children (p < 0.001). Finally, both groups with ASD scored lower than the TD group in story structure complexity (ps < 0.001).

# Comparisons between Group Scores and the Original Story

**Table 4** presents the percentages of children in each group (TD, ASD-HL, ASD-LL) that have scored lower than the original story's baseline rates of microstructural and macrostructural measures (see **Tables 2**, **3**). The overwhelming majority of the children in each group (>75% in each group) tended to score lower in lexical diversity and subordination, as well as in the use of na-(subjunctive) clauses and –ToM-related ISTs compared to the original story. Both groups with ASD (>69%), and especially ASD-LL children (86%), tended to produce fewer modifier clauses than the number of modifier clauses of the story. On the other hand, it was only ASD-LL children (>85.7%) that tended to produce stories with lower syntactic complexity, fewer verb-complement and oti-(that)/pu-(that-factive) complement clauses, as well as fewer +ToM-related ISTs than the baseline rates of the corresponding measures in the ENNI story.

TABLE 3 | Group means (and SDs) of total counts for macrostructural measures.

ASD-HL, high-language children with Autism Spectrum Disorder; ASD-LL, low-language children with Autism Spectrum Disorder; TD, typically-developing children; ENNI, Edmonton Narrative Norms Instrument (Schneider et al., 2005).

TABLE 4 | Percentages of children per group that scored lower than the ENNI story's baseline rates of microstructural and macrostructural measures.

TABLE 5 | Percentages of children per group that scored higher than the ENNI story's baseline rates of microstructural measures.

ASD-HL, high-language children with Autism Spectrum Disorder; ASD-LL, low-language children with Autism Spectrum Disorder; TD, typically-developing children.

**Table 5** presents the percentages of children in each group (TD, ASD-HL, ASD-LL) that have scored higher than the original story's baseline rates in simple and coordinate clauses (see **Table 2**). Almost half of the children in the ASD-HL and ASD-LL groups tended to produce more simple clauses than the original story, while the large majority of children in all three groups (>93.3%) produced more instances of coordination than the number of coordinate clauses included in the ENNI story.


ASD-HL, high-language children with Autism Spectrum Disorder; ASD-LL, low-language children with Autism Spectrum Disorder; TD, typically-developing children.
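The comparison with the original story reported in this section can be sketched in two steps: each child's raw count is converted to a proportion of that child's verb clauses, and the share of children falling below the story's own proportion is then computed per group. The function names and the example values are hypothetical; the sketch assumes a strict "lower than" comparison, which the text does not specify.

```python
def proportion(raw_count: int, total_verb_clauses: int) -> float:
    """Raw count expressed as a proportion of a narrative's verb clauses."""
    return raw_count / total_verb_clauses if total_verb_clauses else 0.0


def percent_below_baseline(child_proportions: list, story_proportion: float) -> float:
    """Percentage of children whose proportion is strictly below the story's."""
    if not child_proportions:
        return 0.0
    below = sum(1 for p in child_proportions if p < story_proportion)
    return 100.0 * below / len(child_proportions)


# Hypothetical group of four children: (subordinate clauses, verb clauses)
group = [proportion(2, 20), proportion(5, 20), proportion(1, 10), proportion(6, 20)]
share = percent_below_baseline(group, story_proportion=0.25)
```

The same function yields the Table 5 style percentages if the comparison is reversed to count children scoring above the baseline.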

# Correlations between Microstructural and Macrostructural Variables

**Table 6** displays the results of the partial correlation analyses that focused on the exploration of possible associations between children's scores in microstructure (i.e., syntactic complexity, simple, complex, coordinate, and subordinate clauses, and types of subordinate clauses, i.e., complement and modifier), and their scores in macrostructure (i.e., +ToM-related, −ToM-related ISTs, story structure complexity), while controlling for verbal IQ and expressive vocabulary.

The results of the partial correlation analyses showed that the use of +ToM-related ISTs was associated with syntactic complexity, complex, subordinate, and complement clauses. On the other hand, the use of −ToM-related ISTs was only associated with the use of complement clauses. Story structure complexity was found to be positively correlated with the use of subordinate and modifier clauses, i.e., adverbials and relatives, while it was inversely associated with the use of simple and coordinate clauses.
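A partial correlation that controls for two covariates, here standing in for verbal IQ and expressive vocabulary, can be computed from pairwise Pearson correlations with the standard recursive formula. The sketch below is a generic illustration with hypothetical function names, not the authors' analysis script; it assumes no variable is constant and no covariate is perfectly correlated with another variable.

```python
import math


def pearson(x: list, y: list) -> float:
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_first_order(r_xy: float, r_xz: float, r_yz: float) -> float:
    """Correlation of x and y with one covariate z partialled out."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def partial_corr_two_covariates(x: list, y: list, z1: list, z2: list) -> float:
    """Correlation of x and y controlling for both z1 and z2."""
    # Step 1: partial out z1 from each pair among x, y and z2.
    r_xy_z1 = partial_first_order(pearson(x, y), pearson(x, z1), pearson(y, z1))
    r_xz2_z1 = partial_first_order(pearson(x, z2), pearson(x, z1), pearson(z2, z1))
    r_yz2_z1 = partial_first_order(pearson(y, z2), pearson(y, z1), pearson(z2, z1))
    # Step 2: partial out z2 using the first-order partials.
    return partial_first_order(r_xy_z1, r_xz2_z1, r_yz2_z1)
```

The result does not depend on the order in which the two covariates are removed, which offers a simple sanity check on any implementation.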

# DISCUSSION

In this study, we investigated the narrative retelling skills in children with ASD of high- and low-language abilities. In line with previous narrative studies (e.g., Solomon, 2004; Eigsti et al., 2007; Manolitsi and Botting, 2011; King et al., 2013; Terzi et al., 2014), we manipulated two distinct layers in narrative production: (i) microstructure, that is the intra-sentential level of narratives comprising the word or sentence complexity level of production and the relationships of the elements within sentences (Cherney, 1998), and (ii) macrostructure, which refers to the global supra-sentential level of discourse and the links among event representations that the narrator has to establish in order to build up a coherent story (Cherney et al., 1998). In the present study, we used independent language ability measures to group the children with ASD into two discrete subgroups, namely, high- and low-language ability falling in the higher and lower end of the normal range of language ability, respectively. Subsampling within the children with ASD in terms of their VIQ and expressive vocabulary scores provided the opportunity to investigate the extent to which language ability in autism is related to the children's syntactic complexity in their narratives. This procedure also allowed us to explore whether high language ability in children with ASD can boost their ability to form subordinate clauses and to attribute mental states to the story's characters. Finally, subsampling within the children with ASD enabled us to examine the role of language ability in the successful encoding of story structure and of relational information between events and characters in the story.

TABLE 6 | Partial correlations between microstructural and macrostructural variables.

N = 45. \*p < 0.05, \*\*\*p < 0.001.

The data demonstrate that syntactic complexity measured in terms of the frequency of use of coordinate clauses is linked to language ability in autism. Relative to the TD group, the ASD-LL group showed significantly lower syntactic complexity, i.e., lower rates of coordinate and subordinate clauses, while there was no difference between TD and ASD-HL children on the same measure. Crucially, the ASD-LL children were the only group whose scores in syntactic complexity showed considerable deviation from the syntactic complexity pattern established in the original story, consistent with the hypothesis that ASD children's low language ability had a detrimental effect on the syntactic complexity of their narratives. A number of narrative studies report that children with ASD use a more restricted range of complex syntactic structures (Stirling et al., 2017) or less complex morpho-syntax (Tager-Flusberg, 1995; Eigsti et al., 2007; Marinis et al., 2013) in their (oral and written) narratives relative to TD children. While the children with ASD and high language abilities in the present study did not differ from their TD peers on the syntactic complexity measure, precisely the opposite obtained for the subordination index; the analyses of subordination along with the comparisons with the original story revealed that both groups with ASD tended to produce significantly fewer subordinate clauses than the TD group. As such, the analyses conducted separately for coordination and subordination provide a nuanced picture of the complexity of the narratives of the two groups of children with ASD, since the difference between ASD-HL and ASD-LL children in syntactic complexity appears to be attributed to ASD-HL children's higher coordination rather than subordination use, relative to their ASD-LL peers. 
Furthermore, about half of the children in each group with ASD exhibited more frequent use of simple and coordinate clauses to establish reference to the events of the story relative to the frequency pattern of simple clauses established in the original story. These structural differences between ASD and TD children may reflect a general strategy in autism to retell the story through linear, coordinated (vs. hierarchical) chains of successive events in order to safely communicate the core event structure of the story (see also Marinis et al., 2013 for similar findings).

Lexical diversity was a relative strength for both groups with ASD, since neither differed from the TD group in this narrative measure. This pattern contrasts with ASD-LL children's expressive vocabulary score, which was significantly lower relative to both ASD-HL and TD children. The fact that the expressive vocabulary score of ASD-LL children did not align with their performance on lexical diversity is not surprising, given that the requirements on word use in each task are different; hence, the two tasks may draw on different resources and processing constraints. In particular, word-finding in object naming is not supported by context and, as such, lexical access may be more demanding than in the retelling context, in which children could either recall lexical information from the story they had just listened to or rely on recalling the episodes and the context in which words were embedded in the story. Interestingly, these results are in line with Kambanaros and van Steenbrugge's (2013) study on the lexical retrieval abilities of children with SLI, which shows that picture naming performance was a weak predictor of the children's retrieval abilities for nouns in connected speech. Other studies also call into question the relative strength of expressive vocabulary over lexical diversity as a method for describing the lexical characteristics of the language production of children with disorders (Silverman and Bernstein Ratner, 2002; Moyle et al., 2007). However, we would like to entertain an alternative explanation for the discrepancy observed between one-word expressive vocabulary and lexical diversity in the performance of ASD-LL children. Looking into the errors that ASD-LL children made in the standardized expressive vocabulary task, we notice that the low mean score did not result from "no response" data. In fact, "no response" was the least frequent type of error found. 
Instead, ASD-LL children tended to produce more semantic errors on average than their TD and ASD-HL peers. For instance, the ASD-LL group tended to produce semantically-related words (e.g., "spy" instead of "binoculars," "snow" instead of "igloo"), which indicates dysfunctional lexical access rather than a limited total conceptual vocabulary.

Group comparisons on the different types of complementizers further highlight the effect of language ability on ASD children's syntactic options at the microstructural level. The ASD-LL children tended to use oti-(that)/pu-(that-factive) complement clauses at a significantly lower rate relative to the rest of the experimental groups; in fact, ASD-LL children's rates of oti-(that)/pu-(that-factive) complements substantially differed from the rates of occurrence of these complementizers in the ENNI story, with more than 80% of ASD-LL children producing fewer oti-(that)/pu-(that-factive) complement clauses compared to the input story.

We argue that this pattern of performance is due to ASD-LL children's difficulty with coordinating syntactic and lexical information being encoded in oti-(that)/pu-(that-factive) complement clauses: apart from being specified for Tense, these types of complement clauses include verbs with full specification for Aspect. An important issue in typical language development concerns the timing of acquisition of aspectual distinctions on verbs (Tsimpli et al., 2010; Kaltsa, 2012; Konstantzou et al., 2013), with relevant evidence showing that children converge on consistent, adult-like aspectual verb markings quite late mainly due to the fact that aspect marking is constrained by several syntax-semantics interface-conditioned factors, such as the aspectual class of the verb, morphological aspect and argument structure, i.e., the presence/absence of object and aspectual adverbials. Thus, aspectual distinctions must be construed from the context, presumably incurring more computational cost. We suggest that ASD-LL children may have been able to employ morpho-syntactic information to encode Tense features on the verbs of oti-(that)/pu-(that-factive) complement clauses, yet, the processing load for encoding Aspect was higher, thus resulting in lower use of the specific types of complement clauses (see Zhou et al., 2015 for similar findings). On the other hand, the less demanding morphosyntactic specification requirements of subjunctive (na-) clauses may account for the lack of group differences in production patterns for subjunctive complement clauses in which the options of tense and aspect verb forms are limited. Thus, both groups with ASD irrespective of language ability should be able to produce them with less strain on computational resources for language production. 
Nevertheless, the fact that all three groups tended to produce significantly fewer subjunctive (na-) clauses than those included in the original story may indicate that factors other than linguistic ones may be involved such as the depiction of the events of the story. Future work is required to incorporate the link between the use of picture-based narratives and type of subordination used in children's syntactic choices.

In addition to examining the effects of language ability on measures of syntactic complexity in microstructure, the present study has investigated the compensatory role of ASD children's language ability on the production of different types of subordinate clauses. According to our results, modifier (i.e., adverbial and relative) clauses were considerably fewer in both groups with ASD relative to TD peers. We argue that the difference between TD children and both groups with ASD, irrespective of language ability, indicates that morphosyntactic skills are necessary but not sufficient for the production of modifier clauses, as these clauses presuppose the ability to encode coherence relations in story structure. In this respect, complement clauses are similar to modifier clauses on lexical and morphosyntactic grounds, but modifier clauses, unlike complement clauses, additionally require good discourse management skills. Interestingly, evidence in favor of modifier clauses being at the interface of syntax with pragmatics is offered by the correlation analyses of the present study. Modifier clauses are significantly positively correlated with children's scores in story structure complexity, a macrostructural measure reflecting children's ability to organize the events of the story into a pragmatically coherent whole. Overall, our findings on subordination suggest that high language ability in autism is insufficient to support the production of modifier clauses, whose production critically lies at the interface between syntax and pragmatics.

ASD-LL children's use of subordinate clause types in the present study affords us an opportunity to track possible similarities and differences between the retelling data of this study and Mastropavlou and Tsimpli's (2011) study with spontaneous speech data by Greek-speaking children with SLI. Both studies show that the rates of na-clauses were considerably higher than those of modifier clauses, thus demonstrating potential overlap between the computational processes required for the production of na-clauses in SLI and ASD-LL children. The frequent omissions of na in Mastropavlou and Tsimpli (2011), in contrast to the present study where omission of complementizers was rare, can be taken as evidence of more severe language impairment in the children with SLI than in the ASD-LL children of the present study. Further similarities between the two studies may be traced in the use of oti-(that)/pu-(that-factive) complement clauses, which were produced at considerably lower rates by children with SLI in Mastropavlou and Tsimpli's (2011) study in comparison to their age- and language-matched TD peers. Though the data is not directly comparable due to the different design and the age of the children recruited by the two studies, factivity seems to have caused high semantic or pragmatic (integration) costs for both ASD-LL and SLI children, hence the low production of that-factive complement clauses. These results are consistent with our hypothesis that different types of subordinate clauses impose distinct processing costs on children with ASD and that their computation is crucially linked to the children's language ability.

Considering macrostructure, our results show that +ToM and −ToM-related ISTs patterned differently across the two groups of children with ASD. High language ability was found to boost both IST-types, yet, more so for +ToM ISTs, bridging the distance between ASD-HL and TD children. Indeed, ASD-HL (along with TD) children were considerably more likely to produce units of information that involved characters' thoughts and feelings, i.e., +ToM-related terms, relative to ASD-LL children. Similar evidence was obtained from the comparison with the original story, since all the ASD-LL children failed to reach the +ToM-related IST frequency pattern established in the ENNI story. Crucially, the use of +ToM-related terms was found to be positively correlated with the children's rates of complement clause use. On the other hand, both groups of children with ASD were found to score significantly lower than their TD peers in the use of −ToM-related ISTs, though the ASD-HL group tended to score higher than ASD-LL children in the specific category. The discrepancy observed between +ToM and −ToM-related ISTs in children with ASD suggests that the compensatory effect of language in the domain of mental state attribution was IST-specific. As +ToM-related ISTs are prototypically used to describe mental states, i.e., the internal feelings and thoughts of others, children with autism need to recruit advanced linguistic knowledge to gain access to others' mentalistic behavior (Frith et al., 1994; Tager-Flusberg, 2000). This suggests that high language ability in autism boosts children's ability to use mental state terms predominantly when such states are relevant to the characters' feelings and thoughts.

Finally, language ability did not appear to be of critical importance to ASD children's performance in story structure complexity. The performance of both groups with ASD in story structure was equally low (i.e., 4.8 out of 15 points) and fell far below that of their TD peers. We suggest that building up the structure of a story in retelling, i.e., re-computing the story's discourse model including the setting, characters, events and outcomes, as well as the perspectives and motivations of the main characters of the story, indexed a highly demanding process of pragmatic enrichment triggered by context. Given that difficulties with pragmatic processing have been universally attested across individuals with ASD, irrespective of their age or level of functioning (Rapin and Dunn, 1997; Tager-Flusberg et al., 2005), we suggest that in situations of such demands the pragmatic deficit affects autistic children's story structure abilities over and above any language ability level. In other words, even high language skills cannot compensate for the pragmatic deficit evinced in story structure complexity. As such, in the comparison between the two macrostructural measures, i.e., +ToM-related ISTs and story structure complexity, story structure complexity seems not to be open to compensation from language skills, unlike +ToM-related ISTs. The higher contribution of pragmatics compared to language skills in developing complex story structure is independently corroborated from TD bilingual children who despite their lower language proficiency compared to monolingual children, produce more complete and elaborate stories in narrative retellings (Tsimpli et al., 2016).

Microstructure measures showed that low language ability mainly compromises the use of +ToM-related ISTs and syntactic complexity in the children with ASD, especially with respect to the use of oti-(that)/pu-(that-factive) clauses. These effects of low language ability, however, may not be unique to ASD as they do not primarily rely on good pragmatic skills, i.e., the area in which a deficit is expected in ASD. Thus, other deficits associated with low language ability, such as low working memory capacity, may be responsible for the effects on these aspects of microstructure and macrostructure of narratives. Although we have no evidence to speak for or against this proposal, we believe that the narrative pattern exhibited by ASD-LL children may be uniquely associated with this group. More specifically, the asymmetry between high non-verbal and low verbal IQ (see **Table 1**) may be responsible for the pattern observed, leaving aside whatever pragmatic deficit characterizes ASD. To investigate this further we examined each ASD-LL child's verbal and performance IQ scores in WISC-III (Wechsler, 1992) following Crawford's single case approach (DISSOCS software; Crawford and Garthwaite, 2005). The difference between verbal and performance IQ scores for each ASD-LL child was tested for significant deviation from the verbal vs. non-verbal profile in the TD group. For every ASD-LL child, the difference between verbal and performance IQ scores was significantly larger (p ≤ 0.01) than the corresponding difference in the TD group, suggesting an uneven profile of intellectual abilities in the ASD-LL group. Interestingly, Lincoln et al.'s (1998) meta-analytic review of 23 published studies focusing on the intellectual abilities of children with autism shows that a verbal IQ < performance IQ profile has been consistently found across studies, implying that a depressed verbal IQ relative to performance IQ score may be a marker of autism. 
Other studies also show that verbal IQ vs. performance IQ discrepancies in children could be used as an indication of a learning disability (e.g., Hyman et al., 2006). As such, recruiting a low-language ability TD group as controls for the ASD-LL children in the present study would still leave the group-matching issue unresolved, as such a sample could include unidentifiable proportions of children with autistic traits or learning disabilities (D'Angiulli and Siegel, 2003). Starting from the robust verbal IQ vs. performance IQ discrepancy in the ASD-LL group, we assume that the patterns found in their narrative data, such as the impaired ability to construe the matrix and embedded oti-(that)/pu-(that-factive) clauses as a single event, must be specific to this population rather than simply an outcome of processing constraints that may be generalizable to TD children.
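The logic of testing a single child's verbal vs. performance IQ discrepancy against a control sample can be illustrated with a deliberately simplified stand-in. The sketch below is not the dissociation test implemented in DISSOCS; it applies the modified t-test for comparing one case with a small control sample (Crawford and Howell, 1998) to verbal-minus-performance difference scores, and it returns only the t statistic and degrees of freedom, leaving the p-value to a t-table or a statistics package. The function name and the example values are hypothetical.

```python
import math
from statistics import mean, stdev


def crawford_howell_t(case_value: float, control_values: list) -> tuple:
    """Modified t statistic comparing one case with a control sample.

    t = (case - control mean) / (control SD * sqrt((n + 1) / n)),
    with n - 1 degrees of freedom, where n is the control sample size.
    """
    n = len(control_values)
    m = mean(control_values)
    s = stdev(control_values)  # sample SD (n - 1 in the denominator)
    t = (case_value - m) / (s * math.sqrt((n + 1) / n))
    return t, n - 1


# Hypothetical verbal-minus-performance IQ differences
td_differences = [0, 2, 4, 6, 8]   # control (TD) children
case_difference = -20              # one ASD-LL child
t_stat, df = crawford_howell_t(case_difference, td_differences)
```

A strongly negative t statistic here would indicate that the child's verbal IQ lags behind performance IQ by more than is typical in the control sample.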

Summing up, the compensatory effect of language skills in autism has been found to be restricted to one measure of macrostructure that evaluates the use of +ToM-related ISTs, but not the other which refers to story structure complexity. The lack of an effect of language ability on the production of modifier clauses in the ASD groups suggests that both story structure complexity and modifier clauses heavily depend on the contribution of pragmatics in the syntax-pragmatics interface. As such, the compensatory role of language ability in children with ASD was confined to the microstructural measures of syntactic complexity and the production of oti-(that)/pu-(that-factive) clauses, both being more dependent on lexical and grammatical knowledge than discourse or pragmatics. We could then conclude that high language ability in autism need not always lead to improvement in narrative performance: pragmatic limitations cannot always be overridden by good language skills at least insofar as performance in a highly contextualized task like narrative production is concerned.

# CONCLUSIONS

Taken together, the results of the present study support two conclusions about the relationship between language ability and narrative performance in children with ASD. First, higher language skills enhance the syntactic complexity of narration in autism as evinced by ASD-HL children's higher use of coordinate, as well as oti-(that)/pu-(that-factive) complement clauses, relative to their ASD-LL peers. Second, high language ability was found to boost ASD children's use of ±ToM-related ISTs, with the benefit being greater for +ToM-related terms. On the other hand, in line with a number of findings reporting the universal impairment of pragmatic language in ASD, our results on modifier clauses and story structure complexity show that both high- and low language ability children with ASD performed equally low. This implies that the compensatory role of language ability in autism may not be operative in the production of subordinate clauses shaped by contextual and discourse considerations, in the use of ISTs not offering mentalistic insight into others' behavior (i.e., –ToM-related ISTs), nor in pragmatically high-demanding contexts, such as the encoding of story structure complexity. As such, the present study has suggested that higher language skills in autism are associated with a sub-set of syntactic and pragmatic competencies, a finding that would have gone unnoticed if the two groups within this normal linguistic range had been treated as a single group. Further crosslinguistic investigations of the narratives of children with ASD of various ages and language ability levels may shed more light on the extent to which their narrative performance is mediated by language ability factors. Also, future studies should recruit larger numbers of participants with ASD, so as to allow for comparisons across subgroups of more substantial sizes.

# ETHICS STATEMENT

This study was carried out in accordance with the recommendations of the "Greek Institute of Educational Policy" with written informed consent from the children's parents. The parents of all the children that participated in the study gave written informed consent in accordance with the Declaration of Helsinki. The protocol was approved by the "Greek Institute of Educational Policy" within the context of the THALES research project (2012–2015) carried out by the Department of English Studies, Aristotle University of Thessaloniki, Greece (among other Departments that participated in the THALES project).

# AUTHOR CONTRIBUTIONS

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

# ACKNOWLEDGMENTS

Warmest thanks go to the children that have participated in the study and to their families, as well as to the child psychiatrists and school teachers for assisting with the recruitment of the children with ASD.

# SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpsyg.2017.02027/full#supplementary-material

# REFERENCES

Checklist in a clinical setting. Dev. Med. Child Neurol. 43, 809–818. doi: 10.1017/S0012162201001475


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Peristeri, Andreou and Tsimpli. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Grammatical Language Impairment in Autism Spectrum Disorder: Exploring Language Phenotypes Beyond Standardized Testing

Kacie Wittke<sup>1</sup>, Ann M. Mastergeorge<sup>2</sup>, Sally Ozonoff<sup>3</sup>, Sally J. Rogers<sup>3</sup> and Letitia R. Naigles<sup>4</sup>\*

<sup>1</sup> Department of Speech, Language, and Hearing Sciences, University of Connecticut, Storrs, CT, USA, <sup>2</sup> Human Development and Family Studies, Texas Tech University, Lubbock, TX, USA, <sup>3</sup> UC Davis MIND Institute, Sacramento, CA, USA, <sup>4</sup> Department of Psychology, University of Connecticut, Storrs, CT, USA

Linguistic and cognitive abilities manifest huge heterogeneity in children with autism spectrum disorder (ASD). Some children present with commensurate language and cognitive abilities, while others show more variable patterns of development. Using spontaneous language samples, we investigate the presence and extent of grammatical language impairment in a heterogeneous sample of children with ASD. Findings from our sample suggest that children with ASD can be categorized into three meaningful subgroups: those with normal language, those with marked difficulty in grammatical production but relatively intact vocabulary, and those with more globally low language abilities. These findings support the use of sensitive assessment measures to evaluate language in autism, as well as the utility of within-disorder comparisons, in order to comprehensively define the various cognitive and linguistic phenotypes in this heterogeneous disorder.

### Edited by:

Stephanie Durrleman, Université de Genève, Switzerland

### Reviewed by:

Helen Tager-Flusberg, Boston University, USA; Wolfram Hinzen, University of Barcelona, Spain

### \*Correspondence:

Letitia R. Naigles letitia.naigles@uconn.edu

### Specialty section:

This article was submitted to Language Sciences, a section of the journal Frontiers in Psychology

Received: 17 November 2016; Accepted: 23 March 2017; Published: 18 April 2017

### Citation:

Wittke K, Mastergeorge AM, Ozonoff S, Rogers SJ and Naigles LR (2017) Grammatical Language Impairment in Autism Spectrum Disorder: Exploring Language Phenotypes Beyond Standardized Testing. Front. Psychol. 8:532. doi: 10.3389/fpsyg.2017.00532

Keywords: autism spectrum disorder, language impairment, grammar, language samples

# INTRODUCTION

According to the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5), the criteria for a diagnosis of autism spectrum disorder (ASD) include persistent deficits in social communication and interaction, as well as restricted and repetitive behaviors or interests (American Psychiatric Association, 2013). Children who meet criteria for ASD may have an accompanying language impairment, but this is not required for making the diagnosis. As ASD exists on a continuum, there is significant heterogeneity in the phenotypic presentation of individuals with this disorder, ranging from mild to more severe impairments. This range of abilities is also seen in the language skills of children with ASD: some present with intact language skills, while others develop little or no language (Tager-Flusberg, 2004). Moreover, within the set of children who do acquire language, pragmatic skills have been found to be consistently poor whereas grammatical abilities can vary widely, even in high-functioning individuals with autism. Some children present with grammar in the average range (Kjelgaard and Tager-Flusberg, 2001; Tek et al., 2014), while others have notable difficulties with grammar (Roberts et al., 2004; Eigsti et al., 2007; Tek et al., 2014; Durrleman and Delage, 2016; Modyanova et al., 2017). Several researchers have even suggested that a subset of children with ASD meet criteria for a co-morbid specific language impairment (SLI), as they appear to have primary difficulties with grammar despite normal cognitive abilities. However, this proposition has continued to be controversial (e.g., Williams et al., 2008; Riches et al., 2010; Tuller et al., 2017), and requires extensive language testing from both populations to be established. Our goals in this paper focus instead on fleshing out the language heterogeneity within the ASD population; in particular, discovering the extent to which meaningful linguistic subgroups emerge when grammatical usage is scrutinized in detail. Also novel to our investigation is the borrowing of spontaneous language measures from the SLI literature for lexical and grammatical profiling, and the inclusion of a relatively large (n = 82) sample of 5-year-old children with ASD.

# Language in Autism: A Focus on Grammar

Tager-Flusberg et al. (2005) review the range of linguistic abilities in children across the autism spectrum, making two major distinctions. First, some children with ASD fail to acquire spoken language skills beyond a basic or minimal level, which may range from no spoken words to fewer than 20–30 words (Kasari et al., 2013); about 30% of children with autism fall into this group (Tager-Flusberg and Kasari, 2013). Second, within the group of children who are verbal, some present with normal language while others have a notable language deficit, including difficulties with the understanding and use of grammar (Tager-Flusberg and Joseph, 2003; Norbury, 2017). In the literature, these latter two groups are often distinguished with the terminology autism language normal (ALN) and autism with language impairment (ALI).

A number of grammatical areas have been found to be problematic when children with ALI are compared to typically developing (TD) peers. When probed during an elicited production task, children with ALI produce significantly fewer markers of past tense -ed and third person singular -s (Roberts et al., 2004). Similar tense and agreement omissions have been documented in other studies using spontaneous language samples (Bartolucci et al., 1980; Tager-Flusberg, 1989). Eigsti et al. (2007) found that 5-year-old children with ASD, who were matched to younger TD children on vocabulary and nonverbal IQ, exhibited considerably less complex language than the younger TD group, producing fewer past tense markers as well as fewer Wh-questions. Grammatical errors are also seen in pronoun use. While pronoun reversals (e.g., "you" for "I") are much less prominent than once thought, they are produced more frequently by preschoolers with ASD than TD peers (Naigles et al., 2016). Distinguishing personal and reflexive pronouns has also been found to be challenging for children with ALI (Perovic et al., 2013); moreover, French-speaking children with ALI demonstrate notable difficulty with pronominal clitics (Durrleman and Delage, 2016; Tuller et al., 2017). Findings such as these suggest that grammatical challenges involving language production do arise in ASD; however, there are a number of unresolved questions. First, to what extent is this grammatical impairment independent of the child's non-verbal IQ and/or vocabulary level? While some baseline level of non-verbal IQ seems required for children to achieve phrase speech at all (Anderson et al., 2007; Wodka et al., 2013; Tek et al., 2014), the demonstration that impaired grammatical knowledge exists in children whose non-verbal cognition is within normal limits suggests that the acquisition of grammar depends, at least somewhat, on factors external to general cognition (e.g., Lewis and Landau, 2015; Valian, 2015). This is seen most notably in SLI, but it is less well defined in ASD, which leads to the second question: how pervasively across the autism spectrum do these grammatical impairments arise, and do the same types of impairments recur?

Addressing the first question naturally leads to another population of children with language disorders; namely, those with SLI. Definitionally, children with SLI present with language impairment despite having non-verbal IQ within the normal range and no other developmental or neurological disorder (Leonard, 2014). While the definition of "normal range" cognition varies between SLI researchers, with some considering it synonymous with average range performance and others considering it as scoring above the intellectual disability range (see Gallinat and Spaulding, 2014 for review), the language deficits in SLI are better defined. Hallmark language characteristics associated with SLI include particular difficulty with grammatical morphology, such as tense and agreement markers (Rice et al., 1995), as well as pronoun errors when marking case, gender, and number (Van der Lely and Stollwerck, 1997; Moore, 2001). Although research by Sheng and McGregor (2010) and McGregor et al. (2013) has found that children with SLI show qualitative differences in their vocabulary knowledge and speed of word learning, overall, children with SLI perform well on tests of vocabulary (e.g., Spaulding et al., 2013). Thus, morphosyntactic errors are typically most notable whereas vocabulary is a relative strength. In the current paper, we borrow some measures from SLI research to further investigate the heterogeneity of grammatical impairment in ASD.

Research addressing the second question has primarily attempted to subgroup children with ASD based on their language abilities. However, not all research in this domain has focused specifically on grammatical language abilities; rather, language has been explored more broadly. For example, Kjelgaard and Tager-Flusberg (2001) found that several distinct language phenotypes of autism emerged from their sample of 89 children. Assessing performance on a variety of standardized language measures, they found three language subgroups. Children in their normal and impaired language groups had commensurately high and low non-verbal and language abilities, respectively. Yet, children in the borderline group reportedly resembled children with SLI, as they had normal non-verbal IQ scores but language below average (Kjelgaard and Tager-Flusberg, 2001). Their findings were limited to certain aspects of language, as the majority of testing focused on vocabulary and none of the measures they used contained detailed indices of grammar. Tager-Flusberg and Joseph (2003) reported similar findings from two samples of school-age children. The first sample showed that children with borderline and impaired language manifested grammatical impairments disproportionately more severe than lexical ones, whereas their second sample showed that some children with ASD have verbal scores lower than non-verbal ones. However, no detailed grammatical measures were provided for those participants either. Other researchers have attempted to subgroup children with ASD based on language abilities (e.g., Anderson et al., 2007; Rapin et al., 2009), but until recently the measures were drawn from standardized tests, which did not enable detailed analysis of grammatical abilities (for more recent research see Durrleman and Delage, 2016; Modyanova et al., 2017; Tuller et al., 2017). Thus, research to date suggests that there may be multiple subgroups within the category of ALI, as demonstrated by Kjelgaard and Tager-Flusberg (2001) and Tager-Flusberg and Joseph (2003), but it is unclear how lexical and grammatical abilities might differ among those subgroups.

In sum, research using standardized testing has shown that some children with ASD have language scores (probably also including grammar) that are on par with those of their TD age-mates; hence, with both language and non-verbal cognitive abilities high/intact, they are referred to as ALN. Researchers and clinicians agree, as well, on the existence of children with ASD whose language levels are minimal to null, and whose cognitive scores are correspondingly low: those who are minimally or non-verbal (NV). What is not yet clear are the characteristics of the children whose abilities fall between these two ends of the spectrum. Research has shown that these children present with weaknesses in their grammatical production skills; however, there may yet be different subgroups within this range. We suspect there may be at least two different subgroups in ALI, including those with normal non-verbal IQ but impaired language, as well as those whose language and cognitive scores are below their age level. How prevalent these groups might be, and to what extent their grammatical and lexical production is similar to and different from each other, is poorly defined. It is also unclear how these groups compare to those with ALN on measures of both lexical and grammatical development.

No research thus far has compared these possible subgroups on grammar in any detail, especially because of the reliance on standardized language testing in past studies. As research on SLI has demonstrated, standardized tests are not necessarily sensitive to the types of grammatical deficits typically seen in children with SLI (Greenslade et al., 2009); thus, they may also not be sensitive to grammatical deficits in ASD, especially for examining possible subgroups within ALI. Spontaneous language samples, a methodology that is particularly sensitive to the expressive language deficits in SLI (e.g., Hewitt et al., 2005; Rice et al., 2010), could be an ideal way to capture the range of grammatical abilities in ASD.

# Spontaneous Language Samples: Examining Heterogeneity of Grammar in Children with Language Impairment

While most research to date exploring SLI in ASD has focused on comparing these two disorders, our focus is on how the SLI literature can guide the examination of grammar in ASD. In SLI, spontaneous language samples have been used to examine many features of children's language, and these analyses have revealed notable differences in grammatical production skills between children with SLI and their TD peers. We propose that these SLI-relevant language variables, as described below, should be considered in exploring the grammatical characteristics of language subgroups in ASD.

Children with SLI have been found to produce more grammatical errors overall starting at a young age. For example, Eisenberg and Guo (2013) calculated the frequency of grammatical vs. ungrammatical utterances in a sample of 3-year-old children with SLI and found that, on average, 62% of their utterances were ungrammatical. This is in contrast to their TD peers, only 29% of whose utterances were ungrammatical (Eisenberg et al., 2012). Dunn et al. (1996) also found group differences in total grammatical errors between 4-year-old children with SLI and their TD peers; the mean percentage of ungrammatical language in the SLI group was 23.56% of total utterances compared to 10.97% in the TD group. While there are no norms for children's percentage of grammatical errors across language development, nor is there currently a clinically meaningful cut-off for frequency of grammatical errors in a clinical population like SLI, these studies demonstrate that children with SLI produce far more ungrammatical utterances than their peers. One notable observation from these two studies is that while errors become less frequent across both groups from ages 3 to 4, 4-year-old children with SLI (Dunn et al., 1996) seem to produce errors at frequency rates similar to TD 3-year-olds (Eisenberg et al., 2012). While cross-study comparisons should be made cautiously given the small sample sizes of these studies, it could be predicted that 5-year-old children with SLI might have grammatical errors at a frequency rate similar to TD 4-year-olds, i.e., somewhere around 10%.

Examination of grammar in SLI has also shown that there are specific markers that are particularly sensitive diagnostic indicators of language impairment, particularly in the preschool and Kindergarten years (e.g., Rice et al., 1995; Eisenberg and Guo, 2013). For instance, Bedore and Leonard (1998) analyzed language samples from both SLI and TD preschool-aged children between the ages of 3 and 5, and found that accuracy with noun morphology (i.e., possessive -s, plural -s, and articles a/the), verb morphology (i.e., regular past tense -ed, third person singular -s, and copula and auxiliary be), and MLU maximized the sensitivity for discriminating between the two groups. In addition, grammatical morphology in SLI has also been explored in domains that are well-defined in typical development, such as Brown's 14 grammatical morphemes (described in order of emergence): present progressive -ing, prepositions in/on, plural -s, irregular past tense, possessive -s, uncontractible copula, articles a/the, past tense -ed, third person singular -s, third person irregular, uncontractible auxiliary, contractible copula, and contractible auxiliary (Brown, 1973). While TD children master (i.e., produce 90% of the time in obligatory contexts) these morphemes in a relatively stable order between the ages of 2 and 5 (De Villiers and De Villiers, 1973), children with SLI are slower to reach mastery of correct usage of these forms in spontaneous language, and either omit them or use them incorrectly for a protracted period of time (Steckol and Leonard, 1979; Paul and Alforde, 1993). There is mixed evidence for their order of emergence in ASD (e.g., Bartolucci et al., 1980; Tek et al., 2014), so it is also unclear whether children with ASD might be slower to reach mastery of these grammatical morphemes.

Overall, these research studies in SLI show that characteristics of language impairment are identifiable from a young age, and that spontaneous language samples are a particularly useful methodology for examining group differences in grammatical abilities. As demonstrated by the studies just reviewed, frequency of overall errors as well as specific types of errors are useful in comparing children with SLI to their TD peers. Thus, we conjecture that these will also be illuminating for distinguishing amongst a heterogeneous group of children with ASD.

# Current Study

The current study examines variability in language abilities in a relatively large sample of children with ASD. Using a within-disorder approach, we highlight the characteristics of grammatical language impairment in ASD, as well as explore the potential cognitive and linguistic subgroups that exist in this sample. Using guidance from the SLI literature, spontaneous language samples will be used to categorize participants into relevant language sub-groups based on grammatical production abilities. While researchers have sub-grouped children with ASD using standardized language assessments (e.g., Kjelgaard and Tager-Flusberg, 2001; Tager-Flusberg and Joseph, 2003; Anderson et al., 2007; Rapin et al., 2009), the current study is one of the first to use language samples to more precisely capture the variability in grammatical skills in a heterogeneous sample of children with this disorder. Specifically, we will use total frequency of grammatical errors as a criterion for group membership, and propose a cut-off of 10% total grammatical errors for placing children in a grammatical language impairment subgroup. The studies that provided group means for frequency of total grammatical errors in SLI did so for children in the preschool age range, which is younger than the average age of children in the current study (M<sub>age</sub> = 5 years, 9 months); however, based on evidence from younger TD children in these studies (Dunn et al., 1996; Eisenberg et al., 2012), we propose that 10% utterances with grammatical errors is a potentially meaningful cut-off for Kindergarten-aged children with language impairment.

Once children are classified based on total grammatical errors, we then explore group differences in the types of grammatical errors, including noun, verb, and pronoun morphology, as well as accuracy with Brown's grammatical morphemes. These analyses address two main questions. First, is there a subgroup of 5-year-old children with ASD who have a primary impairment in grammatical language, similar to the profile of SLI, with nonverbal IQ in the normal range but frequent grammatical errors? And if so, do these children have particular difficulties with verb, noun, and pronoun morphology, as well as with Brown's 14 grammatical morphemes? Second, how does this subgroup compare to verbal children with other patterns of language and cognitive abilities?

We predict that there will be a sub-group of children who present with normal non-verbal IQ but marked difficulties with grammatical morphology. It is unclear how many children in the group will meet this criterion, as there are no documented prevalence rates for a subgroup like this in children with ASD. In addition, based on previous studies of language sub-groups in ASD, it is expected that there will also be two other language sub-groups that emerge: one, children with normal language, and two, children who also have language impairment but show a broader range of deficits, including smaller vocabularies and more atypical language. The degree to which the grammar of this latter group shows the same profile—albeit possibly in more severe form—as the grammatical impairment group is heretofore undocumented, and so will be examined for the first time in this study.

# MATERIALS AND METHODS

# Participants

Participants for the current study were taken from the larger Autism Phenome Project (APP; total N = 189), a longitudinal project conducted at the University of California-Davis, MIND (Medical Investigation of Neurodevelopmental Disorders) Institute studying the neurobiological, genetic, and behavioral features of a large sample of children with autism. The APP recruits participants throughout northern California, with exclusionary criteria only for diagnosis, age, and language exposure (i.e., restricted to children primarily exposed to English and/or Spanish). Children were first enrolled in the APP when they were about 3 years old (Year 1), and then a subset (n = 98) were seen again for behavioral testing about 2 years later when they were approximately 5 years old (Year 3). Language abilities at Year 3 are the focus of interest for the current study, as literature on SLI suggests that grammatical language impairment can be reliably diagnosed by this age (Plante and Vance, 1994). Recordings were not available for 16 of these participants due to video recording errors (e.g., all or most of the session was not taped, or the recording file was corrupted), leaving a final sample of 82 participants for the current study.

The APP participants completed extensive behavioral testing, including some language assessments, as part of their participation in the project. This included the Autism Diagnostic Observation Schedule (ADOS; Lord et al., 1999), for confirmation of autism status; the Differential Ability Scale, Second Edition (DAS-II; Elliott, 2007), to obtain a non-verbal IQ score; and the Peabody Picture Vocabulary Test, Third Edition (PPVT-3; Dunn and Dunn, 1997) and Expressive One-Word Picture Vocabulary Test, Third Edition (EOWPVT-3; Brownell, 2000), to assess both receptive and expressive vocabulary abilities. The children were placed into three groups based on their language and non-verbal IQ testing (see **Table 1**): (1) High Verbal (n = 38), children who scored in the normal range (standard scores above 85) on both non-verbal and vocabulary testing; (2) Low Verbal (n = 11), children whose non-verbal IQ was below 85 and whose standardized vocabulary scores were commensurate with their non-verbal IQ; and (3) Minimally Verbal (n = 33), children whose non-verbal IQ and vocabulary performance were significantly below average (i.e., standard scores below 70).

TABLE 1 | Original groups based on standardized test scores.


NVIQ = standard score on Differential Ability Scale, Second Edition (DAS-II); ADOS = Autism Diagnostic Observation Schedule; DAS Verbal = T-score on DAS-II; PPVT-3 = standard score on Peabody Picture Vocabulary Test, Third Edition; EOWPVT-3 = standard score on Expressive One-Word Picture Vocabulary Test, Third Edition.

<sup>a</sup>Only 14 participants in the Minimally Verbal group were able to participate in the DAS-II testing. The remainder of this group completed the Mullen Scale of Early Learning (MSEL; Mullen, 1995) at Year 3, and their mean group T-score on this measure was 20 (SD = 0), indicating floor-level performance for those children who completed the MSEL.

<sup>b</sup>This reflects the group mean for only the 14 participants in this group who participated in the DAS-II.

Prior to transcribing the language samples, the 82 recordings were screened to identify which participants were appropriate for the project. Two participants were excluded, one who did not meet criteria for autism at Year 3 and another with significantly reduced intelligibility that was likely due to co-morbid childhood apraxia of speech. In addition, 29 children did not produce sufficient language to transcribe, which was determined based on language production during the Free Play portion of the ADOS. This was not surprising as they were all from the Minimally Verbal group; however, two participants from that group did produce some spontaneous language [N (utterances) = 33 and 124] and they were included in the final sample. This left 51 remaining participants for the language transcriptions (M<sub>age</sub> = 68.84 months, SD = 12.77).
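The participant flow described above can be checked with a few lines of arithmetic. The following is only an illustrative sketch using the counts reported in the text; the variable names are hypothetical and are not part of the study materials.

```python
# Participant flow, using only the counts reported in the text.
year3_tested = 98            # APP subset seen for behavioral testing at Year 3
missing_recordings = 16      # recordings lost to taping or file errors
screened = year3_tested - missing_recordings

excluded_at_screening = 2    # one child no longer met autism criteria,
                             # one had markedly reduced intelligibility
insufficient_language = 29   # too little language to transcribe
transcribed = screened - excluded_at_screening - insufficient_language

# Original groups based on standardized testing (Table 1).
original_groups = {"High Verbal": 38, "Low Verbal": 11, "Minimally Verbal": 33}

print(screened, transcribed)  # 82 51
```

The three original group sizes sum to the 82 screened recordings, and the 51 transcribed samples are the ones carried forward to coding.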

# Transcriptions

Language samples were collected from the ADOS, which provides an opportunity for investigating detailed and comprehensive grammatical profiles, as many of the tasks aim to encourage language production without imposing so much structure that naturalness is reduced (Tager-Flusberg et al., 2009). Furthermore, specific tasks on the ADOS were chosen that afforded the most spontaneous and unprompted language production. Although the exact tasks varied slightly depending on the ADOS Module that the child was administered, the following tasks were included in the language sample transcriptions: Free Play, Birthday Party, Bubble Play, Snack, Make-Believe Play, Conversation, Description of a Picture, Telling a Story from a Book, Cartoons, and Creating a Story.

The language samples were transcribed by the first author and a research assistant trained in CHAT format using Computerized Language Analysis (CLAN) software (MacWhinney, 2008). Each language sample was transcribed verbatim by one of the transcribers, who viewed each recording multiple times until the entire sample was transcribed. Utterances or portions of utterances that could not be fully transcribed after three viewings were marked as unintelligible according to CHAT coding conventions. A consensus procedure to check reliability of transcripts was used, similar to that described by Shriberg et al. (1984). That is, the transcribers viewed each other's video recordings while reading the initial transcription to check for errors or discrepancies. Discrepancies were discussed between transcribers until agreement was achieved. On the rare occasions when agreement could not be achieved, those utterances or portions of utterances were marked as unintelligible. This consensus procedure was followed for all 51 transcripts.

# Coding

CLAN conventions were deployed to perform morphological analysis on the transcripts, as well as to mark syntactic errors and extract word type and token variables for all parts of speech. Lexical measures included both types (number of unique words) and tokens (total numbers of words) for nouns, verbs, adjectives, adverbs, pronouns, and prepositions. Grammatical errors were marked within utterances to capture specific grammatical error types, including tense-agreement errors (omissions and usage errors for copula, auxiliary, bound tense markers, present progressive -ing, irregular past, and third person verb forms); pronominal form errors (such as person, case, and gender pronoun errors); and noun morphology errors (including plural -s, possessive -s, and articles a/the). Error counts were then converted into percentages because of the wide variability in transcript/utterance length across participants, ranging from as few as 24 utterances to as many as 247 utterances (M = 125.62 utterances, SD = 55.94). Verb tense-agreement error types were collapsed to form the Percentage of Verb Errors (PVE), calculated by dividing the number of tense-agreement errors described above by the participant's total number of utterances. Noun morphology error types were also collapsed to form the Percentage of Noun Errors (PNE), calculated by dividing the number of noun errors described above by the total number of utterances. Finally, Percentage of Pronoun Errors (PPE) was calculated to capture the pronominal form errors as a function of the total number of utterances.
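To make the error-percentage definitions concrete, here is a minimal sketch of how PVE, PNE, and PPE could be derived from per-child error counts. It is not the authors' analysis code: the `ErrorCounts` container, the function names, and the example counts are all hypothetical, and the only thing taken from the text is that each measure divides an error count by the child's total number of utterances.

```python
from dataclasses import dataclass


@dataclass
class ErrorCounts:
    """Hypothetical per-child tallies extracted from a coded transcript."""
    total_utterances: int
    verb_errors: int      # tense-agreement errors (omissions and usage errors)
    noun_errors: int      # plural -s, possessive -s, article errors
    pronoun_errors: int   # person, case, and gender pronoun errors


def percent_of_utterances(count: int, total_utterances: int) -> float:
    """Express an error count as a percentage of total utterances."""
    if total_utterances <= 0:
        raise ValueError("total_utterances must be positive")
    return 100.0 * count / total_utterances


def error_percentages(counts: ErrorCounts) -> dict:
    """Return PVE, PNE, and PPE for one child."""
    n = counts.total_utterances
    return {
        "PVE": percent_of_utterances(counts.verb_errors, n),
        "PNE": percent_of_utterances(counts.noun_errors, n),
        "PPE": percent_of_utterances(counts.pronoun_errors, n),
    }


# Invented counts for illustration only (not data from the study).
example = ErrorCounts(total_utterances=125, verb_errors=10,
                      noun_errors=5, pronoun_errors=2)
print(error_percentages(example))
```

Normalizing by total utterances is what allows a 24-utterance sample and a 247-utterance sample to be compared on the same scale.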

Each utterance was also coded for jargon, echolalia, and grammaticality. Echolalia is the repetition, with similar intonation, of words or phrases that someone else has said; it can be immediate (produced right after the original utterance) or delayed (a repetition of something heard in the past) (Tager-Flusberg et al., 2005). Jargon was coded if the child used strings of non-meaningful speech with odd intonation. Utterances containing echolalia or jargon were not coded for grammaticality. Ungrammaticality was coded if the utterance contained one or more of the grammatical marker errors described above, a word ordering error, or any other syntactic error that could not be assigned to the other categories. These codes were also calculated as proportions based on total utterances in each child's sample, yielding variables of Percentage of Echolalic Utterances (PEU), Percentage of Jargon Utterances (PJU), and Percentage of Ungrammatical Utterances (PUU). While we acknowledge that dialectal variation might affect our analyses, the first author, who is a trained clinician, did not observe any specific dialectal differences amongst the participants.
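The utterance-level measures can be sketched in the same way. The example below assumes, for illustration, that every utterance receives exactly one of four hypothetical labels; this follows from the statement that echolalic and jargon utterances were not coded for grammaticality, but the label names and the sample data are invented.

```python
from collections import Counter

# Hypothetical label set: one code per utterance.
CODES = ("grammatical", "ungrammatical", "echolalia", "jargon")


def utterance_percentages(codes: list) -> dict:
    """Return PUU, PEU, and PJU as percentages of all utterances."""
    total = len(codes)
    if total == 0:
        raise ValueError("at least one coded utterance is required")
    counts = Counter(codes)
    unknown = set(counts) - set(CODES)
    if unknown:
        raise ValueError(f"unrecognized codes: {sorted(unknown)}")
    # The denominator is the full sample, including echolalic and jargon
    # utterances, because the text defines each measure over total utterances.
    return {
        "PUU": 100.0 * counts["ungrammatical"] / total,
        "PEU": 100.0 * counts["echolalia"] / total,
        "PJU": 100.0 * counts["jargon"] / total,
    }


# Invented 20-utterance sample for illustration only.
sample = (["grammatical"] * 16 + ["ungrammatical"] * 2
          + ["echolalia"] + ["jargon"])
print(utterance_percentages(sample))
```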

Finally, children's productions of each of Brown's 14 grammatical morphemes were examined to calculate the frequency of use, or total tokens, as well as the correct usage of those morphemes. Brown (1973) calculated accuracy in obligatory contexts as the percentage of correct usages over the total number of contexts in which an adult would be expected to use the grammatical morpheme. The procedure described by Park et al. (2012) was used in this study. Tokens were counted by hand by examining each transcript for each occurrence of all 14 morphemes. Contexts in which each morpheme should have been used were also examined; accuracy was calculated as the percentage of correct tokens over total obligatory contexts. For example, when a child in the sample said "and they hungry," this was considered an omission of the contractible copula form "are" in an obligatory context, as the child should have said "and they're hungry." When that same child said "but they're sad," this was counted as a correct form of contractible copula in an obligatory context. Similarly, when another child in the sample said "it's hat" when meaning "it's a hat," this was counted as an omission of the article a in an obligatory context; the utterance "the cat is happy" provided a correct token of the article the in an obligatory context. This procedure, examining correct tokens as well as omissions in contexts when an adult would have used the morpheme correctly, was repeated for all 14 of Brown's grammatical morphemes, yielding a percentage of accuracy in obligatory contexts for each morpheme.
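As a worked illustration of accuracy in obligatory contexts, the sketch below applies the ratio of correct tokens to obligatory contexts to the two contractible copula utterances quoted above. The function name and the decision to return `None` when a morpheme has no obligatory contexts are assumptions made for this example; the 90% value is the mastery criterion cited earlier in the text.

```python
from typing import Optional

MASTERY_CRITERION = 90.0  # percent correct in obligatory contexts


def obligatory_context_accuracy(correct_tokens: int,
                                obligatory_contexts: int) -> Optional[float]:
    """Percentage of obligatory contexts in which the morpheme was correct.

    Returns None when the sample contains no obligatory context, since
    accuracy is undefined in that case (an assumption of this sketch).
    """
    if obligatory_contexts == 0:
        return None
    if not 0 <= correct_tokens <= obligatory_contexts:
        raise ValueError("correct_tokens must lie between 0 and obligatory_contexts")
    return 100.0 * correct_tokens / obligatory_contexts


# Contractible copula, counting only the two quoted utterances:
#   "and they hungry"  -> omission in an obligatory context
#   "but they're sad"  -> correct token in an obligatory context
copula_accuracy = obligatory_context_accuracy(correct_tokens=1,
                                              obligatory_contexts=2)
print(copula_accuracy)                       # 50.0
print(copula_accuracy >= MASTERY_CRITERION)  # False
```

These two utterances alone say nothing about that child's overall accuracy; a real calculation would pool every obligatory context in the transcript.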

# Sub-grouping

The 51 verbal participants were assigned to one of three language sub-groups. The two children from the Minimally Verbal group who produced enough language to transcribe were combined with the children in the Low Verbal group to form the "Language Impaired" (LI; n = 13) group. The children from the High Verbal group were assigned to one of two groups based on their frequency of grammatical errors. Children who produced grammatical errors in more than 10% of their total utterances, as measured by PUU, were placed in the "Grammatical Impairment" group (GI; n = 17). Children who produced grammatical errors in fewer than 10% of their total utterances were placed in the "Language Normal" group (LN; n = 21).
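The assignment rule can be summarized as a small decision function. This is a sketch under stated assumptions rather than the authors' procedure: the function and label strings are hypothetical, and because the text does not say how a PUU of exactly 10% would be handled, the sketch arbitrarily places that boundary case in the LN group.

```python
def assign_subgroup(original_group: str, puu: float,
                    cutoff: float = 10.0) -> str:
    """Map an original test-based group and a PUU value to LI, GI, or LN."""
    if original_group in ("Low Verbal", "Minimally Verbal"):
        # Minimally Verbal children are included only if they produced
        # enough language to transcribe; they join the Low Verbal group.
        return "LI"
    if original_group == "High Verbal":
        # Assumption: exactly 10% is treated as LN (unspecified in the text).
        return "GI" if puu > cutoff else "LN"
    raise ValueError(f"unknown original group: {original_group!r}")


print(assign_subgroup("High Verbal", puu=14.2))  # GI
print(assign_subgroup("High Verbal", puu=3.5))   # LN
print(assign_subgroup("Low Verbal", puu=22.0))   # LI

# Reported group sizes are consistent with this rule.
assert 11 + 2 == 13    # Low Verbal + transcribable Minimally Verbal = LI
assert 17 + 21 == 38   # GI + LN = High Verbal
```

Note that PUU is used only to split the High Verbal group; membership in the LI group is determined by the standardized test scores alone.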

# RESULTS

**Table 2** presents the demographic and standardized test data for the three subgroups (LN, GI, and LI), who did not significantly differ in age, [F(2,48) = 0.06, p = 0.942]. With respect to the standardized tests, the LI Group had significantly higher ADOS scores and significantly lower non-verbal IQ, PPVT-3, and EOWPVT-3 scores than the other two groups (ps < 0.01). As expected given our assignment procedure, the LN and GI groups did not differ on any of these measures.

# Group Comparisons: Lexical Variables

**Table 3** displays the lexical variables for each group. One-way ANOVAs showed that the groups differed significantly on the number of Noun, Verb, Pronoun, Preposition, Adjective, and Adverb types and tokens they produced (Fs > 6.0, ps < 0.01). Post hoc Tukey HSD tests revealed that the LI group produced consistently fewer types and tokens for all parts of speech when compared to both the LN and GI subgroups (ps < 0.01). The LN and GI subgroups did not significantly differ on any of the lexical variables.
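For readers less familiar with the statistic, the following sketch shows how a one-way ANOVA F ratio and its degrees of freedom are obtained. It is a plain-Python illustration with hypothetical function names and made-up scores, not the authors' analysis pipeline, and it does not cover the post hoc tests. The group sizes (13, 17, and 21) come from the Sub-grouping section and account for the F(2,48) reported throughout the Results.

```python
from statistics import mean


def degrees_of_freedom(group_sizes):
    """Between- and within-group degrees of freedom for a one-way ANOVA."""
    k = len(group_sizes)
    n_total = sum(group_sizes)
    return k - 1, n_total - k


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    grand_mean = mean([x for g in groups for x in g])
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = degrees_of_freedom([len(g) for g in groups])
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Group sizes reported in the Sub-grouping section: LI = 13, GI = 17, LN = 21.
print(degrees_of_freedom([13, 17, 21]))  # (2, 48)

# Made-up scores, for illustration only (not data from the study).
toy_groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_b, df_w = one_way_anova(toy_groups)
print(f"F({df_b},{df_w}) = {f_stat:.2f}")  # F(2,6) = 13.00
```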

# Group Comparisons: Grammatical Variables

**Table 4** presents the utterance-level measures for each group. The groups differed significantly in the total number of utterances in the sample [F(2,48) = 8.896, p < 0.001], their mean lengths of utterance [F(2,48) = 6.077, p < 0.01], Percentage of Ungrammatical Utterances [PUU; F(2,48) = 38.52, p < 0.001], Percentage of Echolalic Utterances [PEU; F(2,48) = 8.25, p < 0.001], and Percentage of Jargon Utterances [PJU; F(2,48) = 4.33, p < 0.01]. Pairwise comparisons using Dunnett's T3 test revealed that the LI group produced significantly fewer utterances than both the


Note: NVIQ = standard score on Differential Ability Scale, Second Edition (DAS-II); ADOS = Autism Diagnostic Observation Schedule; DAS Verbal = T-score on DAS-II; PPVT-3 = standard score on Peabody Picture Vocabulary Test, Third Edition; EOWPVT-3 = standard score on Expressive One-Word Picture Vocabulary Test, Third Edition.



TABLE 4 | Group means for utterance level measures for language sample groups.


Note: MLU, mean length of utterance; PJU, percentage jargon utterances; PEU, percentage echolalic utterances; PUU, percentage ungrammatical utterances.

LN and GI subgroups (ps < 0.001), and that their MLUs were significantly smaller (ps < 0.01). The LI group also produced utterances more frequently with jargon and echolalia than the GI group (p < 0.05). Interestingly, while the LI subgroup produced more ungrammatical utterances than the LN group, the GI group produced significantly more ungrammatical utterances than both groups (ps < 0.001).

Next, Brown's 14 grammatical morphemes were compared across groups. Total tokens were compared with a one-way ANOVA, revealing that the three groups differed significantly in their overall frequency of use [F(2,48) = 11.055, p < 0.001]. Post hoc Dunnett's T3 comparisons confirmed that the LI group produced significantly fewer tokens of these morphemes (M = 38.15, SD = 39.13) overall compared to the LN (M = 119.33, SD = 66.09; p = 0.001) and GI (M = 124.24, SD = 50.71; p = 0.001) groups (see Appendix 1 for data and analysis by individual markers).

Correct usage in obligatory contexts of Brown's 14 grammatical morphemes was also considered. One-way ANOVAs found significant group differences in the children's overall percent correct usage in obligatory contexts of these morphemes [F(2,48) = 4.811, p < 0.01]. Brown (1973) considered these morphemes to be mastered when children produced them with 90% accuracy in obligatory utterances, and children in both the LN and LI subgroups reached this threshold when accuracy across all 14 morphemes was collapsed (91.7 and 92.1%, respectively). In contrast, children in the GI group performed below this threshold (81.5% accuracy). Post hoc Dunnett's T3 comparisons revealed that the GI group (M = 81.51, SD = 7.69) produced significantly fewer correct uses of Brown's morphemes in obligatory contexts than both the LN (M = 91.63, SD = 15.24; p = 0.034) and LI subgroup (M = 92.08, SD = 6.29; p = 0.001). See Appendix 2 for analyses by each individual grammatical morpheme.
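The comparison against Brown's criterion amounts to a simple threshold check. The sketch below applies it to the group means reported above; the variable names are hypothetical, and applying the criterion to a group mean collapsed across morphemes follows the description in this paragraph rather than a per-child, per-morpheme analysis.

```python
MASTERY_THRESHOLD = 90.0  # Brown's (1973) criterion, in percent

# Mean percent correct in obligatory contexts, as reported in the text above.
group_mean_accuracy = {"LN": 91.63, "GI": 81.51, "LI": 92.08}

# A group counts as having reached mastery when its mean meets the threshold.
reached_mastery = {
    group: accuracy >= MASTERY_THRESHOLD
    for group, accuracy in group_mean_accuracy.items()
}
print(reached_mastery)  # {'LN': True, 'GI': False, 'LI': True}
```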

The last group comparisons examined grammatical errors, as displayed in **Table 5**. One-way ANOVAs showed that the groups were significantly different in the Percentage of Noun Errors [PNE; F(2,48) = 7.21, p < 0.01], Percentage of Pronoun Errors [PPE; F(2,48) = 6.56, p < 0.01], Percentage of Verb Errors [PVE; F(2,48) = 17.56, p < 0.001], and Percentage of Overgeneralization Errors [POE; F(2,48) = 4.12, p < 0.05]. Post hoc Dunnett's T3 tests revealed that the GI group made a significantly higher percentage of noun errors than the LN (p = 0.01) and LI groups (p = 0.02). In addition, they made a significantly higher percentage of pronoun errors than the LN group (p = 0.01). They also made a significantly higher percentage of verb tense and agreement errors than the other two groups (p = 0.001). Finally, the GI group (M = 1.2%


TABLE 5 | Group means for grammatical errors for language sample groups.

Note: PNE, percentage of noun errors; PPE, percentage pronoun errors; PVE, percentage verb errors; POE, percentage overgeneralization errors.

of total utterances) made significantly more overgeneralization errors than the LI group (M = 0.11%), but the LN group was not significantly different from either group (M = 0.51%). Upon close inspection, the majority of overgeneralization errors occurred with past tense -ed (e.g., "won" as "winned"), and overall these overgeneralization errors were infrequent (range of 0–8 per participant across the entire sample).

# DISCUSSION

The current study performed within-disorder comparisons of language in children with ASD using spontaneous language samples, in order to explore the wide range of linguistic ability in this population as well as to probe for the presence of different linguistic subgroups in ASD. Our findings confirm the utility of using spontaneous language samples to capture both the lexical and grammatical skills of children with autism, but more importantly, they demonstrate that there are multiple meaningful subgroups of children with ASD, which vary based on both linguistic and cognitive abilities. Specifically, we distinguished four main groups based on the language samples that were collected: first, a group of children with autism who remained minimally or non-verbal at 5 years of age and did not have enough language to produce a spontaneous sample, and second, at the other end of the spectrum, children with ASD whose standardized tests and spontaneous language samples indicated non-verbal IQ, vocabulary, and grammar at age-appropriate levels. The two 'middle' groups were the most interesting ones, with one subgroup of children (GI) performing in the normal range on non-verbal IQ and vocabulary testing but showing a pronounced deficit in grammatical skills in their spontaneous language, and another group of children (LI) showing deficits in non-verbal IQ, vocabulary, and grammar, but also some unexpected areas in which their speech was more similar to the LN group than the GI group.

Reviewing the findings by each group in more detail highlights the distinct features of their overall language profiles. Starting with the LN group, these children presented much more similarly to what would be expected from TD 5-year-olds. They made few grammatical morpheme errors for nouns, verbs, and pronouns; also, their accuracy with Brown's 14 grammatical morphemes suggested mastery. Their lexical abilities also presented as intact, as they were producing a variety of word types and tokens. Thus, both grammatical and lexical abilities were judged to be appropriate for their age. This group accounted for about 26.3% of our participants, similar to the rate of children in the sample from Kjelgaard and Tager-Flusberg (2001) for children with ASD who had both normal cognitive and language abilities.

Children in the LI group, who comprised 16.3% of the sample, produced much less speech than their other verbal peers with ASD; they had shorter MLUs, produced fewer tokens of grammatical morphemes, and had significantly smaller lexicons. Moreover, children in the LI group were significantly more likely to use jargon and echolalia in their utterances compared to the other two verbal groups. Atypical language like echolalia appears to be most common in children with poorer expressive language (Tager-Flusberg et al., 2005). Not surprisingly, the LI group presented with significantly lower non-verbal IQ scores than the other two groups; thus, it seems that deficits in nonverbal IQ coincide with language impairments that include smaller lexicons, more atypical language use, and less frequent grammatical marker use. Moreover, both the LI and LN groups presented with language patterns that were mostly commensurate with their non-verbal abilities and autism severity (globally low and globally high, respectively). What is possibly most interesting about the LI group, though, is that their rates of grammatical errors, including noun, verb, and pronoun errors, were comparable to the LN group, as was their accuracy with Brown's 14 grammatical morphemes. That is, their usage of grammatical markers was not frequent, but when it occurred, was mostly correct.

This is one area where our GI subgroup, who comprised about 21.3% of the current sample, differed from the other two groups. While the GI group did not differ from the LN group on many measures from the standardized tests (nonverbal IQ, receptive and expressive vocabulary) and even some from the language samples (lexical frequency, MLU, and atypical language), grammatical errors consistently distinguished the GI group from the other two. That is, the GI group presented with significant weaknesses in their morphosyntactic production, including more frequent verb, noun, and pronoun morphology errors, as well as more overall ungrammatical language. Moreover, while the GI group was more advanced than the LI group on many measures, including manifesting higher non-verbal IQ and vocabulary testing, as well as higher MLUs, larger lexicons, and more frequent usage of grammatical markers, these two groups also differed on grammatical error rates. In fact, the language impairment of the LI group was unlike that of the GI group, in that the former's language impairment

included both low vocabulary and sparse grammatical usage but not frequent grammatical errors, while the latter group's language impairment was specific to grammatical errors. While the exact explanation for these differences is beyond the scope of this sample, it is important to consider that there may not only be grammatical origins to these deficits, but also semantic ones. It is possible that the difficulties that the GI children have with tense markers, for example, are attributable to semantic challenges in distinguishing temporality. However, because Tovar et al. (2015) have documented that 4-year-old children with ASD successfully distinguish ongoing activities from completed actions (i.e., the '-ing'/past distinction) in a comprehension paradigm, we lean toward the interpretation that the challenges of the GI children in this study, for producing morphemes such as tense, are more grammatical than semantic (see also Modyanova et al., 2017). The findings are less clear for the LI group, but certainly further research that probes both semantics and grammar in the same children would be helpful in further distinguishing these two possibilities (Naigles and Tek, 2017).

Our findings align with some of the previous research that has claimed that some children with ASD meet the general criteria for SLI, evidenced by impaired grammatical skills with a relative strength in vocabulary (Kjelgaard and Tager-Flusberg, 2001; Roberts et al., 2004; Durrleman and Zufferey, 2009). In particular, there are a number of similarities between the language presentation of our GI group and that of children with SLI, such as the high rates of grammatical errors. In fact, some children in the GI group produced grammatical errors in as many as 27–28% of their utterances, a finding consistent with other studies that have explored the frequency of grammatical errors in children with SLI (e.g., Dunn et al., 1996; Eisenberg and Guo, 2013). The error types, specifically involving tense-verb agreement, noun markers, and pronouns, are also similar to those observed in children with SLI. Finally, the GI group produced significantly more overgeneralization errors than the LI group, and overgeneralization errors have also been found to be more common in children with SLI than children with commensurately low non-verbal IQ and language (Rice et al., 2004). The very presence of overgeneralization errors in these children with ASD is notable for another reason; namely, that this is the first documentation of overgeneralization errors for this population (cf. Eigsti et al., 2007).

These findings also lend support to theories that suggest that the acquisition of grammar depends, at least somewhat, on factors external to general cognition (e.g., Lewis and Landau, 2015; Valian, 2015; Tuller et al., 2017). That is, while this sample of children with ASD includes two subgroups whose language is generally commensurate with their non-verbal IQ (i.e., the LI and LN groups), it also includes one subgroup whose non-verbal and vocabulary abilities are high, yet whose grammatical abilities are markedly impaired. The processes and knowledge that enable the acquisition of grammar are thus shown to not be simply derived from those of general cognition; instead, they may be comprised of domain-specific configurations and computations (Rice et al., 2004; Naigles and Tek, 2017). The population of SLI has provided one clear example of this domain-specificity, as they have impaired grammar despite normal cognitive abilities (Van der Lely, 2005), and our GI group adds corroborating evidence. While the exact nature of these grammatical errors is not entirely clear and remains an issue for further investigation in ASD, our findings provide support for domain specificity of grammar in another clinical population beyond SLI.

# Limitations

There are limitations to consider about the current findings. Specifically, the classification method used in this study, categorizing children by non-verbal IQ scores and then using total number of ungrammatical utterances, was not ideal for every participant in the study. Four children in the LN group, whose non-verbal IQ was above 85 and who rarely produced ungrammatical language, actually presented with considerably less language than other children in the LN group. They had much smaller MLUs and lexicons, and so based on their language, they actually may have been better suited to fall in the LI subgroup. Such 'outlier children' have also been attested in other studies; for example, Kjelgaard and Tager-Flusberg (2001) reported that about one-quarter of their sample did not fit neatly into language groups based on variable patterns of performance on testing. The presence of these four children in our sample raises the possibility of yet another language subgroup, one with high cognitive skills coupled with low global language; however, caution is warranted because of the small number of children who might fit this profile. And in fact, the occurrence of 'only' four outlier children in our study, relative to other research (Kjelgaard and Tager-Flusberg, 2001), might be taken as further support for the inclusion of detailed language samples when making such categorizations.

Another limitation was the 10% cut-off for ungrammatical utterances employed for categorizing the children with GI from the LN group. As discussed earlier, there are no normative data in typical language development for frequency of grammatical errors; therefore, there is also no universally accepted cutoff for frequency of grammatical errors in language samples for diagnosing SLI. However, based on performance between children with SLI and those who are TD (Dunn et al., 1996; Eisenberg et al., 2012), 10% was judged to be a potentially meaningful cut-off for Kindergarten-aged children, as it aligned with the grammatical rates of younger TD children in one study (Dunn et al., 1996). Certainly, this is an area important to future research so that specific delineations regarding frequency cutoffs for grammatical errors are consistent and congruent across studies.

One final limitation was our inability to describe the expressive language abilities of the Minimally Verbal group, as they constituted a significant proportion of children in our sample (36.3%). Although Kasari et al. (2013) recommend alternative methods like language sampling for children who are minimally verbal, the language samples from the ADOS were not an ideal measure for capturing the abilities in this sub-group. This is because most parents of children in this group reported at least some expressive vocabulary used at home. In addition, relative to parent-child and examiner-child interactions, ADOS interactions have been found to result in fewer total utterances and less complex language for children with ASD (Kover et al., 2014). While the ADOS had to be used for collecting

language samples given the retrospective design of the current study, we acknowledge this may have impacted the amount of language produced by each child, particularly for those in the Minimally Verbal group. Unfortunately, the language sampling technique used in this study only allowed for detailed exploration of expressive grammatical abilities, and did not allow us to further explore possible receptive grammatical similarities and differences in the children with minimal language (but see Naigles and Fein, 2017).

Despite these limitations, the findings from the current study fill a critical gap in the literature that explores both language subgroups in ASD as well as the possibility of a specific grammatical impairment subgroup in this population. This is the first known study using spontaneous language samples to categorize a relatively large and heterogeneous sample of children with ASD based on both grammatical and lexical abilities. Our results suggest that verbal children with ALI diverge into two subgroups: those with a primary deficit in grammatical language but relatively intact vocabulary, and others with sparse production of both lexicon and grammar, but unexpectedly low error rates in grammatical usage as well.

# Future Directions

The current study lends support to a within-group comparison of language abilities using language samples to categorize children with ASD. Next steps with this dataset include exploring early markers of normal language and language impairment in ASD in the children at Year 1 of the APP study. Using the categorization we completed at Year 3, we will examine the language samples collected at Year 1 to discover which group differences were present earlier in development, and which predictors might be found for group membership 2 years later. In addition, APP participants included in this project had brain scans at Years 1 and 3; therefore, examination of potential neurobiological markers may also be explored as possible predictors of language group membership (e.g., Naigles et al., 2017), and as sources of information about the brain structures that underlie developing language skills in ASD.

# ETHICS STATEMENT

This study was carried out in accordance with the recommendations of the Institutional Review Board at the University of Connecticut. Because the research presented no more than minimal risk to human subjects and data was collected from video recordings made for research purposes, it qualified for expedited review and approval per Waiver of Informed Consent (45 CFR 46.116(d)). The protocol was approved by the University of Connecticut IRB.

# AUTHOR CONTRIBUTIONS

AM, SO, and SR designed the original data collection. KW and LN worked together on the questions, design, coding, analyses, and primary write up of the current study. AM, SO, and SR provided input and final approval to this final paper.

# FUNDING

NSF-IGERT to the University of Connecticut; NIMH to the MIND Institute, Autism Phenome Project.

# SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpsyg.2017.00532/full#supplementary-material



atypical development? Int. J. Speech Lang. Pathol. 14, 95–108. doi: 10.3109/17549507.2011.645555


young children with autism spectrum disorders. J. Autism Dev. Disord. 44, 75–89. doi: 10.1007/s10803-013-1853-4


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Wittke, Mastergeorge, Ozonoff, Rogers and Naigles. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Role of Two Types of Syntactic Embedding in Belief Attribution in Adults with or without Asperger Syndrome

Morgane Clémentine Burnel<sup>1,2</sup>\*, Marcela Perrone-Bertolotti<sup>1</sup>, Stephanie Durrleman<sup>3</sup>, Anne C. Reboul<sup>2</sup> and Monica Baciu<sup>1</sup>

<sup>1</sup> Université Grenoble Alpes, CNRS, LPNC UMR 5105, Grenoble, France, <sup>2</sup> Université de Lyon, CNRS, Institute for Cognitive Sciences – Marc Jeannerod (UMR 5304), Bron, France, <sup>3</sup> Department of Psycholinguistics, Faculty of Psychology and Educational Sciences, University of Geneva, Geneva, Switzerland

The role of syntax in belief attribution (BA) is not completely understood in healthy adults and understudied in adults with autism spectrum disorder. Embedded syntax could be useful either for the development of Theory of Mind (ToM) (Emergence account) or more generally over the lifespan (Reasoning account). Two hypotheses have been explored, one suggesting that embedding itself (Relatives and Complement sentences and Metarepresentation account) is important for ToM and another one considering that the embedding of a false proposition into a true one (Complement sentences and Misrepresentation account) is important. The goals of this study were to evaluate (1) the role of syntax in ToM (Emergence vs. Reasoning account), (2) the type of syntax implied in ToM (Metarepresentation vs. Misrepresentation account), and (3) the verbally mediated strategies which compensate for ToM deficits in adults with Asperger Syndrome (AS). Fifty NeuroTypical (NT) adults and 22 adults with AS were involved in a forced-choice task including ±ToM tasks (BA and a control task, physical causation, PC) under four Interference conditions (silence, syllable repetition, relative sentences repetition, and complement sentences repetition). The non-significant ±ToM × Interference interaction effect in the NT group did not support the Reasoning account and thus suggests that syntax is useful only for ToM development (i.e., Emergence account). Results also indicated that repeating complement clauses put NT participants in a dual task whereas repeating relative clauses did not, suggesting that repeating relatives is easier for NT than repeating complements. This could be an argument in favor of the Misrepresentation account. However, this result should be interpreted with caution because our results did not support the Reasoning account. 
Moreover, AS participants (but not NT participants) were more disrupted by ±ToM tasks when asked to repeat complement sentences compared to relative clause sentences. This result is in favor of the Misrepresentation account and indirectly suggests verbally mediated strategies for ToM in AS. To summarize, our results are in favor of the Emergence account in NT and of Reasoning and Misrepresentation accounts in adults with AS. Overall, this suggests that adults with AS use complement syntax to compensate for ToM deficits.

Keywords: Theory of Mind, syntax, emergence, reasoning, metarepresentation, misrepresentation, autism spectrum disorders

### Edited by:

Marcela Pena, Pontifical Catholic University of Chile, Chile

### Reviewed by:

Francesca Marina Bosco, University of Turin, Italy; Jianfeng Yang, Shaanxi Normal University, China

### \*Correspondence:

Morgane Clémentine Burnel, morgane.burnel@univ-grenoble-alpes.fr

### Specialty section:

This article was submitted to Language Sciences, a section of the journal Frontiers in Psychology

Received: 05 December 2016; Accepted: 24 April 2017; Published: 12 May 2017

### Citation:

Burnel MC, Perrone-Bertolotti M, Durrleman S, Reboul AC and Baciu M (2017) Role of Two Types of Syntactic Embedding in Belief Attribution in Adults with or without Asperger Syndrome. Front. Psychol. 8:743. doi: 10.3389/fpsyg.2017.00743

# INTRODUCTION


The concept of Theory of Mind (ToM) was introduced by two primatologists, Premack and Woodruff (1978), and refers to the ability to attribute mental states (i.e., beliefs, thoughts, feelings, desires, emotions, or intentions) to others, in order to predict or explain their behavior. Using ToM, people are able to predict the output of a system (i.e., someone else's behavior) from invisible states (i.e., mental states). Dennett (1978) explained that the only robust way of testing the attribution of mental states is to require people to predict a behavior from a belief or a representation about the world which may contrast with reality, that is to say a False Belief (FB). When evaluating True Beliefs (TB), because the belief is identical to reality, researchers cannot distinguish whether it is the reality state or the belief attribution (BA) (i.e., ToM) that is responsible for success.

A well-known result in the literature is the inability of children under 4 or 5 years to succeed at FB tasks (Yirmiya et al., 1998), indicating that before this age their ToM is not mature enough to allow them to attribute FBs to people and to use these attributions to predict behavior. The ability to pass FB tasks is considered a milestone for ToM, and ToM is mostly studied by means of FB tasks. This task nevertheless entails some limitations given that ToM is not limited to FB and FB is not limited to ToM (Bloom and German, 2000). Although (explicit) FB tasks are successfully performed around the age of 4, (implicit) FB tasks, where children are not specifically required to give an answer, can be performed before 2 years of age (see Onishi and Baillargeon, 2005; Southgate et al., 2007), but it is still debated whether implicit and explicit tasks tap the same processes (San Juan and Astington, 2012). Additionally, different studies showed that other (explicit) ToM tasks can be performed before the age of 4, for instance tasks evaluating the understanding of goals, desires, intentions, perceptions, feelings, knowledge, and ignorance (Wellman and Liu, 2004; Hutto et al., 2011). Moreover, ToM keeps on developing after childhood (Apperly, 2012; Brizio et al., 2015). Adolescents and adults perform correctly not only at FB tasks but also at other tasks evaluating the understanding of more complex social situations (Lawson et al., 2004; Beaumont and Sofronoff, 2008; Bosco et al., 2014). Thus, ToM is not a monolithic ability and the FB paradigm, although useful, does not cover all ToM processes. The second limitation underlined by Bloom and German (2000) is that FB tasks do not only measure ToM because they also involve other cognitive abilities such as attention, working memory and language. In order to succeed, children must pay attention to and subsequently recall the sequence of events, as well as understand the narrations and questions.
It is worth noting that this second limitation is not applicable only to FB tasks but also to most of the tools designed to assess ToM as they often include long narrations (e.g., Social stories and Faux Pas test) or enhanced executive demands (e.g., Second order FB tasks). Nevertheless, studies have shown that verbal and non-verbal FB tasks are strongly correlated (Call and Tomasello, 1999). In the current study the choice to study BA (using a non-verbal paradigm) is mainly motivated by the difficulty of creating appropriate tasks to assess ToM in adults without language as a confound. Although we are specifically interested in BA in this study, we are well aware that ToM is not limited to this ability, and that conclusions drawn on BA will have to be evaluated regarding other ToM abilities before being generalized to ToM.

Our study builds on the observation that children with Autism Spectrum Disorders (ASD) have a delayed ToM, as they succeed at FB tasks later than their NeuroTypical (NT) peers matched for intellectual abilities (see the meta-analysis of Yirmiya et al., 1998) and as they also succeed later than their NT peers on other ToM tasks. Happé (1995) showed that children with ASD need a higher mean verbal mental age than NT peers to succeed at FB tasks, namely 8 and a half years. Given this supplementary level of language required to perform FB tasks, Happé (1995) suggested that children with ASD use language to compensate for their ToM deficits.

Asperger Syndrome (AS) is a particular form of ASD without intellectual or language delay. The formal diagnosis of AS which existed in the DSM-IV-TR (American Psychiatric Association, 2000) is no longer available in the DSM-5 (American Psychiatric Association, 2013); however, we maintain it in the current work because all of the participants included in this study received a diagnosis of "AS." Children with a diagnosis of AS have better language abilities compared to children with other forms of autism (Manjiviona and Prior, 1999); they also show better social skills (Walker et al., 2004) and better performance at ToM tasks (Bowler, 1992). One goal of the current study is to test the hypothesis of verbally mediated strategies to compensate for ToM deficits in AS (Goal 3).

Different components of language could be useful for ToM in children, such as semantics, vocabulary, and syntax. The meta-analysis performed by Milligan et al. (2007) on 104 studies of NT children suggested that even if all of these language abilities are linked to FB understanding, some of them would be more useful than others. Indeed, the authors found that syntax accounted for 29% of the variance in FB scores whereas semantics and vocabulary accounted for 23 and 12% of the variance, respectively. Thus, syntax is particularly important for ToM in NT children and could be an important element to promote ToM in ASD (Durrleman et al., 2016). Which syntactic structures are most important to support ToM in NT is still a question under investigation.

de Villiers and de Villiers (2000) proposed that a specific type of syntax used in Complement Sentences (CS) is particularly useful for FB. Indeed, sentential complements are often inserted in sentences with mental state verbs and serve to reflect the perspective of the subject of the matrix. For example, in "Sally thinks <u>that the marble is in the basket</u>," the underlined complement clause represents the subjective belief of Sally. Related to this is the fact that the embedded proposition can be true or false, independently of the entire sentence (just as beliefs may accurately reflect reality, or not). For example, in our previous example the complete sentence can be true (i.e., Sally really thinks that the marble is in the basket) and at the same time the embedded proposition can be false (i.e., the marble is not in the basket). In order to determine if the sentence is true or false, one should consider the mental world and not the

real world. As proposed by de Villiers and de Villiers (2000), CS would therefore be an excellent tool for enabling children to represent FB.

Smith et al. (2003) proposed that another type of structure, namely Relative Clause Sentences (RS), could also be important for FB reasoning. According to these authors, "Metarepresentation arises when a representation of an event is embedded inside a representation of an event," thus embedding is what allows metarepresentation, and any structure involving the embedding of events (thus metarepresentation) should be related to ToM. Consider the following example, in which the RS is "that Anne placed inside the basket": "Sally looked for the marble that Anne placed inside the basket." Just like a CS, an RS includes an embedded proposition. However, in the case of CS, recall that the embedded proposition can be false with the entire sentence being true, while in an RS if the embedded proposition is false (i.e., the marble is not in the basket) then the entire sentence is also false (i.e., Sally cannot look for the marble that Anne placed in the basket if Anne didn't place it in the basket).

Misrepresentation and Metarepresentation accounts are in opposition regarding the syntactic structures that are important for ToM. According to the Misrepresentation account, what is important for ToM is the embedding of a false proposition inside a true one. Thus, stronger links should be found between ToM and CS than between ToM and RS. According to the Metarepresentation account, it is only the embedding that is important for ToM. Thus, the same links should be found between ToM and CS as between ToM and RS. Smith et al. (2003) showed in NT children that the understanding of certain RS was significantly linked to FB success. However, given that the study included no CS understanding task, the results cannot rule out that CS would have been more closely related to ToM than RS. One goal of the current study is to compare the role of CS and RS in NT and in AS so as to evaluate Misrepresentation and Metarepresentation accounts (Goal 2). The comparison between the two populations will allow us to evaluate the existence of verbally mediated strategies for ToM in AS (Goal 3).

While the Misrepresentation and Metarepresentation accounts refer to the type of syntactic structure that is important for ToM, other hypotheses can be made regarding how these structures relate to ToM. Thus, apart from evaluating which element of language is the most important to support ToM, different hypotheses are made about the relationship between language and ToM. In particular, three accounts can be proposed. The first one is the Reasoning account, in which language would allow ToM reasoning over the entire lifespan. The second one is the Emergence account in which language is useful only for ToM development but not in adulthood (see Moses, 2001 for a similar account about the links between ToM and executive function).

According to a third account, the Expression account, the correlations found between language and ToM could arise from the verbal nature of the ToM tasks classically used (Miller, 2001; see also Moses, 2001). Indeed, language is first and foremost necessary to understand the narrations and instructions of these ToM tasks and as such directly affects ToM performance. We note that the Emergence and Reasoning accounts, as well as the Expression account, can co-exist and are thus not mutually exclusive. Indeed, language could be a prerequisite for ToM (as suggested by Reasoning or Emergence accounts) and at the same time language could limit ToM performance due to the verbal instructions involved in ToM tasks (Expression account). However, when correlations are found between language and verbal ToM tasks we cannot eliminate the possibility that this link is only due to verbal instructions (i.e., Expression account). Consequently, to correctly evaluate the relationship between language and ToM (Emergence vs. Reasoning accounts), non-verbal ToM tasks are mandatory. The main goal of the current study is to evaluate Reasoning and Emergence accounts by means of non-verbal tasks (Goal 1).

One way to disentangle the Reasoning and Emergence accounts is to study links between ToM and language in adults, because it is in adults that predictions differ. Indeed, in adults, according to the Reasoning account, language and ToM continue to be closely related. According to the Emergence account, however, no relation should be observed between ToM and language in adulthood. Studying the links between ToM and language in adults entails many challenges. First of all, classical FB tasks used in children cannot be used because they are too simple and adults would be at ceiling. Other tasks are available to assess ToM in adults (e.g., Social stories and Faux Pas test) but they generally include long narrations which are problematic regarding the Expression account. Similar limitations (facility of task) occur for the evaluation of Misrepresentation and Metarepresentation accounts in adults. In order to overcome these limits, two solutions have been proposed, one consisting of the study of brain-injured adults, and another one proposing the use of dual task paradigms, which is the method we adopt in the current study.

The majority of studies evaluating the relation between ToM and language at an adult age are performed in cognitively impaired patients after brain lesions, typically stroke patients. Indeed, the evaluation of patients with post-stroke aphasia and language deficits might be an important source of information on ToM functioning. Investigations have shown that, despite important syntactic deficits, post-stroke patients are able to perform ToM tasks (Varley and Siegal, 2000; Varley et al., 2001; Apperly et al., 2006). But as underlined by Caplan (1992), it is not clear what exactly is affected in such cases: linguistic performance or linguistic competence. If the patients tested in these experiments are affected in their linguistic performance through disrupted access, their linguistic competence might nevertheless be intact, allowing them to perform normally in ToM tasks.

Newton and de Villiers (2007) proposed a dual task paradigm in order to study the relation between language and ToM reasoning in healthy adults. The dual task consisted of the comparison of a non-verbal FB and a non-verbal TB task, during a verbal shadowing task and a non-verbal rhythmic task. The authors reported decreased performance for the FB condition but not for the TB condition, specifically during verbal shadowing but not during the non-verbal interference task. The authors concluded in favor of the role of language in BA. Forgeot d'Arc and Ramus (2011) highlighted different limitations to Newton and de Villiers's conclusion, specifically criticizing the opposition between FB and TB. Historically, FB is the preferred indication of ToM over TB because, although a correct response to a TB task can be achieved by means of ToM, it can also be achieved without ToM, and thus a correct answer on a TB task does not always reflect ToM abilities. However, TB can still be achieved by means of ToM, and thus could reflect ToM processes too. Newton and de Villiers (2007) did not explain the reasons why language should be more useful during FB than during TB, and Forgeot d'Arc and Ramus (2011) considered that because both FB and TB are ToM they should be used jointly to assess ToM abilities. Moreover, according to Forgeot d'Arc and Ramus (2011), to prove a specific role of language during ToM, the interference effect of language on ToM should be greater than the interference effect of language on a matched task which does not require ToM. Thus, Forgeot d'Arc and Ramus (2011) proposed a different paradigm to assess the role of inner speech in attributing beliefs in adults. Similarly to Newton and de Villiers (2007), they used verbal shadowing as a dual task to inhibit inner speech. However, rather than contrasting TB and FB as in Newton and de Villiers (2007), they contrasted the ability to attribute (true or false) beliefs with the ability to attribute goals or to infer physical causation (PC). TB conditions were not used in isolation or contrasted to FB conditions, but as a means to control for response biases in FB conditions (see Forgeot d'Arc and Ramus, 2011, p. 977). Results reported for 58 NT adults showed that the role of inner speech in BA was not significantly different from the role of inner speech in goal attribution or in PC inference. Thus, they concluded that BA is not specifically dependent on inner speech.

Overall, studies on NT adults favor an Emergence account over a Reasoning account, suggesting that inner speech is not clearly involved in BA after childhood. No data is currently available in NT adults regarding the specific role of syntax (rather than inner speech) for ToM reasoning. More specifically, the Misrepresentation and Metarepresentation accounts have not yet been explored in a population of NT adults. Furthermore, these Misrepresentation and Metarepresentation accounts, as well as the Emergence and Reasoning accounts, have yet to be examined in adults with ASD.

In the current study we had three main goals. Goal 1 was to assess Emergence and Reasoning accounts in NT adults, by means of a dual task paradigm, to evaluate the relation between syntax and ToM. According to the Emergence account, language is not useful for ToM reasoning in adulthood. Thus, a verbal interference task should not disrupt the ability to attribute beliefs more than it disrupts the ability to perform a control task. In contrast, according to the Reasoning account, language is useful for ToM reasoning over the lifespan. In this case, a verbal interference task should disrupt the ability to attribute beliefs significantly more than it disrupts the ability to perform a control task. Goal 2 was to evaluate the Metarepresentation and Misrepresentation accounts in adults. According to the Metarepresentation account, the ability to embed a proposition into another is sufficient for ToM reasoning. Thus, being engaged in an interference task that involves RS should disrupt ToM as much as being engaged in an interference task that involves CS. In contrast, according to the Misrepresentation account, the most important linguistic structures for ToM are those embedding a false proposition in a true one. Thus, a dual task involving RS should not disrupt ToM as much as one involving CS. Goal 3 was mostly transversal and consisted of evaluating the hypothesis of a verbally mediated strategy to attribute beliefs in adults with AS. If adults with AS use language as a means to compensate for persistent ToM deficits, their ability to attribute beliefs when they are concurrently engaged in a verbal task should be significantly more disrupted than in NT adults. Put differently, we hypothesized that results in NT will be in favor of the Emergence account whereas results in AS will be in favor of the Reasoning account. The methodology of the current paper is a combination of paradigms used by Newton and de Villiers (2007) and Forgeot d'Arc and Ramus (2011).
The difference from Newton and de Villiers (2007) and Forgeot d'Arc and Ramus (2011) is the evaluation of the role of specific syntactic structures rather than inner speech during BA. Moreover, we compared ToM and non-ToM tasks (as in Forgeot d'Arc and Ramus, 2011), as well as verbal and non-verbal interference tasks (as in Newton and de Villiers, 2007) in NT adults and in adults with AS.

# MATERIALS AND METHODS

# Participants

Fifty-three NT adults and 25 adults with AS, all French native speakers, were initially included in the study. Three NT participants and three adults with AS were unable to perform all tasks so their data were excluded. We finally retained 50 NT participants (26 males, 24 females; mean age 21 years, SD 4.9) and 22 participants with AS (12 males, 10 females; mean age 32 years, SD 8.9). Participants provided written informed consent and the study was approved by the local ethical committee (CERNI, N◦ 2015-09-15-74). NT participants were students of the local university, while participants with AS were mainly recruited from the local Expert Center for AS diagnosis in adults. All of them completed the Hospital Anxiety and Depression scale (HAD) (Zigmond and Snaith, 1983) to assess possible anxiety and depression symptoms. This test was applied because individuals with AS are more prone to anxiety and depression (Stewart et al., 2006).

Recall that the main objective was to evaluate the interaction between ToM and syntactic abilities in NT and AS. ToM evaluation was based on the comparison of two experimental conditions, named ±ToM conditions. The +ToM condition was named BA and allowed ToM assessment, whereas the −ToM condition was named PC and hence was the control condition. ±ToM conditions were performed under four interference conditions, three verbal tasks [involving a series of syllables (SS), RS or complement clause sentences] and Silence.

# ToM Evaluation: Stimuli and Tasks

Stimuli used during BA and PC were cartoons similar to those presented in Forgeot d'Arc and Ramus (2011). Seventy-five cartoons representing 15 scenarios were presented to participants during the ±ToM conditions. Each cartoon was composed of four successive phases: beginning, change, suspense and pair of possible ends (one correct and another incorrect, see description below and **Figure 1**). Participants were instructed to choose the correct ending from the two presented (i.e., a forced-choice task). In the change phase, cartoons were presented in three situations (No Change, Change Seen, and Change Unseen). According to the pair of possible ends, cartoons were presented in two situations (Mentalistic end and Mechanistic end), leading to a total of five situations: Mentalistic No Change, Mentalistic Seen Change, Mentalistic Unseen Change, Mechanistic No Change, and Mechanistic Unseen Change (see **Figure 1** for details and Forgeot d'Arc and Ramus, 2011, pp. 978–980).

The beginning phase was identical across all five situations and presented the general context and the main agent of the scenario, e.g., "man standing between two plants, holds a watering can; there is a faucet in the background; the man waters the plant on the left and then leaves the scene to fill his watering can".

As shown in **Figure 1**, the change phase is presented in three versions. In the No Change situation, nothing happens after the agent fills his watering can. In the Change Seen situation (i.e., TB) a change occurs and is perceived by the agent. Specifically, this change consists of a woman appearing in the scene, who switches the two plants while the man is watching her. In the Unseen Change situation (i.e., FB) the change is identical to that in the previous situation, except that it is not seen by the agent (see **Figure 1**). The suspense phase is identical across all situations and consists of the agent's action (e.g., after having filled the watering can, he is standing between the two plants). The end phase can be Mentalistic or Mechanistic, each one proposing two choices. For the Mentalistic type, the choice concerns the agent's action (e.g., watering the plant) either on the left or on the right. For the Mechanistic type, the choice concerns the mechanical action (e.g., water leaking from the pot) either on the left or on the right (see **Figure 1**).

Cartoons represented 15 scenarios, each of them being presented in five experimental situations (i.e., Mentalistic No Change, Mentalistic Seen Change, Mentalistic Unseen Change, Mechanistic No Change, and Mechanistic Unseen Change) leading to a total of 75 cartoons. Cartoons were presented in five Cartoon blocks. Each Cartoon block contained 15 cartoons, including one occurrence of each scenario; each experimental situation was presented three times in each Cartoon block (in three different scenarios). The number of correct answers per participant and per condition was recorded. The number of correct answers in pairs of situations was then used, as in Forgeot d'Arc and Ramus (2011), to compute two sensitivity indices during data processing (signal detection analysis). The assessment of BA was based on answers in the Mentalistic Seen Change and Mentalistic Unseen Change conditions whereas the assessment of PC was based on answers in the Mechanistic No Change and Mechanistic Unseen Change conditions (see Data Scoring section).
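The block constraints described above (15 scenarios, five situations, five blocks of 15 cartoons, each scenario once per block and each situation three times per block) can be satisfied by a simple rotation. The following is a minimal sketch only: the rotation rule and the names `build_cartoon_blocks`, `SITUATIONS` and `N_SCENARIOS` are assumptions made for illustration, and the source does not state how cartoons were actually assigned to blocks.

```python
from collections import Counter

# The five experimental situations named in the text.
SITUATIONS = [
    "Mentalistic No Change",
    "Mentalistic Seen Change",
    "Mentalistic Unseen Change",
    "Mechanistic No Change",
    "Mechanistic Unseen Change",
]
N_SCENARIOS = 15  # number of scenarios reported in the text


def build_cartoon_blocks(n_scenarios=N_SCENARIOS, situations=SITUATIONS):
    """Assign every (scenario, situation) cartoon to exactly one block.

    Hypothetical rotation rule: in block b, scenario s is shown in
    situation (s + b) mod 5. Each block then holds each scenario once,
    and each situation three times (in three different scenarios).
    """
    n_blocks = len(situations)
    blocks = []
    for b in range(n_blocks):
        block = [(s, situations[(s + b) % n_blocks]) for s in range(n_scenarios)]
        blocks.append(block)
    return blocks


blocks = build_cartoon_blocks()
for index, block in enumerate(blocks, start=1):
    counts = Counter(situation for _, situation in block)
    print(f"Block {index}: {len(block)} cartoons, situation counts {dict(counts)}")
```

In an actual experiment the order of cartoons within each block would additionally be randomized, as described in the Experimental Procedure.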

# Interference Evaluation: Stimuli and Tasks

We created verbal material to assess interference processes between syntactic processes and ToM tasks described previously. Interferences were manipulated in four experimental conditions as mentioned previously (three verbal and one silent). For the verbal conditions we used Complement sentences (CS) and Relative Sentences (RS). We evaluated which type of sentence, CS or RS, is the most useful to ToM. A third verbal condition was represented by a Series of syllables (SS) as a control condition without syntax but requiring the phonological buffer. Finally, a Silence condition was proposed as a control.

A total of 252 stimuli (84 CS, 84 RS, and 84 SS) were created and presented during the three verbal conditions (see **Table 1**). The number of syllables (i.e., 11) was controlled across conditions. CS and RS were built as pairs (see **Table 1**), differing only in terms of syntax but remaining similar in terms of vocabulary, frequency of occurrence (LEXIQUE database, New et al., 2001) and plausibility [based on a preliminary experiment on a different group of participants, t(54) = −0.25, p = 0.80]. SS stimuli consisted of the same syllable (e.g., BA) repeated 11 times. Stimuli of CS, RS, and SS were incorporated in the Voxygen vocal synthesizer allowing the generation of three audio files, one for each verbal condition, having the same duration (7 min).

The 252 stimuli were split into three Interference blocks: Complements block, Relatives block, and Syllables block. Interference blocks were presented concomitantly with the cartoons described above and presented for ToM evaluation. Using the same 252 stimuli, we also built three No-Interference blocks, each including 28 CS, 28 RS, and 28 SS. No-Interference blocks were used in isolation without cartoons, that is to say without any concomitant evaluation of ToM.

# Experimental Procedure

The procedure is described in **Table 2** and consists of three phases: Training, No-Interference, and Interference.

### Training Phase

Each participant started with a short training session (5 min) consisting of reading aloud written CS and RS from one of the three No-Interference blocks (i.e., a total of 56 sentences to read per participant) to become familiar with the type of sentence and vocabulary.

### No-Interference Phase

The No-interference experiment (7 min) started right after this short training. Participants were required to listen to recorded CS, RS, and SS within a No-Interference block (different from the one they read during the Training phase) and repeat what they heard after the end of each sentence (i.e., a total of 84 sentences or SS per block). Participants were recorded while repeating sentences or syllables. Their performance was evaluated in terms of error rates (%ER) in repetition.

### Interference Phase

Each participant performed ToM tasks in four Cartoon blocks according to CS, RS, SS, and Silence conditions. The association between Cartoon block and Interference conditions, as well as the order of Interference conditions, was counterbalanced across participants. The order of cartoon presentation inside each block was randomized. During the Silence condition, participants were asked to choose as quickly as possible the end which best completed a presented cartoon. Responses were provided by means of two manual keys on a keyboard ("A" for the end presented on the left side of the screen and "P" for the end presented on the right side of the screen, on an AZERTY keyboard). Similarly, during the three linguistic conditions (i.e., Complements Repetition, Relatives Repetition, and Syllables Repetition), participants were asked to choose as quickly as possible the end which best completed a presented cartoon (pressing the same manual keys as presented above) with concomitant repetition of a heard sentence or SS (i.e., dual task paradigm). Participants were explicitly told that their answer would not be taken into account if they did not repeat what they heard. During these three verbal conditions, the audio file started with the first cartoon and continued until all 15 cartoons of the Cartoon block were completed. The Interference phase lasted approximately 30 min.

# Data Scoring and Analyses

### Data Scoring

Correct answers and mean reaction times (RTs) were recorded for the five different cartoon situations (i.e., Mentalistic No Change, Mentalistic Seen Change, Mentalistic Unseen Change, Mechanistic No Change, and Mechanistic Unseen Change) with a maximum of three correct answers per situation. To simplify data analysis we computed two sensitivity indices (i.e., BA and PC) among the three initially proposed by Forgeot d'Arc and Ramus (2011). Thus, even if participants answered the Mentalistic No Change situations, these answers were not taken into account for analyses. To compute the BA sensitivity index we considered as hits correct answers in the Mentalistic Unseen Change situation, and as false alarms incorrect answers in the Mentalistic Seen Change situation. The PC index was computed from correct answers in the Mechanistic Unseen Change situation (i.e., Hits) and incorrect answers in the Mechanistic No Change situation (i.e., False alarms). There was a total of eight indices per participant: two per Interference condition (i.e., one for BA and one for PC) with a total of four Interference conditions (i.e., CS condition, RS condition, SS condition, and Silence condition).
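As an illustration of how such a sensitivity index can be obtained from hits and false alarms, the sketch below uses the standard signal detection definition d' = z(hit rate) − z(false-alarm rate). With only three trials per situation, rates of exactly 0 or 1 would give infinite z-scores, so the sketch applies a log-linear correction; this correction, as well as the function name `d_prime` and its parameters, are assumptions for illustration and are not taken from the source.

```python
from statistics import NormalDist


def d_prime(hits, false_alarms, n_signal=3, n_noise=3):
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate).

    hits:         correct answers in the "signal" situation
                  (e.g., Mentalistic Unseen Change for BA)
    false_alarms: incorrect answers in the "noise" situation
                  (e.g., Mentalistic Seen Change for BA)
    n_signal, n_noise: trials per situation (three per block in the text)

    Assumption: a log-linear correction (add 0.5 to each count and 1 to
    each number of trials) keeps rates away from 0 and 1.
    """
    if not (0 <= hits <= n_signal and 0 <= false_alarms <= n_noise):
        raise ValueError("counts must lie between 0 and the number of trials")
    hit_rate = (hits + 0.5) / (n_signal + 1)
    fa_rate = (false_alarms + 0.5) / (n_noise + 1)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)


# One index per participant, ±ToM condition and Interference condition.
ba_silence = d_prime(hits=3, false_alarms=0)
pc_silence = d_prime(hits=2, false_alarms=1)
print(f"BA d' = {ba_silence:.2f}, PC d' = {pc_silence:.2f}")
```

A value of zero corresponds to chance-level discrimination, which is why the indices are later tested against zero.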

TABLE 1 | Examples of complement sentences, relative sentences, and series of syllables, presented as triplets.

TABLE 2 | Conduct of the experimental procedure.

During the Training phase participants read sentences, during the No-Interference phase they repeated complement and relative sentences as well as a series of syllables and during the Interference phase participants completed Theory of Mind and language tasks concurrently.

Responses from No-Interference and Interference phases were recorded and task execution was evaluated in terms of the percentage of errors in repetition. Errors consisted of syllables repeated incorrectly or not repeated at all. A syllable was considered as incorrectly repeated if more than one phoneme was omitted or pronounced incorrectly. We computed the percentage of mispronounced syllables over the total number of syllables to repeat. Error rates for repetition were computed for CS and for RS.
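The scoring rule for repetition errors can be sketched as follows. The input encoding (one phoneme-error count per target syllable, with `None` for a syllable that was not repeated at all) and the function names are hypothetical choices made for this illustration; the source does not describe how the recordings were coded.

```python
def is_syllable_error(phoneme_errors):
    """Apply the rule stated in the text to one target syllable.

    phoneme_errors: number of phonemes omitted or mispronounced in the
    syllable, or None if the syllable was not repeated at all.
    A syllable counts as an error if it was not repeated, or if more
    than one phoneme was omitted or pronounced incorrectly.
    """
    return phoneme_errors is None or phoneme_errors > 1


def repetition_error_rate(phoneme_errors_per_syllable):
    """Percentage of erroneous syllables over all syllables to repeat."""
    syllables = list(phoneme_errors_per_syllable)
    if not syllables:
        raise ValueError("at least one target syllable is required")
    n_errors = sum(is_syllable_error(e) for e in syllables)
    return 100.0 * n_errors / len(syllables)


# An 11-syllable sentence with one badly mispronounced syllable and
# one syllable that was skipped (hypothetical data).
sentence = [0, 0, 1, 0, 2, 0, 0, None, 0, 0, 0]
print(f"%ER = {repetition_error_rate(sentence):.1f}")
```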

We considered three Independent Variables, the Group (i.e., NT and AS), the ±ToM condition (BA and PC), and the Interference (i.e., Complements Repetition, Relatives Repetition, Syllables Repetition, and Silence) and three Dependent Variables [i.e., Signal detection indices (d'), RTs, and Error Rates during repetition (%ER)].

### Data Analyses

By means of ANOVAs, we began by evaluating possible differences between the NT and AS groups in terms of age and in terms of anxiety and depression symptoms as assessed by the HAD scale. Significant differences were obtained; thus, we computed mean values of d', RTs, and %ER for each participant and tested Spearman correlations with age, anxiety and depression scores in order to evaluate whether being more prone to anxiety and depression would have an influence on participants' performances.

We evaluated whether each sensitivity index (eight per participant according to two ±ToM conditions and four Interference conditions) was significantly above zero by means of t-tests, in order to know if participants were above chance level. This was performed for NT and AS groups separately.
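A test of this kind can be sketched as a one-sample t-test of the d' values of one group, in one ±ToM condition and one Interference condition, against zero. The sketch below only computes the t statistic and its degrees of freedom with the Python standard library; the p-value would then be obtained from the t distribution with a statistics package. The function name and the example values are hypothetical and do not reproduce the data of the study.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(values, mu0=0.0):
    """One-sample t statistic for H0: population mean equals mu0.

    Returns (t, degrees_of_freedom), with
    t = (mean - mu0) / (sample SD / sqrt(n)) and df = n - 1.
    """
    n = len(values)
    if n < 2:
        raise ValueError("at least two observations are required")
    sd = stdev(values)  # sample standard deviation (n - 1 denominator)
    if sd == 0:
        raise ValueError("the t statistic is undefined when all values are equal")
    t = (mean(values) - mu0) / (sd / sqrt(n))
    return t, n - 1


# Hypothetical d' values for a handful of participants in one condition.
d_primes = [1.4, 0.9, 2.3, 1.1, 0.0, 1.8]
t_value, df = one_sample_t(d_primes)
print(f"t({df}) = {t_value:.2f}")
```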

Then we performed an ANOVA on d' and RTs in order to control experimental interference induction. We computed a three-way 2 × 2 × 4 ANOVA including Group (NT and AS) as between subject factor, two within-subject factors the ±ToM (BA and PC) and the Interference (Complements Repetition, Relatives Repetition, Syllables Repetition, and Silence) on signal detection indices as well as on RTs.

First, the effect of ±ToM was computed on d' and RTs to compare our results with the previous results of Forgeot d'Arc and Ramus (2011) who found that performances on PC were better than performances on BA, and that participants took longer to answer on BA compared to PC. We also computed the Group × ±ToM interaction in order to determine if AS participants performed differently from NT participants on the BA task but not on the PC task. We then computed the Group × Interference × ±ToM interaction effect on d' and RTs so as to evaluate our hypothesis of verbally mediated strategies compensating for ToM deficits in AS. In addition, to evaluate hypotheses that specifically concerned the NT group, we computed the Interference × ±ToM interaction effect on d' and RTs within the NT group by means of planned comparisons. To assess the Emergence and Reasoning accounts, planned comparisons evaluated the difference between Syllables Repetition vs. Relatives Repetition and Syllables Repetition vs. Complements Repetition. To examine the Metarepresentation and Misrepresentation accounts, planned comparisons evaluated the difference between Relatives Repetition vs. Complements Repetition.

To finish, a three-way 2 × 2 × 2 ANOVA including Group (NT and AS) as between subject factor and two within-subject factors, the Interference phase (Interference and No-Interference) and the Syntax (CS and RS), was conducted on %ER.

We computed the Group effect in order to evaluate if AS participants were more prone to errors in repetition compared to NT participants. Furthermore, we computed the Syntax effect in order to evaluate if CS were more difficult to repeat than RS in line with the Metarepresentation account. We computed the Interference phase effect in order to evaluate if participants committed more errors in repetition during the Interference phase compared to the No-Interference phase. Finally, we evaluated the hypothesis of verbally mediated strategies in AS by the Group × Syntax and Group × Syntax × Interference phase interaction.

# RESULTS

# Effect of Control Variables and Control of Experimental Interference Induction

Participants with AS were significantly older than NT [F(1,70) = 51.21, p < 0.05] and they reported significantly more anxiety and depression symptoms (Mean = 15, SD = 1.1 point) than NT participants (Mean = 9.5, SD = 0.7 point) [F(1,70) = 16.8, p < 0.05]. Spearman correlations indicated that there was no significant correlation between Age and the mean values of d' [r(1,72) = 0.18, p = 0.12], RTs [r(1,72) = −0.17, p = 0.15], or %ER [r(1,61) = 0.10, p = 0.46] during sentence repetition. Similarly, there was no significant correlation between HAD scores and the mean values of d' [r(1,72) = 0.12, p = 0.33] but there was a significant correlation between HAD scores and mean RTs [r(1,72) = 0.39, p < 0.05] with a tendency for more depressed and anxious participants to answer more slowly than less depressed and anxious participants. There was no significant correlation between HAD scores and the error rate during sentence repetition [r(1,61) = 0.07, p = 0.62].<sup>1</sup>

Sensitivity indices were computed for BA (i.e., Theory of Mind – ToM task) and PC (i.e., control task) in four Interference conditions (i.e., Complements Repetition, Relatives Repetition, Syllables Repetition, and Silence). T-tests results showed that the eight d' were significantly greater than zero in the NT group (all t-values > 6.31, p < 0.05) as in the AS group (all t-values > 5.82, p < 0.05).

In the NT group the effect of Interference on d' was significant [F(3,147) = 2.07, p < 0.05]. The planned comparison between Complements Repetition (Mean = 1.25, SD = 0.09) vs. Silence (Mean = 1.41, SD = 0.07) was significant [F(1,49) = 5.28, p < 0.05] just as between Syllables Repetition (Mean = 1.20, SD = 0.07) vs. Silence [F(1,49) = 14.27, p < 0.05] whereas the planned comparison between Relatives Repetition (Mean = 1.28, SD = 0.07) vs. Silence was not significant [F(1,49) = 2.72, p = 0.11] (see **Figure 2**). Moreover in the NT group the effect of Interference on RTs was non-significant [F(3,147) = 0.71, p = 0.55] as for planned comparisons between the silent and verbal conditions [All F(1,49) < 1].

In the AS group, the effect of Interference was not significant on either d' or RTs [All F(3,63) < 1], nor were the planned comparisons between the silent and verbal conditions [All F(1,49) < 1].

# Results Provided by Signal Detection and Reaction Times Analyses

Despite a tendency for participants to obtain higher sensitivity indices on the PC condition (Mean = 1.35, SD = 0.06) compared to the BA condition (Mean = 1.22, SD = 0.06) the effect of ±ToM was not statistically significant [F(1,70) = 1.91, p = 0.17]. Nevertheless, participants answered significantly faster [F(1,70) = 114.8, p < 0.05] in the BA condition (Mean = 2.64 s, SD = 0.12) compared to the PC condition (Mean = 3.25 s, SD = 0.15).

The Group × ±ToM interaction was not significant on d' [F(1,70) < 1], suggesting that AS and NT participants performed equally well on the ±ToM tasks. The Group × ±ToM interaction on RTs was significant, indicating that the difference in response speed between ±ToM conditions was greater in AS than in NT (see **Figure 3**).

In order to evaluate our hypothesis of verbally mediated strategies to compensate for ToM deficits in AS we computed the Group × Interference × ±ToM interaction effect on d' then on RTs.

The interaction for Group × Interference × ±ToM was significant on d' [F(3,210) = 2.92, p < 0.05] suggesting that regarding signal detection indices the Interference × ±ToM interaction effect was different for NT and AS. Planned comparisons on d' showed significant differences between Silence vs. Syllables Repetition [F(1,70) = 7.80, p < 0.05] and Syllables Repetition vs. Complements Repetition [F(1,70) = 6.27, p < 0.05] (see **Figure 2**). The difference between Syllables Repetition vs. Relatives Repetition was non-significant [F(1,70) = 3.42, p = 0.07] as was also the case for Silence vs. Relatives Repetition [F(1,70) < 1], Silence vs. Complements Repetition [F(1,70) < 1], and Relatives Repetition vs. Complements Repetition [F(1,70) < 1].

The Group × Interference × ±ToM interaction was non-significant on RTs [F(3,210) = 1.47, p = 0.22]. None of the planned comparisons performed was significant.

Within the NT group, the Interference × ±ToM interaction effect on d' was non-significant [F(1,70) = 2.55, p = 0.06]. Planned comparisons on d' indicated significant differences between Syllables Repetition vs. Relatives Repetition [F(1,70) = 4.52, p < 0.05], Syllables Repetition vs. Complements Repetition [F(1,70) = 6.12, p < 0.05], and a non-significant difference between Relatives Repetition vs. Complements Repetition [F(1,70) < 1] (see **Figure 2**).

Within the NT group, the Interference × ±ToM interaction effect on RTs was non-significant [F(1,70) = 1.05, p = 0.37] as for the planned contrasts between Syllables Repetition vs. Relatives Repetition [F(1,70) < 1], Syllables Repetition vs. Complements Repetition [F(1,70) = 1.38, p = 0.24], and Relatives Repetition vs. Complements Repetition [F(1,70) < 1].

Because the significant Group × Interference × ±ToM interaction effect indicated that the Interference × ±ToM interaction effect was different between groups, we also computed this effect within the AS group by means of contrasts. The Interference × ±ToM interaction effect within the AS group was non-significant on d' [F(1,70) = 1.18, p = 0.35] as on RTs [F(1,70) < 1].

# Results Provided by the %ER Analysis

ANOVAs on %ER during repetition showed a non-significant main effect of Group [F(1,61) = 2.45, p = 0.12], a non-significant main effect of Syntax (i.e., Syntactic structures) [F(1,61) = 6.30, p = 0.09] and a significant effect of Interference phase [F(1,61) = 2.45, p < 0.05], with participants committing more errors in repetition during the Interference phase (mean = 5.55%, SD = 0.69) compared to the No-Interference phase (mean = 0.05%, SD = 0.01). There was a significant Group × Syntax interaction on %ER [F(1,61) = 5.32, p < 0.05], indicating that the difference in error rates between the two Syntax conditions was greater in AS, with more errors during CS repetition compared to RS repetition (see **Figure 4**). The Group × Syntax × Interference phase interaction effect was also significant [F(1,61) = 5.36, p < 0.05] and indicated that the %ER was greater in the Interference than in the No-Interference phase for the AS group compared to the NT group, with more errors during CS compared to RS repetition (see **Figure 5**).

<sup>1</sup>We obtained a non-significant effect of Gender regarding RTs [F(1,70) < 1] and %ER [F(1,61) = 1.10, p = 0.30], whereas the main effect of Gender on d' was significant [F(1,70) = 5.55, p = 0.02], with men obtaining higher d' (Mean = 1.40, SD = 0.07) than women (Mean = 1.12, SD = 0.07). Nevertheless, the Interference × Gender interaction was non-significant [F(3,210) < 1], as were the ±ToM × Gender interaction [F(1,70) = 2.70, p = 0.10] and the Interference × ToM × Gender interaction [F(3,210) = 1.35, p = 0.26].

Interference (Silent, Syllables, Relatives, and Complements) and ToM condition (Belief Attribution and Physical Causation).

In addition, the interaction HAD × Syntax on %ER during repetition was not significant [F(1,61) = 1.51, p = 0.22] and neither was the interaction HAD × Syntax × Interference [F(1,31) = 1.50, p = 0.23] suggesting that the Group × Syntax and Group × Syntax × Interference effects were not explained by the greater level of depression and anxiety in AS participants.

# DISCUSSION

This study evaluated the relation between syntax and BA in NT and AS adults via a dual task paradigm. Participants performed ±ToM tasks involving BA (test condition) and PC (control condition) under four interference conditions, three verbal (Syllables Repetition, Relatives Repetition, Complements Repetition) and one Silent (i.e., control) Condition. Our goals were to assess (1) Emergence and Reasoning accounts in NT adults, (2) Metarepresentation and Misrepresentation accounts in NT adults, and (3) verbally mediated strategies to compensate ToM difficulties in adults with AS. Only major results will be discussed.

# Emergence vs. Reasoning Accounts in NT Adults

The first goal of this study was to evaluate the Emergence and Reasoning accounts that have different predictions regarding the usefulness of language for ToM reasoning during adulthood. We argued that if syntax is specifically useful for ToM reasoning, in NT adults a verbal interference task should disrupt the ability to attribute beliefs more than it disrupts the ability to perform a control task. Moreover, this interference should be more important for syntactic tasks (i.e., Complements Repetition or Relatives Repetition) than for a control interference task (i.e., Syllables Repetition). This second point is mostly discussed in the next section (see section Metarepresentation vs. Misrepresentation Accounts in NT Adults).

In the NT group, there was a significant effect of Interference on sensitivity indices but not on RTs. Moreover, results revealed no significant interaction between Interference and ToM, either in terms of sensitivity indices or in terms of RTs. This suggests that even if being involved in a verbal interference task could disrupt the ability of NT individuals to attribute beliefs, this disturbance is not specific to ToM (i.e., BA), as it was also observed for other tasks without a ToM dimension (i.e., PC). Thus, as mentioned in previous studies on NT adults (Newton and de Villiers, 2007; Forgeot d'Arc and Ramus, 2011), we found no specific need for language during ToM compared to control tasks. This result does not support the Reasoning account and thus suggests, in line with the Emergence account, that language, and more specifically syntax, is useful only for ToM development.

# Metarepresentation vs. Misrepresentation Accounts in NT Adults

Our second goal was to evaluate the Metarepresentation and Misrepresentation accounts. According to the Metarepresentation account, the ability to embed a proposition into another is sufficient for ToM reasoning. Thus, being engaged in an interference task that involves RS (i.e., Relatives Repetition) should disrupt ToM as much as being engaged in an interference task that involves CS (i.e., Complements Repetition), since both of these structures involve embedding. In contrast, according to the Misrepresentation account, the most important linguistic structures for ToM are those embedding a proposition with an independent truth-value. Thus, a dual task involving RS should not disrupt ToM as much as one involving CS. Moreover, both Relatives Repetition and Complements Repetition should disrupt ToM more than Syllables Repetition.

Results of signal detection analyses (**Figure 2**) showed that NT participants performed better in the ±ToM conditions (i.e., BA and PC) when they were silent compared to when they repeated SS or CS, but not when they repeated RS. Thus, the delayed repetition of a SS, or of sentences other than RS, put NT participants in a dual task situation. This result is important, given that in previous studies (Newton and de Villiers, 2007; Forgeot d'Arc and Ramus, 2011) participants performed a continuous shadowing task that could be more difficult than a delayed repetition task. So, repeating CS, but not repeating RS, disrupted the ability of NT participants to choose the right end of cartoons during the experiment. Sentences in these two conditions were as close as possible: they included the same number of syllables and the same vocabulary, and were judged as similarly plausible in a preliminary experiment. The fact that repeating CS put participants in a dual task situation whereas repeating RS did not might indicate that repeating RS is easier for NT than repeating CS, and might thus provide an argument in favor of the Misrepresentation account. However, this result should be interpreted with caution given that (1) the planned comparison for RS and CS on the sensitivity indices was not significant, and (2) there was no argument in favor of a specific role of language during BA compared to PC (see section Emergence vs. Reasoning Accounts in NT Adults).

# Verbally Mediated Strategies to Compensate ToM Difficulties in Adults with AS

We will first present results about differences between groups regarding general performances, and later discuss results regarding the hypothesis of verbally mediated strategies to compensate for ToM deficits in adults with AS.

In terms of signal detection analyses, no difference was obtained between AS and NT (see **Figure 2**). Thus, adults with AS did not show a ToM deficit on BA. This result is not surprising given that previous studies showed that people with ASD were able to perform FB tasks from 8 years of verbal mental age (Happé, 1995). It is important to note that even if participants with AS performed as well as NT participants at this task, it does not mean that they have the same ToM abilities. Indeed, they could still perform less well on harder ToM tasks (e.g., second order FB, faux pas, etc.). Furthermore, participants with AS could have reached the same level of success as NT participants at these tasks using different strategies, that is to say offset strategies.

Adults with AS were also significantly slower than NT participants at answering. As illustrated in **Figure 3**, both NT and AS were significantly slower to perform PC than BA. This result is in line with Forgeot d'Arc and Ramus (2011) and could be explained by the fact that a majority (i.e., 3/5) of the videos involved Mentalistic ends with a requirement of attributing mental states to characters. Thus, participants who were trained to automatically attribute mental states had to reevaluate the situation when they were confronted with mechanistic ends, and this process might require additional information processing. Compared to NT, participants with AS showed increased RT differences between ToM conditions (i.e., BA and PC) (see **Figure 3**), possibly due to increased latency to refocus their attention on Mechanical indices rather than on Mentalistic ones. It could thus be argued that AS participants took more time to answer because ToM tasks were more difficult for them compared to NT adults, even though the response accuracy of these groups was similar. The fact that participants with AS are slower than NT participants to answer (see **Figure 3**) is a commonly observed result (Bowler, 1997; Kaland et al., 2002). Given that HAD scores reflected that participants with AS were significantly more depressed and anxious than NT participants, and given that there was a significant correlation between HAD scores and RTs, this might also explain longer latency in AS than in NT (Emerson et al., 2005).

Our third goal was mostly transversal and consisted in the evaluation of the hypothesis of verbally mediated strategies to attribute beliefs in adults with AS. If adults with AS use syntax as a way to compensate for their ToM deficits, their ability to attribute beliefs when they are concurrently engaged in a verbal task should be significantly more disrupted than in NT adults.

The interaction for Group × Interference × ±ToM was significant regarding sensitivity indices (see **Figure 2**). This suggests that the dual task effect on sensitivity indices was different in NT and AS participants. According to the hypothesis of verbally mediated strategies to compensate for ToM deficits in adults with AS, we hypothesized that the differences between the ±ToM conditions would be more important for Complements Repetition and for Relatives Repetition compared to Silence or compared to Syllables Repetition. The results did not support our hypothesis, as we obtained an unexpected result for the repetition of a SS. Indeed, the SS repetition led to a greater decline in performance for BA compared to PC in the NT group compared to the AS group (see **Figure 2**). Put differently, repeating syllables disrupted ToM (i.e., BA) in NT compared to a control task (i.e., PC) and to other dual tasks, but not in AS. We interpret this result as reflecting a higher memory load in Syllables Repetition than in the other conditions. Indeed, the Syllables Repetition condition was added as a rhythmic task involving the articulatory loop, which stores and manipulates speech-based material. In a sentence made up of 11 syllables, syllables can be grouped together by words (e.g., 8 words in our material), and words can be grouped together according to syntax (i.e., six units in our material). Semantic and syntactic strategies create a reduced number of elements to remember for participants (Baddeley, 2000). However, when presented with a series of 11 syllables, participants could not create chunks to help them to remember the syllables. Participants were not informed that all series contained 11 syllables either. In light of this, repeating syllables arguably imposed a significantly greater memory load, both because participants did not know how many syllables were to be repeated and because they could not apply semantic or syntactic strategies to create chunks of syllables.
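The chunking argument above can be made concrete with a small arithmetic sketch. It assumes, purely for illustration, that memory load is approximated by the smallest number of units into which the material can be grouped; the function name and this simplification are hypothetical, while the counts (11 syllables, 8 words, six syntactic units) are those given in the text.

```python
def units_to_retain(syllables, words=None, syntactic_units=None):
    """Smallest number of units a participant must hold in memory,
    given the groupings available for the material (assumed proxy for load)."""
    groupings = [syllables]
    if words is not None:
        groupings.append(words)  # syllables grouped into words
    if syntactic_units is not None:
        groupings.append(syntactic_units)  # words grouped by syntax
    return min(groupings)


# A sentence: syllables can be chunked into words and syntactic units.
sentence_load = units_to_retain(11, words=8, syntactic_units=6)
# A syllable series: no semantic or syntactic chunking is available.
syllable_series_load = units_to_retain(11)
print(sentence_load, syllable_series_load)  # prints: 6 11
```

Under this simplification, a syllable series requires nearly twice as many units to be retained as a sentence of the same length in syllables.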

Furthermore, another unexpected result is that in the AS group, neither the Interference effect nor the Interference × ±ToM interaction was significant, either in terms of sensitivity indices (see **Figure 2**) or in terms of RTs. Thus, repeating SS, RS, or CS did not disrupt the ability to attribute beliefs or to infer PC in participants with AS.

Participants with AS show deficits in executive functioning (Ozonoff et al., 1991, 2000; Hill, 2004) so compared to NT, their performance should be more disrupted during a dual task. In some studies, people with ASD were shown to be more sensitive to dual tasks (García-Villamisar and Sala, 2002; Lidstone et al., 2010). However, other studies have revealed that individuals with ASD (i.e., not specifically in AS) are not necessarily as affected as their NT peers during dual tasks. Previous studies using a dual task paradigm to evaluate the role of inner speech during executive functioning showed that children (Whitehouse et al., 2006) and adults (Wallace et al., 2009) with ASD were not disrupted as much as NT peers by articulatory suppression, arguing for limitations in the use of inner speech for executive functioning in ASD. Williams et al. (2008) showed intact inner speech use in ASD during a short-term memory task whereas García-Villamisar and Sala (2002) showed that adults with ASD were as disrupted as NT peers during executive tasks. Thus, previous results on the effect of verbal dual tasks in participants with ASD showed mixed results and it is currently difficult to have a clear overview of the field because populations (i.e., ASD with or without language delay, with or without intellectual delay) and hypotheses (i.e., role of inner speech during executive, working memory, or ToM tasks) varied amongst studies.

Recall that one of our goals was to evaluate the role of two syntactic structures during BA. We thus proposed a delayed repetition instead of a verbal shadowing task. Moreover, our participants were adults with AS, that is to say adults with autism without any language or intellectual delay. Lidstone et al. (2009) showed that inner speech impairment in children with autism is associated with greater non-verbal than verbal skills, and AS is characterized by greater verbal than non-verbal skills (Chiang et al., 2014). Our result indicating that participants with AS were less sensitive to dual task effects could be explained by (1) a deficit in inner speech use during ToM or (2) an expertise in inner speech use during ToM. Indeed, if participants with AS usually use language as a strategy to compensate for their ToM deficits, they could be more used to solving ToM tasks while they are concurrently engaged in a verbal task, as compared to NT peers. Interestingly, although repeating RS or CS did not lead to a specific decrease in performance during ±ToM tasks, being involved in such tasks disrupted participants' ability to repeat CS more than to repeat RS, during the Interference phase but not during the No-Interference phase. This result may be interpreted to suggest a specific role of CS during ±ToM tasks, along the lines of that predicted by the Misrepresentation account.

Based on the percentage of errors in repetition (%ER), results showed that the difference between CS and RS was greater in AS than in NT. Indeed, AS participants showed more errors during CS compared to RS repetition (see **Figure 4**), so adults with AS were more taxed than NT adults by CS repetition relative to RS repetition, but crucially this was only during the Interference phase and not during the No-Interference phase. Thus, despite being able, like NT participants, to repeat CS as well as RS during the No-Interference phase, AS participants (but not NT participants) were more disrupted by ±ToM tasks (i.e., BA and PC) when asked to repeat CS compared to RS. This result is in favor of the Misrepresentation account and indirectly suggests verbally mediated strategies for ToM in AS (see **Figure 5**). Further studies are nevertheless needed in order to evaluate if this result is specific to ToM tasks (i.e., BA) compared to control tasks (i.e., PC). Moreover, because possible differences in IQ, executive function or language abilities between AS and NT were not examined in this study, it is possible that they could have played a role in explaining the results. Finally, as highlighted in the "Introduction," the current study evaluated BA, and further studies are needed in order to assess if results can be generalized to other aspects of ToM.

# CONCLUSION

Three goals have been addressed in this study. The first was to evaluate the relation between syntax and ToM in adults using a dual task paradigm. Our aim was to understand if language is useful for ToM only during development or over the lifespan. In the NT group, our results do not support the Reasoning account and are instead consistent with the Emergence account, whereas in the AS group results were indirectly in favor of the Reasoning account. Our interpretation is that NT would need language during childhood only in order to develop their ToM abilities (Varley and Siegal, 2000; Varley et al., 2001; Apperly et al., 2006; Newton and de Villiers, 2007; Forgeot d'Arc and Ramus, 2011), while language in adults with AS could still be useful to ToM. Finally, syntax involving the embedding of a proposition with an independent truth-value (i.e., CS and Misrepresentation account) appears to be more important than other instances of syntactic embedding (i.e., RS and Metarepresentation account). This could suggest that adults with AS use verbally mediated strategies to compensate for their ToM deficits.

# ETHICS STATEMENT

This study was carried out in accordance with the recommendations of the local ethics committee for noninterventional research, with written informed consent from all subjects. All subjects gave written informed consent in accordance with the Declaration of Helsinki. The protocol was approved by the CERNI Grenoble.

# AUTHOR CONTRIBUTIONS

MCB, MP-B, SD, AR, and MB designed research; MCB performed research; MCB and MP-B analyzed data; MCB, MP-B, SD, AR, and MB wrote the paper.

# ACKNOWLEDGMENT

We thank all of the participants for their participation in this study. We thank the Centre Expert Asperger at Grenoble, and specifically Sylvain Leigner, for their help in the recruitment of participants. We also thank Anissa Mohdeb who collected some of the data for the current study. We thank Baudoin Forgeot d'Arc and Franck Ramus for sharing their material. We are particularly grateful to Baudoin Forgeot d'Arc for the time spent on Skype, and for his help with the retrieval of some material.




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Burnel, Perrone-Bertolotti, Durrleman, Reboul and Baciu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# The Contribution of Grammar, Vocabulary and Theory of Mind in Pragmatic Language Competence in Children with Autistic Spectrum Disorders

Clara Andrés-Roqueta<sup>1</sup> \* and Napoleon Katsos <sup>2</sup>

<sup>1</sup> Department of Developmental, Educational Social and Methodological Psychology, Universitat Jaume I de Castelló, Castellón, Spain, <sup>2</sup> Department of Theoretical and Applied Linguistics, University of Cambridge, Cambridge, United Kingdom

Keywords: pragmatics, theory of mind, vocabulary, grammar, autistic spectrum disorders (ASD), structural language, social cognition

# PRAGMATIC COMPETENCE IN CHILDREN WITH ASD AND OTHER DEVELOPMENTAL DISORDERS

# Pragmatic Competence in ASD

Pragmatic skills enable children to produce and comprehend words and sentences in ways that are appropriate to the conversational context. While structural language is known to vary widely in children with Autistic Spectrum Disorders (ASD), pragmatic language has been claimed to be consistently impaired within this population, and has been considered a hallmark of ASD (Volden and Phillips, 2010). Specifically, people with ASD frequently demonstrate unusual or inappropriate conversational behavior and deficits in a wide range of pragmatic skills (Philofsky et al., 2007). These difficulties have been experimentally demonstrated in detecting violations of maxims of conversation (Surian et al., 1996), understanding figurative language (Happé, 1993; Norbury, 2005), using context to disambiguate polysemous words (Jolliffe and Baron-Cohen, 1999; Brock et al., 2008), managing topic maintenance and topic shifts (Volden and Phillips, 2010), and comprehending humor, drawing inferences from narratives and understanding indirect requests (Ozonoff and Miller, 1996).

Pragmatic difficulties are often attributed to intrinsic features of ASD. These include a weaker tendency to integrate information from the context (Weak Central Coherence; Happé and Frith, 2006), a deficit in Theory-of-Mind (ToM) that prevents children with ASD from inferring intentions and mental states of other people (Baron-Cohen et al., 1985), a deficit in executive functions such as poor inhibition or cognitive flexibility (Hill, 2004) or lack of social motivation as a result of an attenuated social instinct (Chevallier et al., 2012).

# Pragmatics in Other Communication Disorders

Children with developmental disorders that involve communication problems in the absence of ToM deficits, such as Specific Language Impairment (SLI), also display pragmatic difficulties when screening instruments and conversational analysis are used (Adams, 2002; Norbury et al., 2004). Experimental studies have shown that children with SLI show deficits in sensitivity to maxims of conversation, figurative language understanding, narrative, and use of context to resolve ambiguities (Surian et al., 1996; Norbury, 2004, 2005; Brock et al., 2008; Katsos et al., 2011; Norbury et al., 2014), and that their pragmatic skills are in keeping with their level of structural language, as they perform as successfully on experimental tasks as younger typically-developing (TD) children matched on language level.

### Edited by:

Stephanie Durrleman, Université de Genève, Switzerland

### Reviewed by:

Anne Colette Reboul, Claude Bernard University Lyon 1, France; Jill Gibson De Villiers, Smith College, United States

\*Correspondence:

Clara Andrés-Roqueta candres@uji.es

### Specialty section:

This article was submitted to Cognitive Science, a section of the journal Frontiers in Psychology

Received: 07 April 2017 Accepted: 30 May 2017 Published: 15 June 2017

### Citation:

Andrés-Roqueta C and Katsos N (2017) The Contribution of Grammar, Vocabulary and Theory of Mind in Pragmatic Language Competence in Children with Autistic Spectrum Disorders. Front. Psychol. 8:996. doi: 10.3389/fpsyg.2017.00996

Nevertheless, the extent of pragmatic impairments in children with ASD and other children with social communication disorders, as well as the underlying cause of these impairments, is still an open question for researchers and practitioners (Adams, 2002; Norbury, 2014).

# POTENTIAL CAUSES OF PRAGMATIC DIFFICULTIES IN ASD

Recent research has questioned the traditional views of pragmatic competence in ASD: Are children with ASD universally challenged by pragmatics? Are these challenges due to some deficit of ToM intrinsic to ASD?

# The Role of ToM

The literature reveals reliable associations between understanding the ironic meaning of utterances and ToM skills. For example, success with irony understanding correlates with passing False Belief tasks both in children with ASD and in TD children (Happé, 1993; Filippova and Astington, 2008), and there is evidence that irony comprehension and ToM processing activate the same neural regions in neurotypical adults (Spotorno et al., 2012).

However, Norbury (2004, 2005, 2014) and Happé (1993) reached different conclusions as regards the role of ToM in pragmatic competence of children with ASD. An alternative proposal is that the pragmatic language deficits observed in children with ASD are due to difficulties with grammar and vocabulary, known collectively as structural language (Norbury, 2005; Gernsbacher and Pripas-Kapit, 2012).

# The Role of Structural Language: Grammar and Vocabulary

In particular, Norbury (2004, 2005) put this hypothesis to the test by studying four groups of children with the presence or absence of ASD and of Language Impairment (LI) in metaphor and idiom tasks (ASD-LI; ASD+LI; LI; and age-matched TD). Crucially, structural language competence was measured both expressively and receptively, and both in terms of vocabulary and grammar. It was found that all groups with language impairment (ASD+LI and LI) were indeed impaired in comprehension of metaphors and idioms, but the group with ASD-LI (without LI) performed as well as the TD participants. Moreover, regression analyses revealed that structural language and world-knowledge were the critical predictors for idioms and metaphors, whereas ToM was not. Furthermore, a recent meta-analytic review of experimental studies in figurative language concluded that the differences between children with ASD and TD groups were not statistically significant when the groups were matched on language ability (Kalandadze et al., 2016).

Likewise, it has been demonstrated that children with ASD perform as well as TD peers on the ability to detect pragmatic violations, such as utterances that are literally true but pragmatically under-informative (e.g., "some of the apples are inside the boxes" when shown a picture where all of the apples are inside the boxes), and that higher verbal IQ scores predicted higher sensitivity to under-informativeness within the ASD group (Chevallier et al., 2010). Similar conclusions are reached in studies with an adult ASD population (Pijnacker et al., 2009). However, in the last two studies, the lack of ToM measures prevents establishing a unique contribution of structural language.

# TWO DIFFERENT PRAGMATIC SKILLS: LINGUISTIC- VS. SOCIAL-PRAGMATICS

Here we propose that the relationship between pragmatics, structural language and ToM is not "fixed" but rather modulated by specific properties of the interaction (lexical, syntactic and social-interactional aspects)<sup>1</sup>. In addition to differences in measurement of independent factors (e.g., measuring ToM and structural language in different ways), an element that may explain the variation in research findings within the ASD population is that different ways of testing pragmatics may differentially engage structural language and ToM.

A distinction between types of pragmatic inferences has gained much support in the theoretical pragmatics literature, classified by the extent to which they require ToM skills: Sperber (1994) mentions "Egocentric Relevance" (which does not involve ToM skills), "Allocentric Relevance" (which requires 1st order ToM) and "Gricean" interpretative strategies (which require 2nd order ToM); Levinson (2000) distinguishes between "generalized" and "particularized" pragmatic inferences; Recanati (2004) introduces a distinction between "primary" and "secondary pragmatic processes"; and more recently, O'Neill (2012) uses the terms "social pragmatics," "mindful pragmatics" and "cognitive pragmatics" while Kissine (2012) discusses intersubjective and non-intersubjective aspects of language use. This view has also been supported by empirical research in children with and without communication disorders, suggesting that, for some kinds of pragmatics, a sentence may be fully interpretable based on pragmatic norms and the context as provided from the listeners' egocentric point of view, without the need to infer the speakers' mental state (de Villiers et al., 2007; Kissine, 2012; O'Neill, 2012; Kissine et al., 2015; Janke and Perovic, 2016).

At this point, we suggest two new terms: linguistic-pragmatics and social-pragmatics. We think that they are more intuitively transparent as regards the role of structural language and ToM in each type of pragmatic skill. The term linguistic-pragmatics would be for those cases of pragmatics where structural language and competence with pragmatic norms are enough to perform successfully in the task, while we use the term social-pragmatics for those circumstances where in addition to structural language and pragmatics, the child needs competence with ToM, and specifically the ability to represent other people's intentions, desires and beliefs.

<sup>1</sup> An additional interaction between these concepts arises if participants use their mastery of the syntax of complementation to pass False Belief tasks (de Villiers et al., 2003). This is a possibility that we do not explore here but should be taken into account in future work.

# Linguistic-Pragmatics

Sensitivity to informativeness, as tested by Chevallier et al. (2010), Katsos et al. (2011), and Pijnacker et al. (2009), is a case in point of "linguistic-pragmatics." For example, in order to reject a pragmatically infelicitous sentence such as a speaker saying that "some of the apples are inside the boxes" (given a picture in which all of the apples are inside the boxes), a child needs to draw on vocabulary knowledge (a child who has mastered the semantic meaning of "some" and "all" will know that "all" is a more informative expression), together with sensitivity to the pragmatic maxim that instructs speakers to avoid being underinformative. However, demands on ToM are minimal, because the knowledge that is necessary to evaluate if the utterance is informative or not is visually accessible and shared between the child and the speaker.

As a result, empirical evidence shows that structural language is the key predictor for success with informativeness (Pijnacker et al., 2009; Chevallier et al., 2010 in participants with ASD; Katsos et al., 2011, in participants with SLI).
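The under-informativeness judgment described in this section can be sketched as a two-step check: first literal truth, then whether a more informative term would also have been true. The sketch below is only an illustration of that reasoning, not a model used in the cited studies; the two-term scale, the function names, and the returned labels are assumptions made for the example.

```python
# Assumed informativeness scale, ordered from weaker to stronger term.
SCALE = ["some", "all"]


def literally_true(quantifier, n_in_boxes, n_total):
    """Semantic (literal) truth of '<quantifier> of the apples are inside the boxes'."""
    if quantifier == "some":
        return n_in_boxes >= 1
    if quantifier == "all":
        return n_in_boxes == n_total
    raise ValueError(f"unknown quantifier: {quantifier}")


def judge(quantifier, n_in_boxes, n_total):
    """Classify an utterance against a fully visible scene (no ToM needed)."""
    if not literally_true(quantifier, n_in_boxes, n_total):
        return "false"
    # Pragmatic step: was a stronger term on the scale also true?
    stronger_terms = SCALE[SCALE.index(quantifier) + 1:]
    if any(literally_true(term, n_in_boxes, n_total) for term in stronger_terms):
        return "under-informative"
    return "felicitous"


print(judge("some", 5, 5))  # prints: under-informative
print(judge("some", 3, 5))  # prints: felicitous
print(judge("all", 5, 5))   # prints: felicitous
```

Both steps operate only on the lexical scale and on the scene that speaker and hearer share, which is why such tasks place minimal demands on ToM.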

# Social-Pragmatics

In contrast, there are cases that do require using ToM skills. The irony task used by Happé (1993) is a case in point.

In one of the stories, the main character (David) is baking a cake and places the eggs in the batter without removing the shells, and his dad says: "What a clever boy you are, David!" Here, in order to understand this ironic utterance, a child needs to use his/her competence with structural language to grasp the literal meaning, together with the pragmatic maxim that enjoins interlocutors to be relevant and truthful. Moreover, the child does need to use ToM skills for two reasons: first, in order to avoid attributing to David's dad a false belief (David is clever) that would nevertheless be consistent with the literal meaning of the utterance; and second, in order to take into account the true belief of David's dad (David is not clever), which is inconsistent with the literal meaning of the sentence but becomes consistent once pragmatic inference has taken place.

Consequently, the Strange Stories task (Happé, 1994), which is constructed on similar principles as Happé's (1993) irony task, could be a good measure for social-pragmatic skills. Although this task has been typically used to assess mentalizing through the recognition of the communicative intentions of people using indirect or non-literal utterances, the characters of the stories have unusual or unexpected mental states that are not compatible with the literal meaning of what they say. Correctly inferring those mental states is a prerequisite for making the pragmatic inferences that allow the participant to tell if what the characters say is appropriate or not for the context.

We should clarify here that we do not see the distinction between linguistic-pragmatics and social-pragmatics as having to do with pragmatic phenomena per se, but rather with the communicative situation. Tasks that measure informativeness, for example, need not always fall under the umbrella of linguistic-pragmatics. There may well be cases where sensitivity to informativeness will require ToM and therefore be considered social-pragmatic, e.g., in cases where the speaker but not the hearer has only partial knowledge of the facts.

# THEORETICAL, EXPERIMENTAL, AND CLINICAL IMPLICATIONS

The different roles of ToM and structural language in pragmatics tasks may help to explain some of the variance in findings from previous reports on children with ASD. We expect that the linguistic-pragmatic difficulties of children with ASD (and of children with other developmental disorders like SLI) will be in keeping with their structural language (grammar and vocabulary) in tasks such as sensitivity to under-informativeness, but in keeping with their ToM skills in tasks that require social-pragmatic competence, such as the irony stories from Happé's (1994) Strange Stories task.

Structural language is implicated in success with pragmatics, including metaphor understanding (Norbury, 2004), informativeness (Pijnacker et al., 2009; Chevallier et al., 2010), idioms (Norbury, 2005), and the use of context for disambiguation (Brock et al., 2008), and it is one of the significant (if not the only) predictors of success. This highlights the importance of considering structural language when assessing pragmatic difficulties, in order to establish whether any pragmatic difficulties go beyond the overall linguistic differences that a child presents.

Furthermore, structural language must have a key role in intervention. It is likely that in addition to interventions that directly target pragmatic competence, support for structural language is the one component that will benefit all children who show pragmatic difficulties (Kalandadze et al., 2016). Additionally, intervention with ToM is also likely to support pragmatic language in some specific situations.

Finally, the distinction between linguistic- and social-pragmatics may help clarify some questions pertaining to diagnostic categories. Social (Pragmatic) Communication Disorder has recently been proposed as a distinct diagnostic category (see American Psychiatric Association, 2013). Among others, this disorder includes deficits in using communication for social exchange, adapting communication style to the context, following rules of conversation or narrative convention, and understanding implicit or ambiguous language (Norbury, 2014). If our proposal is correct, these deficits are at least partially distinct, as they include both what we called linguistic-pragmatic and social-pragmatic competences. They are also likely to be present in children with ASD, SLI, and other disorders, depending on the extent of structural language and ToM impairments.

Screening instruments and diagnostic procedures that measure communicative and pragmatic competence may also take into account the distinction between linguistic- and social-pragmatic competences, which at present tend not to be differentiated (e.g., in the Children's Communication Checklist–2, CCC-2, Norbury et al., 2014).

# AUTHOR CONTRIBUTIONS

Both authors (CA and NK) contributed to the selection and the reflective and analytical reading of the different studies, compared the methods used in detail, and discussed their reflections with each other before incorporating them into the manuscript. Moreover, every section was written and discussed equally by CA and NK, so the opinions expressed in the manuscript are those of both authors. As the first author, CA was responsible for minor issues relating to the final wording of the document, but both authors revised the final manuscript together before submitting it.

# ACKNOWLEDGMENTS

The first author wishes to acknowledge support from Grant GV/2015/092 from the Conselleria de Educación, Cultura y Deporte of the Generalitat Valenciana (Spain) and Grant UJI-A2016-12 funded by Universitat Jaume I de Castelló; the second author wishes to acknowledge support from the AHRC (AH/N004671/1) and a British Academy Small Research Grant (SG-47135).


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Andrés-Roqueta and Katsos. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.