# EXECUTIVE FUNCTION AND EDUCATION

EDITED BY : Mariëtte Huizinga, Dieter Baeyens and Jacob A. Burack PUBLISHED IN : Frontiers in Psychology

#### Frontiers Copyright Statement

© Copyright 2007-2018 Frontiers Media SA. All rights reserved. All content included on this site, such as text, graphics, logos, button icons, images, video/audio clips, downloads, data compilations and software, is the property of or is licensed to Frontiers Media SA ("Frontiers") or its licensees and/or subcontractors. The copyright in the text of individual articles is the property of their respective authors, subject to a license granted to Frontiers.

The compilation of articles constituting this e-book, wherever published, as well as the compilation of all other content on this site, is the exclusive property of Frontiers. For the conditions for downloading and copying of e-books from Frontiers' website, please see the Terms for Website Use. If purchasing Frontiers e-books from other websites or sources, the conditions of the website concerned apply.

Images and graphics not forming part of user-contributed materials may not be downloaded or copied without permission.

Individual articles may be downloaded and reproduced in accordance with the principles of the CC-BY licence subject to any copyright or other notices. They may not be re-sold as an e-book.

As author or other contributor you grant a CC-BY licence to others to reproduce your articles, including any graphics and third-party materials supplied by you, in accordance with the Conditions for Website Use and subject to any copyright notices which you include in connection with your articles and materials.

All copyright, and all rights therein, are protected by national and international copyright laws.

The above represents a summary only. For the full conditions see the Conditions for Authors and the Conditions for Website Use. ISSN 1664-8714 ISBN 978-2-88945-572-0 DOI 10.3389/978-2-88945-572-0



Topic Editors:

Mariëtte Huizinga, Vrije Universiteit Amsterdam, Netherlands

Dieter Baeyens, KU Leuven, Leuven, Belgium

Jacob A. Burack, McGill University, Canada


Executive function is an umbrella term for various cognitive processes that are central to goal-directed behavior, thoughts, and emotions. These processes are especially important in novel or demanding situations, which require a rapid and flexible adjustment of behavior to the changing demands of the environment. The development of executive function relies on the maturation of associated brain regions as well as on stimulation in the child's social contexts, especially the home and school. Over the past decade, the term executive function has become a buzzword in the field of education as both researchers and educators underscore the importance of skills like goal setting, planning, and organizing in academic success. Accordingly, in initiating this Research Topic and eBook our goal was to provide a forum for state-of-the-art theoretical and empirical work on this topic that both facilitates communication among researchers from diverse fields and provides a theoretically sound source of information for educators. The contributors to this volume, who hail from several different countries in Europe and North America, have certainly accomplished this goal in their nuanced and cutting-edge depictions of the complex links among various executive function components and educational success.

Citation: Huizinga, M., Baeyens, D., Burack, J. A., eds. (2018). Executive Function and Education. Lausanne: Frontiers Media. doi: 10.3389/978-2-88945-572-0

# Table of Contents

*Editorial: Executive Function and Education*

Mariëtte Huizinga, Dieter Baeyens and Jacob A. Burack

## EXECUTIVE FUNCTION AS PREDICTOR FOR ACADEMIC OUTCOMES

*Early Executive Function at Age Two Predicts Emergent Mathematics and Literacy at Age Five*

Hanna Mulder, Josje Verhagen, Sanne H. G. Van der Ven, Pauline L. Slot and Paul P. M. Leseman

*Executive Function Buffers the Association Between Early Math and Later Academic Skills*

Andrew D. Ribner, Michael T. Willoughby, Clancy B. Blair and The Family Life Project Key Investigators

## TEACHER, PARENT, AND FAMILY FACTORS IN THE RELATIONSHIP BETWEEN EXECUTIVE FUNCTION AND ACADEMIC OUTCOMES

## INTERVENTIONS AND THEIR IMPACT ON EXECUTIVE FUNCTION AND ACADEMIC OUTCOMES

*Relationships Between Motor and Executive Functions and the Effect of an Acute Coordinative Intervention on Executive Functions in Kindergartners*

Marion Stein, Max Auerswald and Mirjam Ebersbach

*Mindfulness Plus Reflection Training: Effects on Executive Function in Early Childhood*

Philip David Zelazo, Jessica L. Forston, Ann S. Masten and Stephanie M. Carlson


## Editorial: Executive Function and Education

#### Mariëtte Huizinga<sup>1</sup>\*, Dieter Baeyens<sup>2</sup> and Jacob A. Burack<sup>3</sup>

<sup>1</sup> Department of Educational and Family Studies, LEARN! Research Institute, Vrije Universiteit Amsterdam, Amsterdam, Netherlands, <sup>2</sup> Faculty of Psychology and Educational Sciences, KU Leuven, Leuven, Belgium, <sup>3</sup> Department of Educational and Counselling Psychology, McGill University, Montreal, QC, Canada

Keywords: executive function, education, academic achievement, intervention, development, parents, family, teachers

**Editorial on the Research Topic**

#### **Executive Function and Education**

Executive function is an umbrella term for various cognitive processes that are central to goal-directed behavior, thoughts, and emotions. These processes are especially important in novel or demanding situations, which require a rapid and flexible adjustment of behavior to the changing demands of the environment. The development of executive function relies on the maturation of associated brain regions as well as on stimulation in the child's social contexts, especially the home and school. Over the past decade, the term executive function has become a buzzword in the field of education as both researchers and educators underscore the importance of skills like goal setting, planning, and organizing in academic success. Accordingly, in initiating this Research Topic/eBook our goal was to provide a forum for state-of-the-art theoretical and empirical work on this topic that both facilitates communication among researchers from diverse fields and provides a theoretically sound source of information for educators. The contributors to this volume, who hail from several different countries in Europe and North America, have certainly accomplished this goal in their nuanced and cutting-edge depictions of the complex links among various executive function components and educational success.

To present the many excellent contributions coherently, we conceptually divided the papers in this Research Topic/eBook into three broad sections: (1) executive function as predictor for academic outcomes, (2) teacher, parent, and family factors in the relationship between executive function and academic outcomes, and (3) interventions and their impact on executive function and academic performance.

## EXECUTIVE FUNCTION AS PREDICTOR FOR ACADEMIC OUTCOMES

The first section of this Research Topic/eBook is focused on executive function as a predictor of academic outcomes. All five papers are empirical reports on the extent to which executive function very early in the child's life predicts educational success later in childhood (Daucourt et al.; Dekker et al.; Mulder et al.; Ribner et al.; Von Suchodoletz et al.). In a longitudinal study among 552 children in the Netherlands, Mulder et al. find that executive function abilities at age 2 years are significant and relatively strong predictors of both emergent mathematics and literacy tasks at age 5 years, after controlling for receptive vocabulary, parental education, and home language. In a longitudinal study of a sample of 1,292 children between the ages of ∼10.5 and 12.5 years from low-income families in the United States, Ribner et al. highlight that, in addition to being a unique predictor of success in both 5th grade math and reading, high levels of early executive function can help to compensate for low levels of academic ability in Pre-K. In a longitudinal study of a hybrid model of reading disability (a composite consisting of four symptoms, including low word reading achievement, unexpectedly low word reading achievement, poorer reading comprehension compared to listening comprehension, and dual-discrepancy response-to-intervention), Daucourt et al. find that three of the components of executive function (inhibition, updating working memory, and shifting) are similarly predictive of subsequent reading disability among a group of 420 children between the ages of almost 5 and 10.5 years in the United States. Similarly, in a study in the Netherlands, Dekker et al. find that teacher and child measures of working memory and shifting are significantly associated with math and spelling outcomes among first and second graders (range 6.25–8.5 years old; N = 84). In considering individual differences among a group of 69 first, 121 third, and 85 eighth grade students (mean ages 7.2, 8.5, and 14 years, respectively) in Germany, Von Suchodoletz et al. find that attention shifting is related to spelling outcomes for all three age groups, but that this relationship is further specified by sex differences among the first and eighth graders.

Edited and reviewed by: Jessica S. Horst, University of Sussex, United Kingdom

> \*Correspondence: Mariëtte Huizinga m.huizinga@vu.nl

#### Specialty section:

This article was submitted to Developmental Psychology, a section of the journal Frontiers in Psychology

Received: 23 April 2018 Accepted: 13 July 2018 Published: 03 August 2018

#### Citation:

Huizinga M, Baeyens D and Burack JA (2018) Editorial: Executive Function and Education. Front. Psychol. 9:1357. doi: 10.3389/fpsyg.2018.01357

## TEACHER, PARENT, AND FAMILY FACTORS IN THE RELATIONSHIP BETWEEN EXECUTIVE FUNCTION AND ACADEMIC OUTCOMES

The four articles in the second section highlight the notion that the study of development of executive function in relation to academic outcomes cannot be confined solely to the study of the child, but must be broadened to include the impact of the essential persons and contexts in the child's life, including teachers, parents, and family situation. Two papers are focused on the role of parental and teacher support on the impact of executive function on school performance (Devine et al.; Vandenbroucke et al.) and the other two on the impact of family risk factors on the relationship between executive function and educational success (Berthelsen et al.; Welsh et al.). In a longitudinal study of 117 parent-child dyads in the United Kingdom, with children between the ages of 3 and 4 at baseline, Devine et al. find that three aspects of parental behavior (parental scaffolding, negative parent-child interactions, and the provision of informal learning opportunities) are unrelated to each other and all show unique contributions to children's early academic ability. Executive function mediates the relations of parental scaffolding and negative parent-child interaction with children's early academic ability. In contrast, parental provision of opportunities for learning in the home environment was directly related to children's academic abilities. In a Belgian study of the role of parent and teacher emotional support in promoting working memory performance by buffering the negative effect of social stress in 170 children in grades 1 and 2 (mean age = 7.6 years), Vandenbroucke et al. find that parents and teachers can have a substantial influence on children's working memory performance by offering adequate emotional support, confirming the idea that cognitive processes, such as working memory, do not merely depend on maturation but can also be supported or hindered by environmental factors.

Drawing on wave 1 (4–5 years old) and wave 6 (14–15 years old) from the Growing up in Australia: The Longitudinal Study of Australian Children (N = 4,983), Berthelsen et al. find that higher child behavior risk, lower socioeconomic position, and child behavior risk are associated with poorer executive function in adolescence. While the effects of the early ecological risk on the development of executive function are relatively small, they operate through children's early self-regulatory behaviors of attentional regulation and approaches to learning, at the beginning of the school years. In an initial study in the United States on the deleterious outcomes of a self-reported history of child maltreatment (including emotional and physical abuse and neglect, and sexual abuse) in relation to college academic outcomes in terms of GPA and self-reported adjustment among 64 students, Welsh et al. find that relatively "hot" executive function serves as a link between child maltreatment experiences and college achievement and adaptation.

## INTERVENTIONS AND THEIR IMPACT ON EXECUTIVE FUNCTION AND ACADEMIC OUTCOMES

The articles in the final section are focused on interventions and their impact on executive function and academic performance (Kamkar and Morton; Solomon et al.; Stein et al.; Zelazo et al.). In a study of 101 children in German kindergarten randomly assigned either to a coordinative intervention or to a control condition, Stein et al. find no effect of the intervention on executive function of children in a kindergarten setting. In a pre-test, post-test, follow-up randomized controlled trial in the United States of 218 preschool children (mean age = 4.75 years) from schools serving low-income families randomly assigned to Mindfulness + Reflection training, Literacy training, or Business as Usual (BAU) options delivered by trained teachers in 30 small-group sessions over 6 weeks, Zelazo et al. find that executive function improved in all groups, but the Mindfulness + Reflection group significantly outperformed the BAU group, which did not differ from the literacy group, at follow-up. In Canada, in a cluster-randomized controlled trial of 260 3- and 4-year-olds assigned to either the Tools of the Mind preschool curriculum designed to target self-regulation through imaginative play and self-regulatory language or Playing to Learn (another play-based program that does not target self-regulation specifically), Solomon et al. find no effect of curriculum on any of the outcome measures, although children with high levels of hyperactivity/inattention who received Tools instruction showed greater improvement in self-regulation. In a conceptual paper based on empirical evidence, Kamkar and Morton propose the CanDiD framework in which they highlight that dynamic and contextual influences on EF must be considered in relation to development and individual differences, and that these factors are relevant to remedial interventions and curriculum design.

## CONCLUSIONS AND FUTURE DIRECTIONS

This collection of papers highlights that executive function is pivotal for academic achievement. The link is already apparent at preschool age when executive function predicts emergent mathematics and literacy skills (Mulder et al.). In addition, later on in development, working memory, inhibition, and cognitive flexibility are all predictors of (disabilities in) mathematics, reading and spelling in primary and secondary education (Daucourt et al.; Dekker et al.; Ribner et al.; Von Suchodoletz et al.). The predictive value of executive function for academic achievement seems to be robust to controlling for measures such as socio-economic status, home language, and receptive vocabulary, as well as to national differences in schooling systems.

Although brain maturation is important for the development of executive function, especially in periods of rapid growth, this development is highly sensitive to influences from environmental factors (Anderson et al., 2008). Yet, researchers have only recently begun to focus on the impact of children's social environment on EF development (Hughes, 2011). The importance of both distal and proximal parent and family factors (e.g., parental scaffolding, negative parent-child interactions) as well as characteristics of the teacher-child interactions (e.g., emotional support) for executive function development and in turn academic achievement is stressed in several papers in this collection (Berthelsen et al.; Devine et al.; Vandenbroucke et al.; Welsh et al.). In line with the CanDiD framework (Kamkar and Morton), the findings from these studies suggest that context factors should be taken into account in remedial interventions and curriculum design.

As indicated by the current collection of intervention studies, executive function training programs (1) seem to evolve into broader intervention programs, which are generally implemented in the specific context where the executive function should be applied for the actions of interest (e.g., reading, spelling, mathematics) and (2) should be individually tailored to the needs of the particular child in order to deal with interindividual differences in executive function performance and development (Solomon et al.; Stein et al.; Zelazo et al.). By doing so, the central role of executive function in educational practice can be stimulated and optimized.

We thank the contributors for their thoughtful and provocative contributions to this Research Topic/eBook and hope that this collection will both add to the current literature and serve as a foundation for future empirical and applied work to better the academic outcomes of children worldwide.

## AUTHOR CONTRIBUTIONS

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

## REFERENCES

Anderson, V., Jacobs, R., and Anderson, P. J. (2008). Executive Functions and the Frontal Lobes: A Lifespan Perspective. New York, NY: Psychology Press.

Hughes, C. (2011). Changes and challenges in 20 years of research into the development of executive functions. Infant Child Dev. 20, 251–271. doi: 10.1002/icd.736

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Huizinga, Baeyens and Burack. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

## Early Executive Function at Age Two Predicts Emergent Mathematics and Literacy at Age Five

Hanna Mulder\*, Josje Verhagen†, Sanne H. G. Van der Ven, Pauline L. Slot and Paul P. M. Leseman

Department of Special Education: Cognitive and Motor Disabilities, Utrecht University, Utrecht, Netherlands

Previous work has shown that individual differences in executive function (EF) are predictive of academic skills in preschoolers, kindergartners, and older children. Across studies, EF is a stronger predictor of emergent mathematics than literacy. However, research on EF in children below age three is scarce, and it is currently unknown whether EF, as assessed in toddlerhood, predicts emergent academic skills a few years later. This longitudinal study investigates whether early EF, assessed at two years, predicts (emergent) academic skills, at five years. It examines, furthermore, whether early EF is a significantly stronger predictor of emergent mathematics than of emergent literacy, as has been found in previous work on older children. A sample of 552 children was assessed on various EF and EF-precursor tasks at two years. At age five, these children performed several emergent mathematics and literacy tasks. Structural Equation Modeling was used to investigate the relationships between early EF and academic skills, modeled as latent factors. Results showed that early EF at age two was a significant and relatively strong predictor of both emergent mathematics and literacy at age five, after controlling for receptive vocabulary, parental education, and home language. Predictive relations were significantly stronger for mathematics than literacy, but only when a verbal short-term memory measure was left out as an indicator to the latent early EF construct. These findings show that individual differences in emergent academic skills just prior to entry into the formal education system can be traced back to individual differences in early EF in toddlerhood. In addition, these results highlight the importance of task selection when assessing early EF as a predictor of later outcomes, and call for further studies to elucidate the mechanisms through which individual differences in early EF and precursors to EF come about.

Keywords: executive function, two-year-olds, mathematics, literacy, kindergartners

## INTRODUCTION

Individual differences in executive function (EF) in early childhood have often been shown to be predictive of later academic skills (Blair and Razza, 2007; McClelland et al., 2007; Bull et al., 2008; Clark et al., 2010; Geary et al., 2012). EF refers to a set of cognitive processes needed for goal-directed thought and behavior, and is typically considered to include working memory, inhibition, and shifting (Hughes, 1998; Miyake et al., 2000; Garon et al., 2008). There is now vast evidence that EF predicts mathematics (Bull and Scerif, 2001; St. Clair-Thompson and Gathercole, 2006; Brock et al., 2009; Lee et al., 2012; Van der Ven et al., 2012; Friso-van den Bos et al., 2013), and (early) literacy and reading (Adams and Snowling, 2001; Welsh et al., 2010; Engel de Abreu et al., 2014), both concurrently and over time. Across studies, relationships with EF are generally stronger for mathematics than for literacy and reading (Brock et al., 2009; Willoughby et al., 2012; Fitzpatrick et al., 2014; McClelland et al., 2014; but see Miller et al., 2013).

#### Edited by:

Mariëtte Huizinga, VU University Amsterdam, Netherlands

#### Reviewed by:

Wenke Möhring, University of Basel, Switzerland

Rebecca Merkley, University of Oxford, United Kingdom

> \*Correspondence: Hanna Mulder h.mulder2@uu.nl †Shared first author

#### Specialty section:

This article was submitted to Developmental Psychology, a section of the journal Frontiers in Psychology

Received: 19 April 2017 Accepted: 19 September 2017 Published: 12 October 2017

#### Citation:

Mulder H, Verhagen J, Van der Ven SHG, Slot PL and Leseman PPM (2017) Early Executive Function at Age Two Predicts Emergent Mathematics and Literacy at Age Five. Front. Psychol. 8:1706. doi: 10.3389/fpsyg.2017.01706

Most of the earlier work on the predictive value of EF for later academic performance has focused on kindergartners and school-aged children (Blair and Razza, 2007; Mazzocco and Kover, 2007; Bull et al., 2008; Best et al., 2011; Toll et al., 2011; Willoughby et al., 2012). Research on EF in children below age three is relatively scarce. EF typically develops rapidly at this young age (Garon et al., 2013), which might make EF a valuable target for early identification of at-risk children and subsequent interventions. However, the rapid development of EF may imply that EF should not be assessed too early, as the construct might then be unstable.

In the present study, we investigate to what degree individual differences in EF predict later (emergent) academic skills, when EF is assessed at a very young age, that is, in two-year-old children. Recent advances in assessment methods of EF in infants and toddlers enabled us to study EF in such young children, and consequently, begin to explore the predictive value of EF in the first years of life for later (academic) outcomes (Garon et al., 2008, 2013; Mulder et al., 2014; Hendry et al., 2016).

Major advances in assessment methods of EF in very young children have occurred in at least two ways over the past decade. First, an increasing number of EF tests have been designed for children this young (e.g., Hughes and Ensor, 2005; Garon et al., 2008, 2013; Willoughby et al., 2010; Mulder et al., 2014). These tasks are often brief to administer, to make them suitable for infants and toddlers, whose attention spans are relatively short, and have simple instructions, sometimes accompanied by gestures, to reduce the influence of language skills on task performance. Second, there is increasing awareness amongst researchers that the most reliable measure of EF can be obtained by working with a battery of EF tasks and latent factor modeling, rather than using single task scores (Willoughby et al., 2010; see also Bull and Lee, 2014 for a similar discussion regarding the assessment of EF in older children). Scores on single EF tasks are likely to be strongly confounded with individual differences in motor and language skills, and subject to high measurement error in young children. Such influences are reduced when working with latent factors, particularly if motor and language demands vary between tasks. In support of this, Willoughby et al. (2010) showed that correlations between EF, IQ and ratings of ADHD symptoms were much stronger when working with a latent EF factor compared to working with separate EF task scores in three-year-olds.
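The measurement-error argument can be illustrated with a small simulation. The sketch below is not the analysis used in any of the cited studies: it replaces latent factor modeling with a simple mean composite across tasks, and every number in it (sample size, number of tasks, noise level, effect size) is a hypothetical choice made only for illustration. It shows that when several noisy task scores share one underlying ability, their composite correlates more strongly with an outcome than a single task score does.

```python
import random
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def simulate(n_children=2000, n_tasks=5, noise_sd=1.5, seed=1):
    """Compare single-task and composite correlations with an outcome.

    All parameter values are hypothetical and chosen for illustration only.
    """
    rng = random.Random(seed)
    # Hypothetical "true" EF ability per child (standard normal).
    ability = [rng.gauss(0, 1) for _ in range(n_children)]
    # Outcome depends on true ability plus unrelated variance.
    outcome = [0.6 * a + rng.gauss(0, 0.8) for a in ability]
    # Each task score is true ability plus independent task-specific noise.
    tasks = [[a + rng.gauss(0, noise_sd) for a in ability]
             for _ in range(n_tasks)]
    # Mean composite across tasks: a crude stand-in for a latent factor.
    composite = [mean(scores) for scores in zip(*tasks)]
    single_r = mean(pearson(t, outcome) for t in tasks)
    composite_r = pearson(composite, outcome)
    return single_r, composite_r


single_r, composite_r = simulate()
print(f"average single-task correlation: {single_r:.2f}")
print(f"composite correlation:           {composite_r:.2f}")
```

Averaging cancels part of the task-specific noise, so the composite is a more reliable indicator of the shared ability. A latent factor model goes further by estimating task loadings and error variances rather than weighting all tasks equally.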

## Factor Structure of EF in Early Childhood

Following the seminal work by Miyake et al. (2000), a tripartite distinction in EF is usually made, according to which EF involves three cognitive functions: (i) working memory, or the ability to update information which is stored in memory, (ii) inhibition, or the ability to suppress automatized or predominant responses, and (iii) shifting, or the ability to switch between cognitive sets or tasks. In a recent update of their model, Miyake and Friedman (2012) included a common EF factor, representing shared variance across all EF tasks, and additional specific shifting and working memory factors. In this more recent model, the factor previously labeled inhibition is replaced with the common EF factor.

Studies on the latent factor structure of EF in young children show mixed results, which are likely at least in part due to interstudy variability in the EF measures used across studies (see also Miller et al., 2012). However, a general finding is that EF becomes increasingly differentiated with age. Specifically, in school-aged children, two- or three-factor models of EF, including working memory, inhibition, and/or shifting factors, are often reported (e.g., Huizinga et al., 2006; Lee et al., 2013; Van der Ven et al., 2013). For children below age four, most studies find that different tasks assumed to assess different EF processes typically load onto one single latent EF factor (Wiebe et al., 2008, 2011; Willoughby et al., 2010, but see Hughes, 1998; Miller et al., 2012).

The idea that EF becomes increasingly differentiated with age receives support from studies in which the same EF battery was administered to children of a broad age range. Three such studies have shown that a single latent EF factor fitted the data best up until middle childhood, while multiple latent factors proved a better fit in early adolescence (Shing et al., 2010; Lee et al., 2013; Xu et al., 2013). Thus, notwithstanding mixed findings in earlier work on the factor structure of EF, a relatively robust finding across studies is that EF constitutes one single factor in early childhood, and becomes increasingly differentiated with age.

## EF and (Emergent) Academic Skills in Preschoolers and Kindergartners

A wealth of studies on the relationship between EF and emergent academic skills in preschoolers, kindergartners, and older children has shown that EF significantly relates to both mathematics and literacy skills (e.g., Alexander et al., 1993; Bull and Scerif, 2001; Blair and Razza, 2007; McClelland et al., 2007, 2014; Clark et al., 2010, 2013, 2014; Welsh et al., 2010; Roebers et al., 2012; Shaul and Schwartz, 2014; Bryce et al., 2015). For example, Welsh et al. (2010) investigated whether a composite EF measure at the beginning of preschool (age 4.5 years) predicted growth in literacy and mathematics from beginning to end of preschool in children from low-income families. Indeed, EF significantly predicted growth in both literacy and mathematics over this period, after controlling for individual differences in language ability. Blair and Razza (2007) found that inhibitory control was related to both mathematics and literacy (phonemic awareness and letter knowledge) in kindergarten. Moreover, inhibitory control assessed in preschool predicted mathematics but not literacy in kindergarten, over and above the contribution of inhibitory control in kindergarten. Finally, a meta-analysis by Duncan et al. (2007) highlighted the importance of attention skills in predicting academic achievement even after controlling for children's prior academic skills (see Pagani et al., 2010 for similar results). Across studies, the finding that EF predicts academic skills in early childhood appears to be robust.

Two explanations of the associations between EF and academic skills have been proposed (cf. Welsh et al., 2010; Stevens and Bavelier, 2012), which are not necessarily mutually exclusive. First, it has been assumed that EF is directly required for performing academic tasks; that is, there is task-specific involvement of EF (cf. Blair and Razza, 2007; Bull et al., 2008; Brock et al., 2009). For example, solving mathematical problems likely depends for a substantial part on working memory, in particular, on the retrieval and storage of partial results and processing of information while it is stored (Dehaene, 1997; Cragg and Gilmore, 2014). Hence, children with lower working memory skills may not be able to store and update intermediate results, while working on other parts of a math problem. Similarly, selective attention, an important aspect of EF in early childhood (Garon et al., 2008), has been considered a prerequisite for developing academic skills, as it involves selectively focusing attention on stimuli, such as isolating phonemes from words or focusing on important steps in mathematical problems (for a review, see Stevens and Bavelier, 2012). A second explanation of the relationship between EF and academic skills holds that EF impacts on children's academic achievement indirectly; that is, general involvement of EF is required in (classroom) learning. More specifically, the idea is that well-developed EF skills facilitate behavioral regulation and learning-related behaviors which, in turn, are needed for optimal learning in the classroom. High EF abilities would facilitate children's ability to pay attention to the teacher's instruction and could contribute to children's on-task and goal-directed behavior (Gathercole, 2008; Fitzpatrick and Pagani, 2012), thus allowing them to profit maximally from learning activities (Alexander et al., 1993; Howse et al., 2003; Duncan et al., 2007). In support of this, Nesbitt et al. (2015) found that four-year-olds with higher performance on EF tasks were less frequently disengaged and disruptive, and showed more active participation in the classroom. These behaviors, in turn, were significantly related to children's emergent academic skills.

A common finding in earlier studies on preschoolers and kindergartners is that EF predicts mathematics more strongly than literacy (e.g., Blair and Razza, 2007; Brock et al., 2009; Willoughby et al., 2012; Fitzpatrick et al., 2014; McClelland et al., 2014, but see Miller et al., 2013). Willoughby et al. (2012), for example, found that a latent EF factor was a strong predictor of a latent academic achievement factor in a large sample of five-year-olds from predominantly low socioeconomic status backgrounds, but significantly more strongly so for mathematics than literacy. Brock et al. (2009) showed that EF predicted mathematics in kindergarten, even after controlling for earlier mathematics scores and general intelligence. In contrast, only earlier reading scores and general intelligence predicted reading scores in kindergarten, and EF did not. Moreover, Fitzpatrick et al. (2014) showed that differences in EF were significantly concurrently related to emergent mathematics and literacy in preschoolers, even after controlling for processing speed and general intelligence. Yet, when controlling for vocabulary, the association with early literacy (i.e., letter-word identification) was no longer significant. McClelland et al. (2014) showed that growth in EF across four measurement waves from prekindergarten to kindergarten predicted growth in mathematics, but not literacy. However, Miller et al. (2013) observed no differential relations between EF and mathematics and literacy in a sample of three- to five-year-olds. In this study, working memory was a unique predictor of mathematics and literacy scores over and above age, inhibition, vocabulary, and social understanding. Thus, with some exceptions, a common finding in earlier early childhood studies is that EF is related to mathematics more strongly than to literacy.

Blair and Razza (2007) proposed that differences in the strength of the relationships between EF and the two academic domains may be due to the differential nature of these domains. In particular, the ability to solve mathematics problems never becomes fully automatized as children grow older, as children need to consider which strategy or rule is most appropriate for each problem, placing relatively strong demands on EF. Solving mathematics problems, or even simple arithmetic tasks, requires one to keep the teacher's instructions in mind, select a strategy and shift between strategies when necessary, remember the outcome of intermediate computational steps, and ignore distraction (Van der Ven et al., 2012; Bull and Lee, 2014). Just like any learning task, literacy tasks also require one to keep the teacher's instructions in mind and ignore distraction, but these tasks draw less strongly on strategy selection and switching between strategies. Indeed, literacy skills, such as phonemic awareness and letter knowledge, become increasingly automatized, and thus less effortful, as children grow older (cf. Blair and Razza, 2007). At earlier stages, however, EF may be involved in the integration of auditory and visual information and in the automatic retrieval of linguistic information from memory while recognizing sounds and letters (Altemeier et al., 2008). Manipulating speech sounds, as in phonemic awareness tasks, relies at least in part on the ability to selectively attend to speech sounds (cf. Stevens and Bavelier, 2012) and to manipulate verbal information while it is stored, such as in sound categorization tasks in which children listen to someone naming three or four pictures (e.g., ball, phone, and bath) and are asked to identify which word does not begin with the same sound as the other two words (Oakhill and Kyle, 2000).

In sum, there is ample evidence that EF is related to academic performance from approximately age three onward. Far less is known about these relations in younger children. To the best of our knowledge, only three studies investigating the predictive effects of EF on later academic performance have included children under age three. In the first study, Fitzpatrick and Pagani (2012) found that working memory performance, averaged across assessments at toddler (29 months) and preschool (41 months) age, predicted number knowledge and receptive vocabulary at age six. The reason for averaging scores across assessments was to reduce the influence of measurement error. In the second study, Merz et al. (2014) found that a broad composite measure of EF in a group of two- to four-year-old children predicted emergent mathematics and literacy a year later, even after controlling for initial performance in these domains. However, the mean assessment age in this study was three years. As such, neither of the studies by Fitzpatrick and Pagani (2012) and Merz et al. (2014) provides insight into the predictive value of EF at toddler age for later academic skills. In a recent study, using data from the same cohort as reported here, we showed that a latent EF factor at two years predicted children's performance on a latent pre-academic factor one year later (Mulder et al., 2014). This pre-academic factor at three years consisted of early math skills (i.e., a composite score of items assessing number sense, measurement, and geometry, taken from a standardized early math test for toddlers, Op den Kamp and Keuning, 2011) and receptive vocabulary. However, in this study, no distinction was made between emergent mathematics and literacy, and the interval between the two study waves was relatively short. Thus, on the basis of earlier work, it is as yet an open question whether EF in children as young as 2 years of age predicts emergent mathematics and literacy in kindergarten, which, in turn, are predictive of academic performance across elementary school (e.g., Magnuson and Duncan, 2016).

## The Current Study

In the current study, we investigated whether the patterns of relations between EF, literacy, and mathematics found in older children can also be found at a younger age than previously investigated. Specifically, our first aim was to investigate whether individual differences in EF in children as young as two years are predictive of emergent mathematics and literacy at age five years. Our second aim was to examine whether EF is a significantly stronger predictor of emergent mathematics than of emergent literacy.

Data from a large longitudinal cohort study were used. In order to reach children from diverse family backgrounds in this study, EF assessments were administered in the field (i.e., in preschool, daycare, or at home) rather than in a lab setting. Given a lack of EF measurement instruments that could be used outside of the lab at the onset of the study, a new battery of EF tasks was developed for field-based administration. This battery has previously been validated for use with two-year-olds (Mulder et al., 2014), and includes a measure of working memory, as well as measures of verbal<sup>1</sup> and visuospatial short-term memory and selective attention. The latter three are not typically used as indicators of EF in studies of older children, but these are important precursor skills of more complex EF in early childhood (Garon et al., 2008; Hendry et al., 2016).

In order to reduce the influence of measurement error in our assessment of EF, we adopted a latent factor approach (Willoughby et al., 2010, 2012). As our measures assessed precursor skills to more complex EF (i.e., short-term memory and selective attention) as well as a more conventional EF measure (i.e., working memory), we labeled the latent construct 'early EF,' for consistency with the early childhood literature (e.g., Hendry et al., 2016) and to differentiate from studies on EF in older children, which typically include only measures of more complex EF components (i.e., shifting, inhibition, and working memory). Finally, like several other studies on relationships between EF and academic skills (Welsh et al., 2010; Gray et al., 2014), we controlled for children's receptive vocabulary skills, to rule out that the relationships found between EF and academic skills were due to shared variance with vocabulary skills.

## MATERIALS AND METHODS

## Participants

Data were analyzed from 552 preschool children who were selected from a larger sample participating in the pre-COOL study, a longitudinal study on preschoolers' cognitive and linguistic development in the Netherlands (see Mulder et al., 2014; Slot et al., 2015, 2017; Verhagen et al., 2017). In pre-COOL, over 3000 children were enrolled. The first and second study waves took place when children were aged two and three years, respectively. These children had been recruited through preschool and daycare centers as well as municipality records (for more details, see Mulder et al., 2014). A sub-group of 751 participants subsequently enrolled in the so-called "core cohort" in kindergarten (study waves three and four, at ages four and five years, respectively).<sup>2</sup> For the current study, we included children who had enrolled in the pre-COOL study at wave one and had entered the core cohort in kindergarten. Out of all 751 children in the core cohort, 149 had entered the study only at the second wave (age three) due to later enrollment in preschool, and were excluded. A further 50 (8%) of the remaining 602 children were excluded because they were either older than 36 months (n = 4) or younger than 24 months (n = 13) at wave one, or because age information was missing (n = 33).

The final sample of 552 children included 236 boys [47%, n = 44 (8%) gender unknown to the researchers]. Mean age was 29 months at the first study wave (SD = 3, range 24–36) and 70 months at the final wave (SD = 2, range 64–77). Parents reported on their educational level in questionnaires. If this information was not available, school registry information was used where available. Parental educational level was assessed on a 4-point scale ranging from (1) 'primary school,' to (2) 'lower vocational training,' (3) 'secondary school and/or vocational training,' and (4) 'higher education (i.e., college or university degree)', and averaged for both parents. Mean parental educational level was available for n = 439 children, with a mean score of 3.10 (SD = 0.80, range 1–4, n = 35 of 439 (8%) had a mean educational level of 1–1.5; n = 67 (15%) had 2–2.5; n = 231 (53%) had 3–3.5, n = 106 (24%) had 4; n = 113 of 552 (21%) missing). Home language was also measured in parent questionnaires. Specifically, parents indicated whether their children were only exposed to Dutch at home, or (also) to another language or multiple other languages. If questionnaire data were missing, research assistants' reports were used. Research assistants were instructed to ask parents and/or teachers at preschool or daycare about the child's home language background (see also Mulder et al., 2014). The majority of the children (n = 363 / 73%, 52 missing) were from monolingual Dutch homes. The remaining children (n = 139) were from families in which one or more languages other than Dutch instead of or next to Dutch were spoken.

<sup>1</sup>The verbal short-term memory task was not included in our original psychometric investigation (Mulder et al., 2014). We included this task in the current study to obtain a more balanced mix of early EF tasks, with both verbal and non-verbal measures. To investigate the effect of the inclusion of this task, we conducted our analyses with and without this indicator of the early EF construct.

<sup>2</sup>The main criteria for inclusion in the core cohort were the following: (i) children had obtained a test score on the vocabulary and attention tasks, as well as on at least two other tasks of the language and executive function test battery at ages two and/or three years, and (ii) contact information about children's schools was available. The rationale behind these criteria was to obtain a dataset for the children in the core cohort that was as complete as possible in order to be able to address the main question guiding the pre-COOL study.

## Materials

At age two, children were administered a series of tasks assessing EF and precursors to EF (from here on referred to as measures of 'early EF' for brevity), and language skills. At age five, they were administered tasks assessing EF and language as well as tasks assessing emergent mathematics and literacy skills. For the current study, data collected with the early EF tasks at age two and the mathematics and literacy tasks at age five were used. In our analyses, receptive vocabulary assessed at age two was used as a control variable. One mathematics task, which was administered at the final wave and assessed children's knowledge of numbers between 1 and 10, was not included in the analysis because of ceiling performance (see Kolkman et al., 2013 for the same finding with this task in five-year-olds). Regarding early EF, an inhibition task that was included at the first study wave was dropped from the battery after a few hundred children had been tested because it turned out to be too difficult (see Mulder et al., 2014), and thus was not included in the current study either. All computerized tasks were programmed in E-Prime 2.0 (Schneider et al., 2002).

### Control Measure: Receptive Vocabulary at Two Years

At the first study wave, receptive vocabulary was assessed with a shortened version of the Dutch Peabody Picture Vocabulary Test (PPVT-III-NL, Dunn and Dunn, 2005). In this test, children choose one out of four pictures after an orally presented word. To reduce fatigue, an adapted version was used in which a fixed number of 24 items were presented to all children. Moreover, a laptop was used rather than a test booklet, to facilitate administration and scoring (see Verhagen et al., 2017). Scores were computed as the percentage of correct responses out of the total number of responses for children who responded to at least half of the items of the task (to avoid calculating scores on the basis of few responses). A total of n = 527 (95.5%) children obtained a score on the task (n = 18 did not do the task at all; n = 7 responded to 1–11 items and their data were excluded). The task showed good internal consistency (α = 0.88).
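The scoring rule described above (percentage correct out of all responses, computed only for children who responded to at least half of the items) can be sketched as follows. This is a minimal illustration, not the authors' scoring script; the function name `percent_correct` and the encoding of a missing response as `None` are assumptions made for the example.

```python
from typing import Optional, Sequence


def percent_correct(
    responses: Sequence[Optional[bool]], min_fraction: float = 0.5
) -> Optional[float]:
    """Percentage of correct responses out of all given responses.

    `responses` holds one entry per administered item: True (correct),
    False (incorrect), or None (no response). Hypothetical encoding.
    Returns None (score set to missing) when the child responded to
    fewer than `min_fraction` of the items.
    """
    answered = [r for r in responses if r is not None]
    if len(answered) < min_fraction * len(responses):
        return None
    # True counts as 1 and False as 0 when summed.
    return 100.0 * sum(answered) / len(answered)


# A child who answered 12 of 24 items, 9 of them correctly, gets a score.
child = [True] * 9 + [False] * 3 + [None] * 12
print(percent_correct(child))
```

With 24 items, a child who responded to 11 items or fewer receives no score, which matches the exclusion of children who responded to 1–11 items.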

### Early Executive Function Tasks at Two Years

#### Selective Attention

Selective attention was assessed with a visual search task administered on a laptop (Mulder et al., 2014). In this task, children were requested to search for targets (elephants) amongst a display of distractors that were similar in color and size (bears and donkeys). The assessor encouraged the child to search as quickly as possible throughout the task, and provided continuous feedback so that children did not have to remember the rules of the task. That is, if the child pointed to a target, the assessor said: "Well done! Can you find another elephant?". If the child pointed to one of the distractors, the assessor said: "No, where is an elephant? Try to find the elephants quickly!". Children were given three practice items, followed by three test items. Each test item consisted of a structured 6 × 8 grid, including eight targets and 40 distractors. Children were allowed to search each display of targets and distractors for 40 s. Item scores were set to missing in cases where the child did not look at the screen at all during these 40 s, according to assessor report (item 1: n = 5; item 2: n = 4; item 3: n = 14). The task score was computed as the average number of identified targets across valid test items for children who responded to at least two items (n = 24 children responded to none or only one item and their data were not included). Scores of children who did not find any targets across all test items were set to missing, as we cannot be certain that these children understood the task rules (n = 14 children). A total of n = 514 (93.1%) children in the current sample obtained a score on the task. The task had good internal consistency (α = 0.86).
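The three scoring rules for the visual search task (invalid items set to missing, a minimum of two valid items, and a missing score when no targets were found at all) can be combined in one small function. The sketch below is illustrative only; the name `visual_search_score` and the use of `None` for an invalid item are assumptions, not part of the original task software.

```python
from typing import Optional, Sequence


def visual_search_score(
    targets_found: Sequence[Optional[int]], min_valid_items: int = 2
) -> Optional[float]:
    """Average number of identified targets across valid test items.

    `targets_found` has one entry per test item: the number of targets
    found, or None when the item was invalid (e.g., the child did not
    look at the screen). Hypothetical encoding.
    """
    valid = [t for t in targets_found if t is not None]
    if len(valid) < min_valid_items:
        return None  # responded to none or only one item
    if sum(valid) == 0:
        return None  # no targets found: task understanding is uncertain
    return sum(valid) / len(valid)


print(visual_search_score([5, None, 7]))  # two valid items are averaged
```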

#### Visuospatial Working Memory

An adapted version of the Six Boxes task from Diamond et al. (1997) was used to assess visuospatial working memory (see Mulder et al., 2014). In this task, children were shown how six different wooden toys were hidden in six identical white boxes with blue lids. Children were then allowed to search for the toys, by emptying the boxes one at a time. In between search attempts, children were distracted by the assessor for 6 s. As such, after each search attempt, children had to update their memory of which boxes they had already emptied and which boxes still contained a toy, and hold this information in memory over a delay. Following a brief instruction and practice phase, children were allowed six search attempts. Task scores were computed as percentages correct for those children who had searched on all trials.<sup>3</sup> A total of n = 479 (86.8%) children obtained a score on the task (n = 30 children did not do the task at all; n = 43 searched on 1 to 5 trials and their data were excluded).

#### Visuospatial Short-Term Memory

The visuospatial short-term memory task was based on previous work by Pelphrey et al. (2004) and Vicari et al. (2004) and adapted for the current study (see Mulder et al., 2014). In this task, children were shown how a toy was hidden in one of six identical boxes and asked to search for the toy after a 1-s delay. The task was administered in an adaptive fashion, so that children who passed the one-location item were given a more difficult item in which two toys were hidden in two different locations, etcetera. In the most difficult item, four toys were hidden in four locations. The task score was the highest difficulty level that a child had passed (i.e., number of locations that a child could recall), and could range from zero to four. A total of n = 457 (82.8%) children obtained a score on the task.<sup>4</sup>

<sup>3</sup>This relatively stringent criterion was chosen because search attempts were interdependent: that is, each next attempt was more difficult than the previous (successful) attempt, because children had to keep an increasing number of empty boxes in mind as they progressed through the task.

#### Verbal Short-Term Memory


Verbal short-term memory was assessed with a non-word repetition task (Verhagen et al., 2017). This task contained 2 practice items and 12 test items, half of which were monosyllabic and the other half bisyllabic. The items had been prerecorded in a high-pitch child-friendly voice from a Dutch native speaker and they were presented to the children over headphones. For each test item, children were presented with a short video clip showing a picture of a novel object that appeared out of a drawing of a box. At the same time, they heard a prerecorded sentence that encouraged them to repeat the non-word: "Look, a [keupun]! Say [keupun]!" Children's repetition attempts were scored online by the assessors as correct, incorrect, or 'unknown' (<2% of all responses). Task scores were computed as the percentage of correct responses out of all responses for children who responded to at least half of the items of the task. A total of n = 414 (75.0%) children obtained a score on the task (n = 83 did not do the task at all; n = 55 responded to 1–5 items and their data were excluded). The task showed good internal consistency (α = 0.86).

### Emergent Mathematics at Five Years

#### Number Knowledge (1–100)

A number naming task was used to assess number knowledge (Kolkman et al., 2013). In this task, children were presented with written numerals on a laptop screen and asked to name each numeral. The numerals presented were in the 1–100 range. The task contained five test items (i.e., 12, 30, 54, 70, and 97). Scores were computed as the percentage of correct responses for each child. A total of n = 514 (93.1%) children obtained a score on the task, and all children had responded to all items. The test had good internal consistency (α = 0.81).

#### Number Estimation (1–10)

To assess children's number estimation skills, a number line task was presented in which children were asked to estimate the position of a given number (in the range from 1 to 10) on a horizontal line (Siegler and Opfer, 2003, current task adapted from Kolkman, 2013, Unpublished). This line was presented on a laptop screen, with '1' and '10' on either side. Prior to the task, children were shown the positions of both extremes, as indicated by '1' and '10.' They were then asked to locate the position of a given number on the line. The task contained six test items. Linear fit scores were obtained by calculating the squared correlation between children's responses and the values corresponding to the location of the numbers on the number line (Geary et al., 2008). Linear fit scores have been shown to be a valid measure of number mapping in young children (Friso-van den Bos et al., 2014). A total of n = 515 (93.3%) children obtained a score on the task, all of whom had responded to at least five items.
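A linear fit score of this kind is the squared Pearson correlation between the numbers presented and the positions a child marks. The sketch below shows one way to compute it; the function name `linear_fit_score`, the item values, and the response values are all hypothetical, since the actual test items are not listed in the text.

```python
from typing import Optional, Sequence


def linear_fit_score(
    targets: Sequence[float], estimates: Sequence[float]
) -> Optional[float]:
    """Squared Pearson correlation (R^2) between the presented numbers
    and the positions the child marked on the number line.

    Returns None when either series has no variance, because the
    correlation is undefined in that case.
    """
    n = len(targets)
    if n != len(estimates) or n < 2:
        raise ValueError("need two equally long series with at least 2 items")
    mean_t = sum(targets) / n
    mean_e = sum(estimates) / n
    s_te = sum((t - mean_t) * (e - mean_e) for t, e in zip(targets, estimates))
    s_tt = sum((t - mean_t) ** 2 for t in targets)
    s_ee = sum((e - mean_e) ** 2 for e in estimates)
    if s_tt == 0 or s_ee == 0:
        return None
    return (s_te * s_te) / (s_tt * s_ee)


# Hypothetical items and one child's estimated positions on a 1-10 line.
items = [2, 3, 5, 6, 8, 9]
marked = [1.5, 3.5, 4.0, 7.0, 7.5, 9.5]
print(linear_fit_score(items, marked))
```

Because the correlation is squared, the score reflects how linear the mapping is, not whether the estimates are accurate in absolute terms: a child whose estimates are consistently compressed or shifted can still obtain a high score.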

#### Number Estimation (1–100)

To investigate children's number estimation of higher numbers, a number line task was presented which was the same as the previous one, except that numbers between '1' and '100' were presented (adapted from Kolkman, 2013, Unpublished). This task contained six test items. As in the number line 1–10 task, linear fit scores were computed. A total of n = 513 (92.9%) children obtained a score on the task, and all had responded to at least five items.

#### Cito Mathematics

Mathematical abilities were measured with the criterion-based Cito Mathematics Test for Kindergarteners (Janssen et al., 2005, 2010). Cito Mathematics tests are part of the student achievement monitoring system used in most Dutch primary schools. The earliest Cito assessments take place in the kindergarten departments of primary school, at ages four and five. The version used in this paper was administered mid-year 2 of kindergarten and contained 54 items that were administered on two separate days. Three main domains were covered by the test: (a) number knowledge (e.g., recognizing numbers), (b) measurement (e.g., weight and length), and (c) geometry (e.g., shapes and figures). Raw scores were converted into Rasch-based ability scores (Janssen et al., 2005) that can be directly compared across kindergarten and the primary school period. Scores were available for n = 419 children (75.9%). The reliability coefficient of the version used mid-year 2 is 0.87 (Koerhuis and Keuning, 2011).

### Emergent Literacy at Five Years

#### Letter Knowledge

Letter knowledge was assessed with a shortened version of the letter recognition task used in De Jong (2007). In this task, children were presented with a laminated sheet of paper on which eight letters in lowercase were presented. The assessor then provided children with a certain letter (pronounced phonetically) and asked children to point to the correct letter on the sheet. Scores were computed as the percentage of letters identified correctly out of all responses. A total of n = 499 (90.4%) children obtained a score on the task (n = 35 did not do the task at all; n = 18 responded to 1–4 items and their data were excluded). Internal consistency of the task was sufficient (α = 0.79).

#### Phonemic Awareness

A first sound awareness task was used to assess phonemic awareness, which was taken from De Jong (2007). In this task, children were presented with an array of four pictures presented on a laptop screen. One of these pictures was marked as the target picture. The child's task was to identify the first sound of the word describing the target picture and determine which of the three other pictures displayed a word starting with the same first sound. Children were presented with a relatively long instruction that became shorter after the first four test items. Specifically, for each of the first four items of the task, the assessor named the target picture (e.g., ball) and told children the first sound of this word (/b/). The assessor then also named the three other pictures (e.g., bear, doll, and phone) and asked the child to indicate which picture was labeled with a word starting with the same first sound as the label of the target picture (in this case: bear). From the fifth item onward, the assessor no longer named the first sound of the word describing the target picture, but asked directly which of the three other pictures represented a word starting with the same sound. The task contained two practice items and 12 test items. Scores were calculated as the percentage of correct responses out of all responses for children who responded to at least half of the task. A total of n = 497 (90.0%) children obtained a score on the task (n = 38 children did not do the task at all; n = 17 children responded to 1–5 items and their data were excluded). Internal consistency of the task was good (α = 0.84).

<sup>4</sup>The relatively high number of missing scores on this task was due to the fact that we had to shorten the test after the first few months of data collection, to reduce overall testing time, and data of the children tested on the initial version of the task could not be included (see also Mulder et al., 2014).

#### Cito Language and Literacy


General language and literacy skills were measured by the Cito Language Test for Kindergartners (Lansink, 2009). This is a standardized national test that, like the Cito Mathematics test described above, is part of the student achievement monitoring system commonly used in primary schools in the Netherlands. The test administered at mid-year 2 in kindergarten contained 60 items, which were administered on two separate days. The items covered two broad domains: (a) conceptual awareness and (b) language awareness. The conceptual domain was assessed with items testing children's receptive vocabulary and listening skills. The emergent literacy domain was assessed with items testing children's sound and rhyme awareness, hearing first and last words in sentences, auditory synthesis, and letter recognition. Scores were available for n = 414 children (75.0%). Raw scores were converted into Rasch-based ability scores. Previous studies show good internal consistency of the test (α = 0.89, Lansink and Hemker, 2012).

## Procedure

Tasks were administered by trained research assistants in a quiet room at children's homes or preschools/daycare (age two years) or at their schools (age five years). At age two, the tasks were intermixed with four other tasks not reported in the current paper<sup>5</sup>, and presented in the following fixed order: receptive vocabulary, selective attention, verbal short-term memory, visuospatial short-term memory, visuospatial working memory. At age five, tasks were intermixed with nine other tasks<sup>6</sup>, and presented in the following order: letter knowledge, number naming and number line tasks, and phonemic awareness. Both sessions lasted about 45 min. To make sure that assistants adhered to the standardized procedures of each task, they had undergone an intensive training prior to data collection, which involved a full-day training, administration of a video recording of a session with a child of the relevant age, and elaborate feedback reports and discussion (for further details, see Mulder et al., 2014). The Cito tests were administered by children's teachers, following a standardized protocol.

## Analyses

Since children differed substantially in age at the time of data collection, and development is rapid at this age, all early EF, literacy and mathematics measures were corrected for age. However, especially at age two years, children from higher SES backgrounds were tested at a younger age than children with lower SES backgrounds, due to the sampling procedures used in pre-COOL. Since SES was also positively related to most cognitive measures (early EF, mathematics, and literacy), this confound entailed that merely taking age-residualized scores would give an underestimate of the true effect of age (a suppressor effect). In order to counter this suppressor effect, regression analyses were run for each variable with both age at the time of that particular measure and parental education as predictors, and residual scores were determined based on only the age coefficient in this analysis (as correcting for SES would yield the undesired effect of correcting for a source of genuine variation in early EF). The correlations between the original variables and the corrected variables ranged from r = 0.895 to r = 0.994 (mean r = 0.980).
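The age correction described above can be illustrated with a small sketch: the age slope is estimated in a regression that also contains parental education, and only that age slope is then removed from the raw scores. This is an illustration under stated assumptions, not the authors' script: the function names are hypothetical, age is centered at the sample mean so that corrected scores stay on the original scale (the text does not specify this), and complete data are assumed.

```python
from typing import List, Sequence


def age_slope_adjusted(
    y: Sequence[float], age: Sequence[float], ses: Sequence[float]
) -> float:
    """Age coefficient from an OLS regression of y on age and SES,
    solved in closed form from centered sums of squares and products."""
    n = len(y)
    m_y, m_a, m_s = sum(y) / n, sum(age) / n, sum(ses) / n
    yc = [v - m_y for v in y]
    ac = [v - m_a for v in age]
    sc = [v - m_s for v in ses]
    s_aa = sum(a * a for a in ac)
    s_ss = sum(s * s for s in sc)
    s_as = sum(a * s for a, s in zip(ac, sc))
    s_ay = sum(a * v for a, v in zip(ac, yc))
    s_sy = sum(s * v for s, v in zip(sc, yc))
    det = s_aa * s_ss - s_as * s_as
    if det == 0:
        raise ValueError("age and SES are perfectly collinear or constant")
    return (s_ss * s_ay - s_as * s_sy) / det


def age_corrected(
    y: Sequence[float], age: Sequence[float], ses: Sequence[float]
) -> List[float]:
    """Remove only the age effect; variance related to SES is kept."""
    b_age = age_slope_adjusted(y, age, ses)
    m_a = sum(age) / len(age)
    return [v - b_age * (a - m_a) for v, a in zip(y, age)]


# Hypothetical data: higher-SES children are tested at a younger age.
age = [24, 26, 28, 30, 32, 34]
ses = [4, 3, 4, 2, 3, 1]
score = [2 * a + 3 * s for a, s in zip(age, ses)]
print(age_slope_adjusted(score, age, ses))
```

In these made-up data, the slope from a regression on age alone would be smaller than the slope estimated alongside parental education, which is the suppressor pattern the correction is meant to counter.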

The analyses were run using the lavaan package of the statistical software R (Rosseel, 2012). In all analyses, full information maximum likelihood estimation with robust (Huber-White) standard errors was used to handle missing data. Model fit was evaluated on the basis of the following commonly used cut-off criteria: RMSEA < 0.05, CFI > 0.90, and SRMR < 0.08 (Hu and Bentler, 1999). The chi-square index was not used, since it is very sensitive to sample size and typically significant in large samples (Little, 2013; Brown, 2015).

The analyses were performed in three steps. First, a Confirmatory Factor Analysis (CFA) was performed to see whether the tasks assessing emergent mathematics and literacy indeed represented two separate latent factors. To this end, a two-factor model was fitted in which the mathematics tests loaded on one factor and the literacy tests on another factor. In this model, the Cito mathematics and Cito literacy tests were correlated, to deal with shared method-bound variance (Brown, 2015).

Second, for our main analysis concerning whether early EF at age two predicted emergent mathematics and literacy at age five, a Structural Equation Model (SEM) was fitted in which early EF, as a latent factor and based on all four tasks, predicted the two latent factors mathematics and literacy. In this model, the paths from early EF to mathematics and from early EF to literacy were freely estimated. Receptive vocabulary, home language and parental education were included as control variables.

Finally, to test our prediction that the effect of early EF on mathematics would be stronger than the effect on literacy, we fitted a second model in which the paths from early EF to mathematics and from early EF to literacy were constrained to be equal, rather than freely estimated. Fit of the model in which the paths were freely estimated and the model in which these paths were constrained was then compared through a chi-square difference test. If the outcome of this test was significant, the less constrained model was the preferred model; if non-significant, the more constrained (more parsimonious) model was the preferred model.

<sup>5</sup>These four additional tasks assessed children's phonological abilities, sentence comprehension, and delay of gratification. The order of all tasks was as follows: phonological processing, receptive vocabulary, selective attention, verbal short-term memory, sentence comprehension, delay of gratification (snack delay), visuospatial short-term memory, visuospatial working memory, and delay of gratification (gift delay).

<sup>6</sup>These additional tasks measured rapid automatized naming, receptive vocabulary, verbal short-term memory, visuospatial short-term memory, visuospatial working memory, sentence comprehension, inhibition, verbal working memory, and delay of gratification. The order of all tasks was as follows: rapid automatized naming, receptive vocabulary, selective attention, verbal short-term memory, visuospatial short-term memory, visuospatial working memory, sentence comprehension, letter knowledge, inhibition, number naming 1–10, number line 1–10, number naming 1–100, number line 1–100, verbal working memory, phonemic awareness, delay of gratification.

## RESULTS

## Descriptives and Correlations

Descriptive statistics and correlations for all tasks are provided in **Tables 1**, **2**, respectively.

## Confirmatory Factor Analyses

The outcomes of the CFA in which a two-factor model was estimated, containing an emergent mathematics and an emergent literacy factor, showed good data fit, RMSEA = 0.04, CFI = 0.99, SRMR = 0.03 (n = 530). The correlation between the latent factors 'mathematics' and 'literacy' in this model was 0.58 (p < 0.001), and the correlation between the error terms of the two Cito tests was 0.83. The model fitted significantly better than a model in which all tasks loaded on a single latent academic factor, Δχ<sup>2</sup>(1) = 29.14, p < 0.001 (model fit of the one-factor model: RMSEA = 0.07, CFI = 0.96, SRMR = 0.03).

## Relationships between Early EF at Two and Emergent Mathematics and Literacy at Five

The SEM model in which the latent factor early EF was modeled as a predictor of the latent factors emergent mathematics and literacy, with parental education, home language, and receptive vocabulary at age 2 controlled, fitted the data well, RMSEA = 0.05, CFI = 0.93, SRMR = 0.05 (n = 552). This model is depicted in **Figure 1**.

As can be seen in this figure, the model showed positive and significant relationships from early EF to emergent literacy and from early EF to emergent mathematics, after controlling for receptive vocabulary, parental education, and home language. These associations were positive and strong for both factors: β = 0.56, p < 0.001 for emergent literacy; β = 0.79, p < 0.001 for emergent mathematics.

To test whether the association between early EF and mathematics was stronger than the association between early EF and literacy, an alternative model was fitted in which the paths from early EF to literacy and from early EF to mathematics were constrained to be equal instead of freely estimated. Judging by its absolute fit indices, this alternative model fitted the data slightly less well than the previous, less constrained model, RMSEA = 0.05, CFI = 0.92, SRMR = 0.06. However, a chi-square difference test showed that the difference in fit between the two models approached significance but did not reach the 0.05 alpha level, Δχ<sup>2</sup>(1) = 3.14, p = 0.076. Thus, the more parsimonious model in which both paths were constrained to be equal was the preferred model. This indicates that the two emergent academic skills did not differ significantly in the strength of their relationship with early EF. The size of the two constrained relationships was estimated at B = 0.63 (β = 0.68 for mathematics and β = 0.67 for literacy, p < 0.001).

One possible reason why no clear difference in the strength of the associations was found is that, unlike some of the earlier studies described above (Brock et al., 2009; Willoughby et al., 2012; Fitzpatrick et al., 2014), our study included a verbal memory indicator of early EF (i.e., non-word repetition). Non-word repetition is known to be a strong predictor of children's later language and literacy skills, in particular, letter knowledge (De Jong and Olson, 2004) and reading (Gathercole et al., 1991; Kibby et al., 2014). This might have strengthened the relationship between early EF and literacy.

To examine this possibility, we ran an additional analysis in which we fitted a model that was the same as the model depicted in **Figure 1**, except that non-word repetition was removed as an indicator of early EF, such that only non-verbal measures of early EF remained. The resulting model, which is presented in **Figure 2**, showed good data fit, RMSEA = 0.05, CFI = 0.93, SRMR = 0.05.

As shown in **Figure 2**, the association between early EF and emergent literacy in this adapted model was weaker than in the previous model, albeit still significant: β = 0.42, p = 0.010 rather than β = 0.56, p < 0.001. The association between early EF and emergent mathematics was also weaker than in the previous model, but did not decrease as strongly as the association between early EF and emergent literacy: β = 0.71, p = 0.005 rather than β = 0.79, p < 0.001. All other coefficients in the model were comparable in size to those in the previous model. A comparison with a model in which the paths from EF to mathematics and literacy were constrained showed that the unconstrained model fitted the data significantly better than the constrained model, Δχ<sup>2</sup>(1) = 10.13, p = 0.001. This indicates that the relationship between early EF and mathematics was significantly stronger than the relationship between early EF and literacy, at least when only non-verbal indicators of the latent early EF construct were included.
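The p values attached to the three chi-square difference tests reported in this section can be re-derived from the test statistics alone. The following is a minimal sketch, not the authors' analysis code: the function names are hypothetical, and it relies on the fact that each test has one degree of freedom, for which the upper-tail chi-square probability has the closed form erfc(√(x/2)).

```python
import math


def chi2_diff_p_value(delta_chi2: float) -> float:
    """Upper-tail p value for a chi-square difference with df = 1.

    For one degree of freedom, P(X > x) = erfc(sqrt(x / 2)),
    so no external statistics library is required.
    """
    if delta_chi2 < 0:
        raise ValueError("delta_chi2 must be non-negative")
    return math.erfc(math.sqrt(delta_chi2 / 2.0))


def preferred_model(delta_chi2: float, alpha: float = 0.05) -> str:
    """Apply the decision rule from the text (hypothetical helper).

    A significant difference favors the freely estimated model;
    otherwise the more parsimonious constrained model is kept.
    """
    return "free" if chi2_diff_p_value(delta_chi2) < alpha else "constrained"


# Test statistics as reported in the Results section (all with df = 1).
reported = {
    "two-factor vs. one-factor CFA": 29.14,
    "free vs. equal paths, all EF indicators": 3.14,
    "free vs. equal paths, non-verbal EF indicators": 10.13,
}

for label, stat in reported.items():
    p = chi2_diff_p_value(stat)
    print(f"{label}: delta chi2(1) = {stat:.2f}, p = {p:.4f}, "
          f"preferred = {preferred_model(stat)}")
```

For the two path-constraint tests, the `"free"` and `"constrained"` labels correspond to the unconstrained and equal-paths models; for the CFA comparison, `"free"` corresponds to the less restricted two-factor model.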

## DISCUSSION

In this study, we investigated whether early EF in two-year-olds predicted emergent literacy and mathematics three years later. The results showed significant associations between early EF, treated as a latent factor, and emergent academic performance, in line with earlier research findings (Espy et al., 2004; Blair and Razza, 2007; McClelland et al., 2007; Bull et al., 2008; Clark et al., 2010). The current findings extend previous research in two ways. First, they show that early EF, assessed in children as young as two years, is predictive of emerging academic skills at the end of kindergarten. Second, they indicate that differences in early EF are particularly predictive of emergent mathematics, but also play a role in the development of early literacy skills.

The finding that early EF was a stronger predictor of mathematics than literacy was influenced by the specific tasks used as indicators to our latent early EF construct. Specifically, when we included a non-word repetition task, which required children to process, store, and reproduce non-words, the strength of the association between early EF and mathematics was not significantly different from the strength of the association between early EF and literacy, although a trend toward significance was observed. When this indicator was dropped, and only visuospatial and non-verbal early EF tasks were included, early EF related significantly more strongly to mathematics than literacy. This finding suggests that the specific tasks used to assess EF may explain at least in part some of the mixed results in earlier studies regarding the strength of the associations between EF and mathematics and literacy. More precisely, previous studies reporting differential associations between EF and these two types of academic skills in preschoolers and kindergartners often included EF measures that were either largely non-verbal (Brock et al., 2009) or measures that required children to produce verbal responses (e.g., Willoughby et al., 2012; Fitzpatrick et al., 2014) rather than measures assessing verbal EF skills such as our non-word repetition task (but see McClelland et al., 2014). In two of the studies that did not find stronger relations between EF and emergent mathematics than between EF and literacy in early childhood, verbal working memory tasks, similar to the current study, were included (Welsh et al., 2010; Miller et al., 2013). A wealth of cross-sectional studies has shown that verbal memory is an important predictor of later literacy and reading (for a review, see Swanson et al., 2009). Hence, it is not surprising that, in our study, adding this measure resulted in a stronger relationship between the latent early EF factor and literacy.

| Emergent literacy at age 5 | (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) | (9) | (10) | (11) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| (9) Phonemic awareness | 0.15∗∗ | 0.13∗∗ | 0.15∗∗ | 0.23∗∗∗ | 0.25∗∗∗ | 0.24∗∗∗ | 0.38∗∗∗ | 0.33∗∗∗ | − | 0.53∗∗∗ | 0.42∗∗∗ |
| (10) Letter knowledge | 0.17∗∗∗ | 0.05 | 0.14∗∗ | 0.14∗∗ | 0.28∗∗∗ | 0.25∗∗∗ | 0.49∗∗∗ | 0.33∗∗∗ | 0.52∗∗∗ | − | 0.31∗∗∗ |
| (11) Cito language | 0.29∗∗∗ | 0.17∗∗ | 0.12∗ | 0.22∗∗∗ | 0.22∗∗∗ | 0.26∗∗∗ | 0.33∗∗∗ | 0.65∗∗∗ | 0.42∗∗∗ | 0.31∗∗∗ | − |

∗∗∗p < 0.001, ∗∗p < 0.01, ∗p < 0.05. Correlations above the diagonal are based on true task scores; correlations below the diagonal are based on age-residualized scores.

A further possible reason why we found stronger relationships between early EF and mathematics than between early EF and literacy is that two out of our early EF tasks assessed spatial skills, that is, visuospatial short-term memory and visuospatial working memory. Earlier research has found clear connections between spatial skills and math abilities (Casey et al., 1995; Gunderson et al., 2012; Mix and Cheng, 2012; Verdine et al., 2014). Mix and Cheng (2012) found, for instance, that training on a mental rotation task enhanced six- to eight-year-olds' performance on calculation problems.

Thus, our data suggest that the specific tasks used to assess EF may have implications for the predictive associations found between EF and different types of academic skills, with the distinction between verbal and non-verbal EF tasks perhaps playing an important role. Note that, given that the model with the non-word repetition task as indicator to the latent early EF factor showed a non-significant trend effect for differential relations between early EF and literacy as opposed to mathematics, these results need to be interpreted with caution. Further research, in which the indicators to a latent (early) EF construct are varied more systematically, is needed to investigate in more detail how the choice of tasks influences the strength of associations with different academic skills in young children.

A consistent finding in our study, regardless of whether the verbal memory task was included, was that individual differences in early EF at age two significantly predicted children's emergent mathematics and literacy skills three years later. Theoretically, these associations could either be direct, through children's reliance on executive processes when performing academic tasks, as argued above, or indirect, through other mediating factors, in particular, children's ability to regulate their behavior or other learning-related skills needed in order to benefit from instruction in the classroom. Supporting this latter notion, we found earlier that early EF at age two positively predicts self-regulation behavior in the classroom at age three in a subsample of the children investigated in the current study (Slot et al., 2017). Likewise, Nesbitt et al. (2015) have shown that the association between EF and emergent academic skills from the beginning (age four years) to the end of pre-kindergarten was mediated by learning-related behaviors, although direct effects between EF and academic skills remained significant when learning-related behaviors were taken into account. Not all studies support the idea that learning-related skills mediate the relationship between EF and academic performance, however. In a study on kindergartners, Brock et al. (2009), for example, did not find that learning-related behaviors were a significant mediator between EF and emergent mathematics. Therefore, these authors concluded that EF was directly involved in academic task performance. In summary, although there is some evidence that behavioral regulation and learning-related skills mediate the relationship between EF and academic skills in young children, at least part of the association between EF and academics appears to be direct.

A number of issues need to be taken into consideration when interpreting our findings, in particular regarding the assessment of toddler EF. In our study, we modeled early EF as a single latent factor. The advantage of this approach is that the impact of measurement error is reduced, and the predictive value of latent EF constructs is typically stronger than when single task scores are used in the analysis (cf. Willoughby et al., 2012). Indeed, in our study, associations between the latent early EF factor and the two emergent academics factors in our SEM model were much stronger than the correlations between true task scores. Also, factor loadings to the latent early EF factor were all satisfactory to good, while correlations between the task scores of the early EF tasks were fairly low. This underscores the importance of modeling EF as a latent factor, especially at this young age. However, modeling EF as a latent factor leaves unanswered the question as to which specific EF skills are predictive of later academic skills. Moreover, with respect to the early EF assessment, we included both conventional EF measures, such as working memory, and measures of skills that are seen as important precursor skills to EF in early childhood, that is, selective attention and short-term memory (Garon et al., 2008; Hendry et al., 2016). Ideally, a battery of more complex EF tasks would have been used, also involving inhibitory control and shifting. At the outset of the current study, however, such measures were not available for large-scale field-based assessments. More recently, EF batteries for toddlers have been developed that include inhibition and shifting measures (Garon et al., 2013), enabling a more comprehensive assessment of EF in very young children.

Another limitation of the current study is the lack of statistical control for early emerging mathematics and literacy at two years. Some studies have worked with such rigorous controls in slightly older children (e.g., Brock et al., 2009). Although toddlers have been shown to have some basic understanding of numerical transformations (Sophian and Adams, 1987), to the best of our knowledge, no suitable measures of precursors to mathematical abilities were available for field use in our two-year assessment. In fact, we piloted a magnitude comparison task for toddlers, but were concerned about its validity for this young age group. Instead, we controlled for vocabulary as a key cognitive factor in toddlerhood. The specific set of statistical controls used when testing associations between EF and academics varies widely between studies, and choices regarding these controls may clearly affect whether or not differential relations between EF and mathematics vs. EF and literacy are found (see Fitzpatrick et al., 2014). In the current study, for example, including a vocabulary measure, but not a measure of precursors to mathematics skills, may have affected our finding that early EF at two years was more strongly predictive of mathematics than literacy at five. Thus, future studies investigating toddler EF as a predictor of achievement should ideally consider including basic tests of literacy and numeracy at this young age already, to provide a stronger test of the independent contribution of EF to future academic attainment.

In addition to serving as statistical controls in the study of EF as a predictor of academic achievement, early mathematics and literacy measures, combined with measures of later EF, would make it possible to study the reverse relationship as well: do early literacy and mathematics predict later EF? Clearly, the current study findings do not allow us to draw conclusions regarding the direction of effects between early EF and emergent academics. Recent work shows that associations between EF and mathematics may be bidirectional (for a review, see Clements et al., 2016). For example, Welsh et al. (2010) found that emergent numeracy skills at the beginning of pre-kindergarten predicted EF at the end of pre-kindergarten, while the opposite was also true. In this study, emergent literacy did not predict EF over time. Clements et al. (2016) speculate that high-quality mathematics curricula, in particular, may provide optimal situations for scaffolding learning of both mathematics and EF in young children.

## CONCLUSION


The current longitudinal study is the first to investigate the predictive value of early EF in two-year-olds for emerging academic skills over a three-year time interval. The results showed that early EF at two years predicts emergent literacy and mathematics just prior to school entry, after controlling for receptive vocabulary, parental education, and home language. This suggests that early EF can be reliably assessed at this young age, despite the rapid dynamic nature of development during this phase, and has important predictive value for academic achievement three years later. Further work could investigate whether EF measures in toddlerhood can accurately identify children at risk for significant learning impairment in school, and could be implemented as effective screening tools to help identify which children should be referred for intervention. Moreover, findings call for future studies to unravel which genetic and environmental factors impact on early individual differences in EF in the first years of life. Finally, future studies could assess whether individual differences in EF at a very young age affect not only the level but also the growth of children's academic skills over kindergarten or even elementary school.

## ETHICS STATEMENT

Approval for the study was obtained from both Ethical Advisory Committees of the Faculty of Social and Behavioral Sciences of Utrecht University and the Department of Education of the University of Amsterdam. Children were recruited in two different ways, as reported in more detail in Mulder et al. (2014). In short, part of the sample was recruited by directly approaching children's parents (home-based sample), while another part was recruited through children's daycare centers and preschools (center-based sample). In the center-based sample, parents received an information letter that gave them the opportunity to withdraw their child from participation. In addition, children's teachers were requested to inform parents about the study and assessments. Passive parental consent was allowed by the Ethical Advisory Committee of the Department of Education of the University of Amsterdam (the institution responsible for data collection), because of the major challenges involved with obtaining permission from all parents, and because all the assessments were child-friendly and noninvasive.

## AUTHOR CONTRIBUTIONS

HM and JV: task design or adaptation (EF at two years, academic tasks at five years, with the exception of the CITO measures), wrote the manuscript, interpreted the results. SV: analyzed the data, interpreted the results, and revised the manuscript. PS: interpreted the results and revised the manuscript. PL: principal investigator, design of the study, and revised the manuscript.

## FUNDING

The pre-COOL study was funded by the Netherlands Organization for Scientific Research (grant numbers 411-20-442; 411-20-452).

## ACKNOWLEDGMENTS

The pre-COOL study was carried out in collaboration with the Kohnstamm Institute at the University of Amsterdam and the former Institute for Applied Social Sciences at the Radboud University Nijmegen. The authors are grateful to all the children, families, teachers, and preschool and daycare centers and schools for participating in the study.

## REFERENCES

achievement, learning-related behaviors, and engagement in kindergarten. Early Child. Res. Q. 24, 337–349. doi: 10.1016/j.ecresq.2009.06.001

achievement: a five-year prospective study. J. Educ. Psychol. 104, 206–223. doi: 10.1037/a0025398

literacy, vocabulary, and math skills. Dev. Psychol. 43, 947–959. doi: 10.1037/0012-1649.43.4.947


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Mulder, Verhagen, Van der Ven, Slot and Leseman. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

## Executive Function Buffers the Association between Early Math and Later Academic Skills

Andrew D. Ribner<sup>1</sup>\*, Michael T. Willoughby<sup>2</sup>, Clancy B. Blair<sup>1</sup> and The Family Life Project Key Investigators<sup>3,4</sup>

<sup>1</sup> Department of Applied Psychology, New York University, New York, NY, United States, <sup>2</sup> RTI International, Research Triangle East, NC, United States, <sup>3</sup> Center for Developmental Science, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States, <sup>4</sup> Department of Human Development and Family Studies, Pennsylvania State University, University Park, PA, United States

Extensive evidence has suggested that early academic skills are a robust indicator of later academic achievement; however, there is mixed evidence of the effectiveness of intervention on academic skills in early years to improve later outcomes. As such, it is clear there are other contributing factors to the development of academic skills. The present study tests the role of executive function (EF) (a construct made up of skills involved in the achievement of goal-directed tasks) in predicting 5th grade math and reading ability above and beyond math and reading ability prior to school entry, and net of other cognitive covariates including processing speed, vocabulary, and IQ. Using a longitudinal dataset of N = 1292 participants representative of rural areas in two distinctive geographical parts of the United States, the present investigation finds that EF at age 5 strongly predicts 5th grade academic skills, as do cognitive covariates. Additionally, investigation of an interaction between early math ability and EF reveals that the magnitude of the association between early math and later math varies as a function of early EF, such that participants who have high levels of EF can "catch up" to peers who perform better on assessments of early math ability. These results suggest EF is pivotal to the development of academic skills throughout elementary school. Implications for further research and practice are discussed.

Keywords: executive function, math achievement, reading development, elementary school children, academic achievement, moderation

#### Edited by:

Mariëtte Huizinga, VU University Amsterdam, Netherlands

#### Reviewed by:

Camilla Gilmore, Loughborough University, United Kingdom

Kerry Lee, National Institute of Education, Singapore

> \*Correspondence: Andrew D. Ribner aribner@nyu.edu

#### Specialty section:

This article was submitted to Developmental Psychology, a section of the journal Frontiers in Psychology

Received: 01 February 2017 Accepted: 11 May 2017 Published: 30 May 2017

#### Citation:

Ribner AD, Willoughby MT, Blair CB and The Family Life Project Key Investigators (2017) Executive Function Buffers the Association between Early Math and Later Academic Skills. Front. Psychol. 8:869. doi: 10.3389/fpsyg.2017.00869

## INTRODUCTION

Children's success in schooling has long been a central focus of research, policy, and practice. In December 2015, the Every Student Succeeds Act was signed into law at a time that marked all-time high graduation rates and low dropout rates in the United States. The Every Student Succeeds Act, in concert with the Common Core State Standards, was meant to improve graduation rates and further minimize student dropout. Yet 6.5% of all students entering high school, and 11.6% of students born to families in the lowest income quartile, still drop out of high school. Further, these dropout rates are highest in the American South, and in rural areas across the country (U.S. Department of Education, National Center for Education Statistics, 2016). At the same time, many other countries experience large proportions of students dropping out of high school, and only some 33% of students in OECD countries enroll in postsecondary education. Importantly, while there has been some improvement in secondary school attainment internationally, there is also a marked degree of stability in secondary school dropout (Lamb et al., 2010; OECD, 2016). Decades of research in the United States and abroad have suggested that a substantial amount of student dropout is attributable to school, teacher, and classroom characteristics (e.g., Ehrenberg and Brewer, 1994; Rumberger and Thomas, 2000; Koedel, 2008; Hanushek et al., 2008); however, it remains important to attend to individual-level skills that predict and promote academic success so as to develop effective ways to enhance school achievement.

Extensive evidence has suggested that early academic skills are a robust indicator of later achievement (Duncan et al., 2007). In many cases and to a large extent, high-quality early learning experiences may account for the early academic aptitude of a young child (e.g., Melhuish et al., 2008), but an outstanding question remains: What individual characteristics make a successful young reader or mathematician? Given that there is extensive variation in the early learning experiences that shape early academic skills, what is the contribution of individual factors to later academic success? We seek to better understand whether early academic skills are as deterministic of later academic ability as prior investigations might suggest (e.g., Duncan et al., 2007). The goal of the present study is to understand whether individual cognitive skills may compensate for lower levels of early preparedness for academic success in late elementary school in a sample of students living in low-income families in two rural areas of the United States. Multiple candidate predictors were tested for their unique contribution to 5th grade math and reading skills. We focus primarily on executive function (EF) prior to school entry and examine its predictive validity while simultaneously considering a number of well-established predictors of school outcomes, including subject-specific prekindergarten (PreK) academic knowledge and other indicators of cognitive functioning, including vocabulary, IQ, and processing speed. Beyond understanding robust predictors of later academic ability, controlling for early academic ability, we also sought to test whether PreK cognitive abilities might be compensatory, that is, help a child "catch up" to their peers if they begin schooling with a relatively low level of academic ability.
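The compensatory hypothesis can be expressed as a moderated regression. The equation below is an illustrative sketch only: the symbols and the single-equation form are assumptions introduced here and do not reproduce the authors' exact model specification.

$$
\text{Math}_{G5,i} = \beta_0 + \beta_1\,\text{Math}_{\text{PreK},i} + \beta_2\,\text{EF}_i + \beta_3\,\bigl(\text{Math}_{\text{PreK},i} \times \text{EF}_i\bigr) + \boldsymbol{\gamma}^{\top}\mathbf{c}_i + \varepsilon_i
$$

Here $\mathbf{c}_i$ stands for the cognitive covariates (vocabulary, IQ, processing speed). The slope relating early math to later math for a child with a given EF level is then

$$
\frac{\partial\,\text{Math}_{G5,i}}{\partial\,\text{Math}_{\text{PreK},i}} = \beta_1 + \beta_3\,\text{EF}_i ,
$$

so a negative $\beta_3$ would mean that early math matters less for later math at higher levels of EF, which is the pattern a "catch up" effect implies.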

## Background

For a child to succeed in modern society, they must be a successful reader. The ability to read is foundational for nearly all school-based learning and undergirds opportunities for academic and vocational success. Importantly, the development of reading has been characterized as a process in which the child must transition from "learning to read" to "reading to learn" (Center for Public Education, 2015). For over 20 years, the United States has made it a national priority to make every child a proficient reader by the end of 3rd grade, and yet over 50% of children tested below the level of proficiency on reading assessments as recently as 2015 (U.S. Department of Education, Institute of Education Sciences, National Center for Education Statistics, 2015). Crucially, an even greater proportion of children from low-income families (as measured by eligibility for free/reduced-price lunch) test below proficient on reading assessments. This is important because nearly three-quarters of students who test below proficient in 3rd grade remain below proficient in high school (Shaywitz et al., 1992), and are four times more likely to drop out of high school than their peers who test proficient (Hernandez, 2011). More research examining early individual-level predictors of later reading ability and reading difficulty is needed.

As with reading, the late elementary grades appear to be an important transition time for the development of mathematics ability: Children who fail a math course in 6th grade have a 60% chance of dropping out of high school (Balfanz et al., 2007). The National Mathematics Advisory Panel (2008) stated in their report that in order to be prepared for high school graduation and college attendance, students should demonstrate a firm understanding of topics covered in Algebra 2 by the time they are eligible for high school graduation. In order to be on track to succeed in Algebra 2, students should be enrolled in Algebra 1 by 8th grade, when children are around 14 years old. As with reading, however, national assessments reveal that fewer than 50% of children perform above the proficient level in math, and over 75% of students from low-income homes perform below the proficient level (U.S. Department of Education, Institute of Education Sciences, National Center for Education Statistics, 2015).

The importance of achievement in elementary school academics is not simply related to later academic attainment. Several studies have found that test scores prior to high school are positively associated with labor market outcomes, including income and employment, even when analyses control for educational attainment (Rose, 2006). Such studies have found positive associations between both reading and math achievement and labor market outcomes over and above motivation and intelligence as early as when children are age seven, or around 3rd grade (Ritchie and Bates, 2013).

What, then, differentiates a successful elementary school reader and mathematician from an unsuccessful one? Extensive evidence from the last decade has suggested that early skills predict later skills: the strongest and most robust predictor of a child's later academic skills is their earlier academic skills. Duncan et al. (2007) reported in a meta-analysis of six nationally representative datasets of three countries that math and reading skills at kindergarten entry robustly predicted high school math and reading skills net of background characteristics and socioemotional skills. These findings have been replicated and extended to suggest that early academic skills are important even for certain socioemotional skills in later years (Romano et al., 2010), and over a variety of time scales (Jordan et al., 2009). Collectively, these studies have tested PreK and kindergarten behavioral, cognitive, and socioemotional skills that predict scores on assessments of reading and math ability and suggested that, net of a broad host of covariates, there is a strong domain-specific stability of academic skills.

However, the development of academic skills does not occur in isolation: children are exposed to a multitude of academic settings that contribute to the promotion of math and reading skills. As such, intervention in the time between school entry and late elementary school could have an effect. Experimental studies have shown that curricular intervention in the early elementary years can result in improved domain-specific skills. However, the effects are limited. A meta-analysis of elementary school math intervention programs for typically performing students found that even the most successful intervention programs had a median effect size of +0.33 (Slavin and Lake, 2008). Similarly, highly effective reading intervention programs for children between kindergarten and 1st grade showed a median effect size of +0.22, and for children between 2nd and 5th grade, a median effect size of +0.13. These findings suggest that there is only so much that can be done in domain-specific instructional settings to move the needle on academic ability between school entry and late elementary school.

A separate, though highly related, literature has suggested that there are other classroom skills that may contribute to the development of math and reading skills during the elementary years (Durlak et al., 2011). For years, there has been an interest in EF as a driving force of academic learning. EF comprises skills engaged in service of goal-directed behaviors, which include the ability to inhibit highly automatic or prepotent responses to stimulation, to store and manipulate information in working memory, and to flexibly shift the focus of attention among multiple relevant aspects of a given set of stimuli. EF skills are important for children's learning, especially in their ability to attend to and integrate information taught in classroom settings, and have been implicated in the development of academic skills (Blair, 2002; Blair and Razza, 2007; Best et al., 2011). Further, there has been extensive evidence to suggest specific associations between EF and the development of both reading and math in elementary school.

A robust literature has indicated a relation between EF and reading skills throughout the academic lifespan. There is evidence that EF is related to early precursors to reading (Blair and Razza, 2007), and that the association between EF and reading is present and largely invariant from when children are in elementary grades (e.g., when they are making the transition from "learning to read" to "reading to learn") through the high school years (Christopher et al., 2012). Though there is some question of directionality of influence (i.e., whether EF underlies the development of reading or whether successful reading improves EF), there are correlational studies which suggest that children who have impaired reading abilities also have particularly weak EF skills (Carretti et al., 2009; Cutting et al., 2009), and that EF contributes unique variance to reading comprehension, net of a host of other factors commonly associated with the development of reading comprehension (Sesma et al., 2009). As well, there is evidence from cognitive neuroscience that the development and change of brain structures that support EF parallel the process of reading acquisition (Cartwright, 2012).

The association between EF and math has been similarly well documented. There has been extensive correlational evidence to suggest EF contributes significant variance to success in math across a wide range of age groups, from preschool and kindergarten (Blair and Razza, 2007; Kroesbergen et al., 2009) through adulthood (Kalaman and Lefevre, 2007) and at intervening ages (Swanson, 2004; Männamaa et al., 2012). Additionally, a limited number of experimental and training studies have corroborated and added a directional component to the hypothesis that EF underlies the development of mathematical skills. For example, there is some evidence that training EF skills in early and middle childhood results in better numeracy and math reasoning skills (Fuchs et al., 2003; Kroesbergen et al., 2014). As with reading skills, there is also evidence that poor EF is often correlated with math learning disabilities (Clark et al., 2010; Toll et al., 2011; Willoughby et al., 2016).

Additionally, prior studies have suggested EF may interact with other early academic skills to moderate the association between early and later academic skills. Studies have shown through cross-lagged models that EF predicts change in math and reading skills over and above stability from PreK to kindergarten (Welsh et al., 2010), and a recent analysis from the current dataset found that the interaction of EF at age 4 with math in PreK moderated the strength of the association between math abilities in PreK and kindergarten (Blair et al., 2016). Similarly, a separate study found higher levels of EF skills in kindergarten related to faster growth of math skills in early elementary school (Lee and Bull, 2016). Together, these suggest that there may be a compensatory effect of EF: despite a high degree of stability between early academic skills and later academic skills (e.g., La Paro and Pianta, 2000; Duncan et al., 2007), there may be alternative mechanisms that could be leveraged to help students "catch up" throughout the elementary years.

## Present Study

The objective of the present study is to investigate the unique role of EF measured in early childhood in predicting academic achievement in late elementary school, an important transition time in children's academic careers. In particular, we are interested in the predictors of academic skills for students from predominantly low-income and rural (non-urban) areas of the United States. These students are at elevated risk of dropping out of school before completing high school. We analyze data collected on children's EF and math and literacy skills, along with other cognitive functions such as IQ, speed of processing, and receptive vocabulary prior to kindergarten entry, then assess math and literacy skills again when children are in 5th grade.

We pose two primary questions in the present study. First, we investigate predictors of academic skills in 5th grade. Our first hypothesis is that child EF measured prior to school entry will be uniquely associated with both later math and reading skills, even when controlling for cognitive functions and early math and reading with which EF is known to be associated. As such, we intend to estimate the amount of change in academic achievement attributable to EF above and beyond earlier academic knowledge and cognitive functioning. Second, we extend the analyses of Blair et al. (2016) to test whether there may be a compensatory effect of EF or other cognitive skills to 5th grade. That is, we investigate whether EF ability changes or moderates the association between early math and reading ability, measured in preK, and later math and reading ability measured in grade 5. It is expected that findings from this analysis will indicate an important mechanism through which children with lower levels of academic ability at school entry can "catch up" to their higher-achieving peers. As such, we hypothesize that children with high levels of EF at school entry will perform well on assessments of math and literacy in late elementary school, even if they had low levels of achievement at school entry.

## MATERIALS AND METHODS

## Participants

Participants were recruited as a part of a prospective, longitudinal study. The Family Life Project (FLP) recruited children and their families from two distinct geographical areas of the United States with high rates of poverty. Three counties in eastern North Carolina and three in central Pennsylvania were selected to be indicative of the Black South and Appalachia, respectively. Children were recruited to be representative of one of the six counties in which families resided at the time of the child's birth. Low-income families were oversampled in both states, and African American families were oversampled in North Carolina. Full details of the sampling procedure have been described elsewhere (Vernon-Feagans and Cox, 2013).

A total of 1,292 families were recruited to take part in data collection when the child was 2 months of age, at which point they were formally enrolled in the study.

## Procedures

Demographic data were drawn from regularly scheduled home visits conducted from when children were 2 months old to when they were 3 years old. EF data were drawn from direct assessment conducted during a home visit when children were 5 years old. Academic skills were measured prior to kindergarten entry (PreK) and in 5th grade. Assessments took place in school settings when possible, or in home settings in cases where children were not enrolled in center- or school-based care at any of the time points. Children were also assessed in school settings during kindergarten, 1st, 2nd, and 5th grades. A subset of children was also assessed in school settings during 3rd grade. Additionally, children were assessed in the home seven times between 2 months and 5 years of age. Only data from the PreK, age 5, and 5th grade data collection time points are included in the present study.

#### Measures

#### Executive Function (EF)

Executive function assessment comprised six tasks. All tasks were administered on an open spiral-bound notebook by a trained research assistant. These tasks are described in detail and evaluated elsewhere (Willoughby et al., 2010; Willoughby and Blair, 2011; Willoughby et al., 2012) and thus only abbreviated descriptions of each task are provided.

#### **Working memory span (working memory)**

Children were shown a line drawing of an animal and a color inside an image of a house and asked to keep both the animal and the color in mind, and to recall one of them (e.g., animal name) when prompted. Task difficulty increased by adding items to successive trials: Children received one 1-house trial, two 2-house trials, two 3-house trials, and two 4-house trials. Responses were summarized as the number of items answered correctly within each item set.

#### **Pick the picture game (working memory)**

This is a self-ordered pointing task in which children were presented with a series of 2, 3, 4, and 6 pictures and instructed to continue picking pictures until each picture had "received a turn." Children are presented with successive pages in which the set of pictures within an item set is re-ordered. The ordering of pictures within each item set is randomly changed (including some trials not changing) so that spatial location is not informative. This task requires working memory because children have to remember which pictures in each item set they have already touched.

#### **Silly sounds stroop (inhibitory control)**

This task was modeled after the Day–Night Stroop task. Children were asked to make the sound opposite of that associated with pictures of dogs and cats (e.g., meow when shown a picture of a dog).

#### **Spatial conflict arrows (inhibitory control)**

Children were given two response cards ("buttons") and were instructed to touch the card consistent with the direction in which an arrow presented on the flipbook page was pointing. Training trials presented compatible images on the same side, and test trials presented arrows contralateral to the correct response (e.g., an arrow pointing right was presented on the left side).

#### **Animal go/no-go (inhibitory control)**

This is a standard go/no-go task in which children were instructed to push a button (which emitted a sound) whenever they saw an animal appear, except when the animal was a pig. The number of go trials before a no-go trial varied in a standard order: 1-go, 3-go, 3-go, 5-go, 1-go, 1-go, and 3-go trials.

#### **Something's the same game (attention shifting)**

Children were shown two pictures that were similar on a single criterion (e.g., the same color; the same size), and were then shown a third picture, similar to one of the first two pictures along a second dimension of similarity (e.g., shape). Participants were asked to identify which of the first two pictures was the same as the new picture.

#### **Executive function task scoring and composite function**

Item response theory (IRT) scoring was used for all tasks in the EF battery. Z-scores were calculated to reflect accuracy on each of the six EF assessments. The total score reflected the mean of all completed z-scored individual scores. We use a formative composite, as it has been found to represent the overarching construct of EF more appropriately than a latent factor, which is limited to measurement of the shared variance between tasks that are only weakly to moderately correlated (Willoughby et al., 2016). Prior investigations using the described battery of assessments with the same population have demonstrated acceptable psychometric properties of the resulting EF score (Willoughby et al., 2012). As is typical of EF measures (Willoughby et al., 2014), the reliability coefficient for the composite was relatively low, α = 0.50.
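The composite described above (a mean of whichever z-scored tasks a child completed) and its reliability coefficient can be sketched as follows. This is a minimal illustration, not the authors' scoring code: the input values are hypothetical accuracy scores rather than IRT-based scores, the function names are invented for the example, and Cronbach's alpha is computed here on complete cases only, which is an assumption about how missing tasks would be handled.

```python
import numpy as np

def zscore_columns(scores):
    """Standardize each task (column), ignoring missing values (NaN)."""
    mean = np.nanmean(scores, axis=0)
    sd = np.nanstd(scores, axis=0, ddof=1)
    return (scores - mean) / sd

def ef_composite(scores):
    """Mean of the z-scored tasks that each child completed."""
    return np.nanmean(zscore_columns(scores), axis=1)

def cronbach_alpha(scores):
    """Cronbach's alpha, using only children with all tasks completed."""
    complete = scores[~np.isnan(scores).any(axis=1)]
    k = complete.shape[1]
    item_var = complete.var(axis=0, ddof=1).sum()
    total_var = complete.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_var / total_var)

# Hypothetical scores: 5 children x 3 tasks; NaN marks a task not completed
scores = np.array([
    [0.9, 0.8, 0.7],
    [0.4, 0.5, np.nan],
    [0.6, 0.9, 0.5],
    [0.2, 0.3, 0.4],
    [0.8, 0.6, 0.9],
])
composite = ef_composite(scores)
alpha = cronbach_alpha(scores)
```

Averaging over completed tasks keeps children with partial data in the sample, at the cost of composites that rest on different task sets for different children.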

#### Woodcock–Johnson III Tests of Achievement

The Woodcock–Johnson III Tests of Achievement (Woodcock et al., 2001) are a set of co-normed tests that measure general scholastic ability, oral language, and academic achievement and are appropriate for administration for ages 3–92. The reliability and validity of these measures have been well established elsewhere (Woodcock et al., 2001). For all subtests, age-normed standard scores were used.

#### **Applied Problems**

The Applied Problems (AP) subtest measures early math skills including counting, measurement, and verbal and non-verbal arithmetic and operations.

#### **Brief Reading Cluster**

The Brief Reading Cluster (BFR) reflects the average of children's scores on two Woodcock–Johnson subtests: Letter-Word Identification and Passage Comprehension. The Letter-Word Identification (LW) subtest measures basic literacy skills including letter recognition, letter sounds, and reading ability. The Passage Comprehension (PC) subtest also measures basic literacy skills including children's ability to provide the missing word for a sentence so that it makes sense.

#### Covariates

Individual- and family-level covariates were included in final models of analyses. These covariates included indicator variables for child sex (1 = male; 0 = female), as well as continuous variables for cumulative risk, processing speed, general intelligence, and receptive vocabulary.

#### **Cumulative risk**

The cumulative risk variable is a mean composite of z-scored variables collected from home visits between when participants were 6 and 36 months of age. The variable is made up of items that include family income-to-need ratio (i.e., family income divided by the federal poverty threshold for a family of the relevant size), maternal education, maternal working hours, household density, and a rating of safety of the neighborhood in which the child lives.

#### **Processing speed**

At the PreK visit, processing speed was measured using two subtests of Wechsler Preschool and Primary Scales of Intelligence (WPPSI; Wechsler, 2002). For assessment of processing speed, the symbol search and coding subscales were used. The symbol search subtest asks participants to scan a group as quickly as possible and indicate whether a target symbol matches any symbols in the group. The coding subscale asks participants to match symbols with geometric shapes, and to reproduce the geometric shapes corresponding to the appropriate symbols.

#### **General intelligence**

At the age 3 home visit, children completed the block design and receptive vocabulary subtests of the Wechsler Preschool and Primary Scales of Intelligence (WPPSI; Wechsler, 2002). A full-scale IQ score was estimated.

#### **Receptive vocabulary**

At the PreK visit, receptive vocabulary was measured using the Peabody Picture Vocabulary Test-4th Edition (PPVT; Dunn and Dunn, 2007), a norm-referenced assessment commonly used with children of this age. In this direct assessment, the child is shown four pictures, and the data collector asks the participant to point to one of the four images (e.g., "Can you point to the ball?"). Age-normed standard scores were used in all analyses.

## Analytic Strategy

Our primary research question asks whether EF skills before kindergarten entry uniquely predict academic skills over and above earlier academic skills themselves. Simultaneous models were estimated in a path analysis to regress 5th grade math and reading scores onto EF, PreK math and pre-literacy skills, and other covariates measured prior to kindergarten entry. Next, we sought to investigate whether having high levels of EF prior to kindergarten would buffer against having lower academic skills. Two interaction terms between EF and PreK math and pre-literacy skills were added to the path model. Simple slopes of significant interaction terms were assessed. All models were estimated using Mplus (Muthén and Muthén, 2007) and took the complex sampling design of the Family Life Project into account with sample weights and stratification. In all models, coefficients represent the unique variance attributable to each variable, adjusted for all other variables in the model. Correlations between outcome variables and between predictor variables were estimated in all models.

All analyses are limited to children for whom a direct assessment of EF or academic skills was conducted. Thirteen children were excluded from analyses for having no available direct assessment data, leaving a total of 1,279 participants. For those participants who completed at least one wave of direct assessment, missing data was accounted for using Full Information Maximum Likelihood estimation. Full Information Maximum Likelihood estimation takes into account the covariance matrix for all available data on the independent variables to estimate parameters and standard errors. This approach provides more accurate estimates of regression coefficients than do listwise deletion or mean replacement (Enders, 2001).
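The moderation test and simple-slopes follow-up described in this section can be illustrated with a simplified sketch. It is not the analysis reported here: it uses ordinary least squares on synthetic data instead of a weighted path model with Full Information Maximum Likelihood in Mplus, it includes a single outcome and no covariates, and every coefficient and variable name below is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

# Synthetic, standardized predictors (illustrative only, not study data)
prek_math = rng.standard_normal(n)
ef = 0.5 * prek_math + np.sqrt(0.75) * rng.standard_normal(n)

# Hypothetical data-generating coefficients with a negative interaction
b_math, b_ef, b_int = 0.40, 0.20, -0.10
grade5_math = (b_math * prek_math + b_ef * ef
               + b_int * prek_math * ef
               + 0.7 * rng.standard_normal(n))

# Design matrix: intercept, PreK math, EF, and their product term
X = np.column_stack([np.ones(n), prek_math, ef, prek_math * ef])
coef, *_ = np.linalg.lstsq(X, grade5_math, rcond=None)

def simple_slope(coef, ef_value):
    """Slope of PreK math on the outcome at a chosen EF value."""
    return coef[1] + coef[3] * ef_value

ef_mean, ef_sd = ef.mean(), ef.std(ddof=1)
slopes = {
    label: simple_slope(coef, ef_mean + k * ef_sd)
    for label, k in [("-1 SD", -1), ("mean", 0), ("+1 SD", 1)]
}
```

With a negative interaction coefficient, the slope of early math is steepest at low EF and shallowest at high EF, which is the pattern a compensatory effect would produce.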

## RESULTS

## Descriptive Statistics

Unweighted descriptive statistics and correlations for all variables in the analyses are presented in **Table 1**. N = 1026 participants completed EF assessment at age 5; N = 877 completed academic ability assessments during their 5th grade year. Standard scores for 5th grade math and reading assessments were near average for the normative sample (Normative Sample: M = 100, SD = 15; Present Sample, math: M = 98.82, SD = 14.94; reading: M = 97.86, SD = 14.28). Participants who were not assessed at 5th grade did not differ from those who were assessed at 5th grade on measures of IQ, t(1033) = 1.323, p = 0.186; speed of processing, t(844) = −0.012, p = 0.990; receptive vocabulary, t(962) = 0.278, p = 0.781; EF, t(1024) = −0.352, p = 0.725; cumulative risk, t(1220) = −0.151, p = 0.880; or PreK literacy skills, t(979) = 1.594, p = 0.111. Participants who were assessed at 5th grade scored, on average, 1.5 points higher on the PreK math assessment, t(976) = 2.189, p = 0.029.

#### TABLE 1 | Descriptive statistics.

Bivariate correlations for all variables included in the analyses are presented in **Table 2**. Academic outcome measures (5th grade math and reading skills) were highly correlated with one another (r = 0.701, p < 0.001). Additionally, 5th grade scores were highly correlated with pre-kindergarten EF (Math: r = 0.506, p < 0.001; Reading: r = 0.435, p < 0.001). Both outcomes were also correlated with speed of processing (Math: r = 0.398, p < 0.001; Reading: r = 0.353, p < 0.001) and receptive vocabulary (Math: r = 0.528, p < 0.001; Reading: r = 0.513, p < 0.001). As such, these direct assessment measures from before kindergarten entry were included in all analyses.

## Prekindergarten Predictors of Elementary School Math in 5th Grade

Results of the associations of predictor variables with 5th grade math ability are reported in Model 1 of **Table 3**. Prekindergarten math and EF were both significantly and uniquely associated with later math ability (β = 0.367, p < 0.001; β = 0.209, p < 0.001, respectively). Additionally, child sex was associated with late elementary math ability such that, on average, male participants had higher scores than their female peers (β = 0.146, p < 0.001). Further, both IQ and processing speed were positively associated with late elementary math ability (β = 0.114, p = 0.004; β = 0.081, p = 0.028). Receptive vocabulary, cumulative risk, and PreK prereading skills were not associated with 5th grade math, net of other variables in the model. As well, 5th grade reading ability remained moderately correlated with math ability (r = 0.497, p < 0.001). The total model accounted for 53.2% of the total variance in 5th grade math ability scores (R<sup>2</sup> = 0.532).

## Prekindergarten Predictors of Elementary School Reading in 5th Grade

Results of the simultaneous regression of 5th grade reading on predictor variables are reported in Model 1 of **Table 4**. PreK math, pre-literacy, and EF were significantly and uniquely associated with later reading ability (β = 0.157, p = 0.002; β = 0.236, p < 0.001; β = 0.149, p < 0.001, respectively). Additionally, prekindergarten receptive vocabulary was positively associated with later reading (β = 0.128, p = 0.004); however, IQ, processing speed, and cumulative risk were not associated with later reading skills. The total model accounted for 44% of the total variance in 5th grade reading scores (R<sup>2</sup> = 0.440).

## Does Early EF Buffer against Low Academic Math Skills?

To test whether high levels of early EF would buffer against low levels of early academic skills, we added interaction terms of both EF with PreK math and EF with PreK pre-literacy scores to the path model above. Results are reported in Model 2 of **Tables 3**, **4**. The interaction term of EF with PreK math significantly predicted both 5th grade math (β = −0.585, p = 0.016) and 5th grade reading skills (β = −0.754, p = 0.007), but the interaction term of EF with PreK reading did not relate to either 5th grade outcome.

Inclusion of the interaction terms in the model slightly improved the amount of variance explained in both 5th grade math (R<sup>2</sup> = 0.544) and reading (R<sup>2</sup> = 0.454). To isolate the moderating role of EF and test whether the association of PreK math with later academic skills differed in magnitude for children with different levels of cognitive skills more generally, we also tested interactions of PreK math with IQ, processing speed, and receptive vocabulary. None of the resulting interaction effects were significant. For all subsequent analyses, the interaction between EF and PreK reading was removed to better interpret simple slopes.

Analysis of simple slopes revealed that for children who were at the sample mean for EF, there was a moderate association of PreK math with 5th grade math (β = 0.426, p < 0.001) and reading (β = 0.234, p < 0.001). For participants who had scores 1SD above the sample mean on EF (e.g., a value of 0.77 on the EF score that reflects the mean of z-scores from each individual EF measure, M<sub>EF</sub> = 0.29) the coefficient for PreK math was smaller than that at the sample mean in predicting 5th grade math (β = 0.356, p < 0.001) and reading (β = 0.146, p = 0.001). Similarly, the simple slopes for values 1SD below the sample mean for EF (e.g., a value of −0.19) were also significant, such that for children who had low levels of EF in PreK, the coefficient on pre-kindergarten Applied Problems scores was larger for 5th grade math (β = 0.509, p < 0.001) and reading (β = 0.330, p < 0.001). In other words, the variance of 5th grade math and reading ability associated with earlier math changed as a function of children's EF, such that children with a higher level of EF ability at PreK were better able to "catch up" with their peers who were better at math in PreK. This is shown graphically in **Figures 1**, **2**.

## DISCUSSION

The goal of this study was to investigate the role of EF in predicting academic achievement in late elementary school in a diverse sample of children from low-income families. In particular, we were interested in whether there was an association between EF and 5th grade math and reading achievement over and above the predictive value of earlier math and reading scores and other cognitive abilities. We also sought to investigate whether the predictive value of PreK math and reading abilities varied as a function of child EF.

#### TABLE 2 | Correlations among variables.

∗∗p < 0.01, ∗∗∗p < 0.001.

#### TABLE 3 | Models predicting Applied Problems scores.

∗p < 0.05, ∗∗p < 0.01, ∗∗∗p < 0.001; AP, Applied Problems subtest of the Woodcock–Johnson; LW, Letter-Word Identification subtest of the Woodcock–Johnson.

#### TABLE 4 | Models predicting Brief Reading Cluster scores.

∗p < 0.05, ∗∗p < 0.01, ∗∗∗p < 0.001; AP, Applied Problems subtest of the Woodcock–Johnson; LW, Letter-Word Identification subtest of the Woodcock–Johnson.

In our analysis of main effects only, we found that, while early math and reading were both important predictors of later math and reading, PreK EF was associated with more than 1/5th of a standard deviation in math (three points on the standardized measure of 5th grade math used in the present sample), and nearly 1/7th of a standard deviation in reading (over two points on the reading measure). This association was net of other cognitive covariates, including IQ (1/10th of a standard deviation in math), processing speed (1/12th of a standard deviation in math), and receptive vocabulary (1/8th of a standard deviation in reading), and the predictive value of EF was greater than that of other cognitive covariates.
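The conversion from standardized coefficients to points on the achievement scale can be checked with a short calculation. The sketch below multiplies each reported coefficient for EF by the sample standard deviation of the corresponding 5th grade outcome; the choice of the sample SD (rather than the normative SD of 15) is an assumption, since the text does not say which was used, and the function name is hypothetical.

```python
# Standardized coefficients for EF and sample SDs, as reported in the Results
betas = {"math": 0.209, "reading": 0.149}
sample_sd = {"math": 14.94, "reading": 14.28}  # assumed scaling; see note above

def beta_to_points(beta, outcome_sd):
    """Expected change in outcome points per 1 SD increase in the predictor."""
    return beta * outcome_sd

points = {name: beta_to_points(betas[name], sample_sd[name]) for name in betas}
# math: about 3.1 points; reading: about 2.1 points
```

Both values agree with the "three points" and "over two points" figures given in the text; using the normative SD of 15 instead changes each by roughly a tenth of a point or less.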

In testing the interaction between EF and early academic ability, we found a significant interaction between EF and early math (but not EF and pre-reading skills) indicating that higher EF ability can compensate to some extent for limited academic knowledge prior to school entry. Children with initially low math ability but with higher EF may still reach the levels of achievement in math and reading typically associated with more proficient domain-specific prerequisite skills. This suggests EF may serve as an important skill set that helps students "catch up" with their higher-achieving peers in academic settings, even if they start out behind.

Notably, prior investigations (e.g., Duncan et al., 2007; Pagani et al., 2010) have suggested early math is a stronger predictor of later reading than early pre-reading or other skills. In the current investigation, we find there is a domain-specific relation between early and later skills: pre-reading is the strongest predictor of later reading skills, just as early math is the strongest predictor of later math skills. In fact, in the analyses predicting 5th grade reading scores, both EF and receptive vocabulary predict between 13 and 15% of a standard deviation in later reading scores, which is comparable to the 16% of a standard deviation predicted by early math ability.

Altogether, our results suggest one major theme: early EF is important in the development of later academic skills. Not only is EF a unique predictor of 5th grade math and reading ability, but our analysis suggests that high levels of early EF can help to compensate for low levels of academic ability in PreK. This interaction between math and EF in PreK is of particular interest and merits additional investigation. This finding extends prior analyses from this dataset demonstrating that EF moderates the magnitude of the association between PreK and kindergarten math (Blair et al., 2016). If indeed there is a group of children with high EF whose 5th grade skills were on par with those of peers who were more proficient in math in PreK, this may support other empirical evidence that suggests intervention on early EF is important for success in school. A person-centered analysis may shed light on this.

The relation between PreK math and 5th grade reading, as well as the relation between the interaction of PreK math and EF and 5th grade reading merits additional discussion. As was suggested by Duncan et al. (2007), the association between early math and later reading may be spurious, despite extensive and robust controls for home and individual cognitive characteristics; however, as this finding has now been replicated in multiple large, prospective datasets (e.g., Duncan et al., 2007; Pagani et al., 2010), it seems likely there is some signal through the noise. Importantly, in the present investigation and others that have found the association between early math and later reading, the assessment of early mathematical skills privileges word problems, which are read to the participant by a trained assessor. In order to correctly solve each problem, participants must understand the demands of the task, decode what the problem is asking them to do, compute and discover the response, and respond in an appropriate way. These steps require engagement of EF and are, in many ways, also central to reading comprehension. In contrast, the assessment of early reading skills in the present investigation requires knowledge of letter words and sounds, which is an important facet of learning to read, but is less relevant for children once they make the transition to reading to learn.

Ultimately, the present investigation contributes to the growing literature about the role of EF in education. Other studies have found EF is a strong and stable predictor of later academic skills (Best et al., 2011). This is an important and provocative finding; however, additional research is needed to better understand the mechanisms by which EF contributes to the development of academic skills. Various hypotheses have been tested and have suggested there may be a role of EF in fostering positive relationships with teachers (Blair et al., 2016) and in promoting self-regulatory behaviors in the service of learning from instruction in the classroom (Brock et al., 2009), or completing homework outside the classroom (Langberg et al., 2013). It is likely a combination of these and other skills that serves as the mediating mechanism by which EF affects academic learning. These may also account for the finding here that high levels of EF serve as a way for children to "catch up" even though they have low levels of math ability in PreK: Children who are able to leverage their high levels of EF to be more engaged, attentive, and productive inside and outside the classroom may ultimately learn more material. Separately, it may be that higher levels of academic skills promote the development of EF, as has been found previously (Daneri and Blair, 2017). Further research is needed to better understand both the uni- and bi-directional relations between EF and academic skills so as to better intervene on and foster the development of EF in young children.

The role of EF in the development of math skills is well established. This study is consistent with the findings of a number of prior analyses, which suggest early EF is related to math ability throughout the academic lifespan (Mazzocco and Kover, 2007; Bull and Lee, 2014), and that having high levels of at least some aspects of EF (e.g., working memory) may be associated with faster growth in math ability (Lee and Bull, 2016). This study is somewhat unique, however, in controlling for a host of covariates, including academic knowledge and ability measured prior to school entry as well as multiple highly robust correlates of both EF and academic achievement, namely processing speed, receptive vocabulary, and general intelligence. Notably, in this analysis, the magnitude of the association of EF was greater in predicting math than reading skills in the 5th grade. Prior research has also found this to be the case (Best et al., 2011). The role of EF, particularly working memory, in reading and vocabulary, however, is well established (e.g., Daneman and Carpenter, 1980; Alloway, 2010; Loosli et al., 2012; Karbach and Verhaeghen, 2014). In part, associations between EF and reading are attributable to a close association between EF and language development (Gathercole and Baddeley, 1989; Daneman and Merikle, 1996). Specific effects of EF on reading are seen most consistently for reading comprehension and fluency, however, rather than for more basic, knowledge-based aspects of reading, such as knowledge of letters and words (Cutting et al., 2009; Sesma et al., 2009; Kieffer et al., 2013; Fuchs et al., 2015). The Woodcock–Johnson Brief Reading Cluster analyzed here combines the letter-word subtest with the passage comprehension subtest, and it may be that this more knowledge-based aspect of the assessment led to a reduced association between EF and reading. This may also have important implications as to why we do not find a significant interaction of PreK Letter-Word scores and EF: it may be that the variance explained in 5th grade Brief Reading Cluster scores by letter-word scores is unrelated to EF, and that variance attributable to EF is limited to passage comprehension. It may also be that the relation of EF to mathematics is in fact stronger in the elementary grades.

Of additional interest, our results reveal an association of child gender with scores on math, but not reading. Extensive research has suggested a correlation between cultural beliefs of gender stereotypes in academic performance and the realized gender-based gap in performance on math and science on an international level (Nosek et al., 2009). Indeed, within the United States, there has long been a documented strong cultural stereotype that math is a male domain (Hyde et al., 1990). These beliefs are embedded in our daily lives and can be seen both implicitly and explicitly in children as early as elementary school (Lummis and Stevenson, 1990; Cvencek et al., 2011). Active efforts are being made to better understand how gender stereotypes about academic achievement are communicated to young children (Gunderson et al., 2012), and to intervene on and mitigate the effects of the cultural embeddedness of gender stereotypes in math and science fields (Beede et al., 2011).

There are several limitations that must be addressed in the context of this investigation. First, it is important to note that while this study is longitudinal in nature, causality cannot be inferred. Second, there is a large literature that has described the importance of teacher, school, and classroom characteristics in the development of early academic skills (including math, reading, and EF) and growth in those academic skills throughout schooling. In the present study, we lack measurement of instructional quality and school and classroom context. These are important omitted variables that may account for additional variance in outcome measures. Third, the present sample is limited to only two regions of the United States, and results may not generalize to other regions or to regions outside the United States. The current findings may only apply to children from rural areas of the United States, or to children born to low-income families. Fourth, assessment of academic skills was limited to standardized assessments administered by research assistants. While performance on these assessments is generally correlated with performance on formative and summative assessments in school contexts, it is likely these assessments capture only some aspects of math and reading achievement. Finally, it is important to note that the measurement of both EF and math and reading is complex, and though we use well-established and comprehensive measures, there remain aspects of those constructs that go unmeasured. For example, one of our assessments of EF assesses aspects of short-term memory in addition to working memory, and working memory cannot be isolated. Similarly, the assessment of math ability privileges certain aspects of mathematics knowledge (e.g., counting, cardinality, and operations) over others (e.g., geometry).

Despite these limitations, results from the present investigation make a strong case for the importance of early skills. Beyond math and reading, there should be a focus in early childhood education on the development of EF, as EF fosters the development of high-level math and reading in late elementary school, and may even serve as a mechanism by which children can catch up to their high-achieving peers.

## ETHICS STATEMENT

This study was carried out in accordance with the recommendations of the Institutional Review Board at Pennsylvania State University and the Office of Human Research Ethics at the University of North Carolina with written informed consent from all subjects. All subjects gave written informed consent in accordance with the Declaration of Helsinki. The protocol was approved by the Institutional Review Board at Pennsylvania State University and the Office of Human Research Ethics at the University of North Carolina.

## AUTHOR CONTRIBUTIONS

AR conceptualized the study, carried out the initial analyses, drafted the initial manuscript, and approved the final manuscript as submitted. CB and MW reviewed and revised the manuscript, and approved the final manuscript as submitted.

## FUNDING

Support for this research was provided by the National Institute of Child Health and Human Development grants R01 HD51502 and P01 HD39667 with co-funding from the National Institute on Drug Abuse.




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Ribner, Willoughby, Blair and The Family Life Project Key Investigators. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

## Inhibition, Updating Working Memory, and Shifting Predict Reading Disability Symptoms in a Hybrid Model: Project KIDS

Mia C. Daucourt<sup>1</sup>\*, Christopher Schatschneider<sup>1,2</sup>, Carol M. Connor<sup>3</sup>, Stephanie Al Otaiba<sup>4</sup> and Sara A. Hart<sup>1,2</sup>\*

<sup>1</sup> Department of Psychology, Florida State University, Tallahassee, FL, United States, <sup>2</sup> Florida Center for Reading Research, Florida State University, Tallahassee, FL, United States, <sup>3</sup> School of Education, University of California, Irvine, Irvine, CA, United States, <sup>4</sup> Department of Teaching and Learning, Southern Methodist University, Dallas, TX, United States

#### Edited by:

Jacob A. Burack, McGill University, Canada

#### Reviewed by:

Jurgen Tijms, University of Amsterdam, Netherlands; Natasha Kirkham, Birkbeck University of London, United Kingdom

#### \*Correspondence:

Mia C. Daucourt, daucourt@psy.fsu.edu; Sara A. Hart, hart@psy.fsu.edu

#### Specialty section:

This article was submitted to Developmental Psychology, a section of the journal Frontiers in Psychology

Received: 03 March 2017; Accepted: 12 February 2018; Published: 20 March 2018

#### Citation:

Daucourt MC, Schatschneider C, Connor CM, Al Otaiba S and Hart SA (2018) Inhibition, Updating Working Memory, and Shifting Predict Reading Disability Symptoms in a Hybrid Model: Project KIDS. Front. Psychol. 9:238. doi: 10.3389/fpsyg.2018.00238

Recent achievement research suggests that executive function (EF), a set of regulatory processes that control both thought and action necessary for goal-directed behavior, is related to typical and atypical reading performance. This project examines the relation of EF, as measured by its components, Inhibition, Updating Working Memory, and Shifting, with a hybrid model of reading disability (RD). Our sample included 420 children who participated in a broader intervention project when they were in kindergarten through third grade (age M = 6.63 years, SD = 1.04 years, range = 4.79–10.40 years). At the time their EF was assessed, using the parent-report Behavior Rating Inventory of Executive Function (BRIEF), they had a mean age of 13.21 years (SD = 1.54 years; range = 10.47–16.63 years). The hybrid model of RD was operationalized as a composite consisting of four symptoms, and set so that any child could have any one, any two, any three, any four, or none of the symptoms included in the hybrid model. The four symptoms include low word reading achievement, unexpected low word reading achievement, poorer reading comprehension compared to listening comprehension, and dual-discrepancy response-to-intervention, requiring both low achievement and low growth in word reading. The results of our multilevel ordinal logistic regression analyses showed a significant relation between all three components of EF (Inhibition, Updating Working Memory, and Shifting) and the hybrid model of RD, and that the strength of EF's predictive power for RD classification was highest when RD was modeled as having one or more symptoms.
Importantly, the chances of being classified as having RD increased as EF performance worsened and decreased as EF performance improved. The question of whether any one EF component would emerge as a superior predictor was also examined and results showed that Inhibition, Updating Working Memory, and Shifting were equally valuable as predictors of the hybrid model of RD. In total, all EF components were significant and equally effective predictors of RD when RD was operationalized using the hybrid model.

Keywords: reading, reading disability, hybrid model, executive function, shifting, updating, working memory, inhibition

## INTRODUCTION


Moving away from a focus on general intelligence, achievement research has shifted to an emphasis on other cognitive and behavioral correlates of academic achievement, including self-regulation. One of the main components of self-regulation is a concept originally introduced by Baddeley and Hitch (1974) as the "central executive," which is currently referred to as "executive function." Executive function (EF) comprises the skills required for an individual to work toward a goal and make judgments in novel, unforeseen situations and includes regulation of both thought and action. Examples of these self-directed skills include planning ahead, problem solving, decision making, attention maintenance and direction, emotional regulation, and behavioral control (Sesma et al., 2009).

Due to the broad scope of the processes and capacities mediated by EF, there is a lack of consensus among researchers about the specific constituents that make up the EF construct (Sadeh et al., 2012). A central question is whether EF is a unified construct, like g for intelligence, or a multicomponent system. The unity and diversity paradigm (Miyake et al., 2000; Becker et al., 2014) reconciles this debate by claiming that EF is both unitary and divisible into subcomponents, which are both inter-related and separate. The shared variance among EF components points to a common thread present in all EF abilities, while the unique variance linked to each individual constituent represents what is distinctive about that particular component of EF (Miyake et al., 2000; Miyake and Friedman, 2012). Research has shown support for EF as an independent yet unitary construct in younger children in both pre-kindergarten and kindergarten (Miyake and Friedman, 2012; Fuhs et al., 2014). On the other hand, research conducted with older children (Brocki and Bohlin, 2004), twins (Friedman et al., 2008), children and adolescents with brain damage (Levin et al., 1996), neurocognitive pathologies (e.g., Culbertson and Zillmer, 1998; Poljac et al., 2010), and typically developing elderly populations (Robbins et al., 1998) has provided evidence for a multicomponent EF system (Lehto et al., 2003; Huizinga et al., 2006). Additionally, many EF tasks that tap presumably separate EFs are not significantly correlated (Miyake et al., 2000; Banich, 2009), which may further indicate the existence of multiple EF constituents.

Even though the precise rudimentary components of EF are still debated, the most common division of EF includes three components: prepotent response inhibition, updating and monitoring of working memory, and mental set shifting (Miyake et al., 2000; Davidson et al., 2006; Best and Miller, 2010; Miyake and Friedman, 2012). "Inhibition" is the capacity to obstruct automatic or dominant responses when they are not appropriate for the context at hand (Miyake et al., 2000; St Clair-Thompson and Gathercole, 2006; Toplak et al., 2013). It includes the ability to suppress the influence of interfering information (Barkley, 1999; Bexkens et al., 2015), and in the case of reading this means being able to suppress the irrelevant meanings of a current word based on the context in which it is nested, or to stop reading at the end of your assigned paragraph when reading aloud in class. "Updating Working Memory" is a screening and coding system that reviews information based on its circumstantial significance, constantly eliminating extraneous information and replacing it with more relevant information. It also represents our cognitive capacity for simultaneous processing of multiple tasks, and in the case of reading, these tasks could include decoding unknown words (Sesma et al., 2009), retrieving the meaning of known words (Sesma et al., 2009), remembering previously read text, and anticipating upcoming text (Daneman and Carpenter, 1980; Sesma et al., 2009; Nouwens et al., 2016). "Shifting" involves back and forth movement between tasks and higher and lower levels of mental processing. It enables us to adapt dynamically to changing task demands and contexts (Deák and Narasimham, 2003; Poljac et al., 2010), and in the case of reading, for example, this could mean mental movement between different verb tenses, known and unknown words, or even between reading environments, such as quietly reading at school versus reading aloud for entertainment at home.

These three components of EF play an important role in learning and memory (McCauley et al., 2010), and have been consistently linked to educational achievement outcomes in reading and math for a variety of age groups (St Clair-Thompson and Gathercole, 2006; McClelland et al., 2007; Foy and Mann, 2013; Becker et al., 2014; Fuhs et al., 2014). For the present study, we will use the theoretically-postulated three component model of EF that includes Inhibition, Updating Working Memory, and Shifting in order to determine the association of EF to reading disability (RD), as well as examine the unique associations of distinctive aspects of EF with RD.

## EF and Reading Disability

Reading difficulties are one of the most pervasive learning impairments found among school-aged children, with 5–10% of students experiencing problems with reading (Compton et al., 2014). Far and beyond all other theories, the prevailing explanation for reading difficulties is a deficit in phonological processing, but recent work has shown that insufficiencies in the EF system may also be underlying deficient reading development (Gombert, 2003; Altemeier et al., 2008; Booth et al., 2010). For example, when accounting for deficiencies in the phonological system, children with RD still have shown diminished performance on tasks assessing EF, such as inhibition and working memory (Swanson et al., 2006; Altemeier et al., 2008; Booth et al., 2010). Indeed, evidence shows that EF deficits are a fundamental feature of RD (Gioia et al., 2002).

The important link between reading and EF lies in the transition from learning to read to reading to learn. At first, the linguistic knowledge necessary for reading is acquired implicitly through regular exposure to patterns in orthography, phonology, and morphology (Gombert, 1992, 2003). This unconscious exposure leads to the creation of a subconsciously organized and instance-bound linguistic lexicon that includes rudimentary awareness of grapheme-to-phoneme correspondence (GPC; Gombert, 2003). Then, formal reading instruction begins, ramping up print exposure, while students are taught the explicit rules of GPC. As reading instruction advances, students must be able to take conscious control and monitor their linguistic lexicons in order to respond to unexpected external demands, like reading an unknown word (Gombert, 2003) or properly resolving a conflict between phonology and orthography based on the current context (Bitan et al., 2009). This movement from subconscious to conscious and implicit to explicit is accompanied by a developmental increase in executive control (Gombert, 2003; Bitan et al., 2009). The executive control conferred by EF modulates both top-down and bottom-up processing according to reading task demands (Bitan et al., 2009), which enables readers to discriminate between task-relevant and task-irrelevant information quickly (Bitan et al., 2009) so that reading may become automatic (Gombert, 2003). Without the level of mastery and quick adaptability that EF makes possible, reading difficulties may emerge.

There are many inconsistencies found in the literature exploring the exact role of EF in relation to reading difficulties. According to meta-analytic work, these incongruities can be boiled down to two main moderators: RD definitions and EF task modalities (Stuebing et al., 2002; Booth et al., 2010). Despite the poor 1-year stability of IQ-achievement discrepancy definitions (Schatschneider et al., 2016), they have continued to be one of the most common RD definitions utilized in practice (Spencer et al., 2014), including their use in studies linking EF and RD (e.g., Altemeier et al., 2008). Based on the results from two meta-analyses (Stuebing et al., 2002; Booth et al., 2010), the association of EF to IQ-achievement discrepancy definitions of RD shows lower mean effect sizes than the association of EF to non-discrepancy definitions of RD (Booth et al., 2010). One possible explanation for this difference is that IQ-discrepant readers are no different than IQ-consistent readers, and that the RD definition used only appears to matter because of differences in EF and IQ task modality (Booth et al., 2010). For example, verbal versus non-verbal IQ may yield different results when included with EF in an IQ-discrepant RD framework, especially depending on whether a verbal or non-verbal EF task is used. Since a majority of EF tasks incorporate a verbal component (e.g., Altemeier et al., 2008), it is difficult to parse out whether the driving force behind the task performance is truly EF or the phonological or verbal processing needed to complete the task.

In response to the confusion presented in the literature examining the role of EF in reading achievement due to EF task modality and RD definitions, we implemented two main techniques in the present study to assist in drawing clearer conclusions about the association between EF and RD. First, we measured EF with the Behavior Rating Inventory of Executive Function (BRIEF), which is a non-verbal instrument that captures EF as it is manifested behaviorally. By employing a nonverbal EF measure, we are able to avoid the uncertainty about the contributing role of verbal processing to EF performance. In addition, most of the work on EF and RD thus far has used cognitive indices of EF ability (e.g., Altemeier et al., 2008), which have been shown to correlate poorly with behavioral EF measures (e.g., McCauley et al., 2010), so our use of the BRIEF subscales in predicting RD may yield interesting new results. Second, in order to avoid the potential pitfalls embedded in the use of IQ-discrepant RD definitions, we employed a hybrid model approach to RD classification, the benefits of which we discuss in more detail in the following section.

## Hybrid Models for RD Classification

A promising solution to the low reliability of IQ discrepancy-based and other single-criterion RD models, and their inconsistent association with EF, is the implementation of hybrid models for RD classification (Wagner, 2008; Waesche et al., 2011; Fletcher et al., 2013; Spencer et al., 2014). Historically, RD models have relied on a single benchmark, which is most commonly based on either IQ-achievement discrepancy (Bateman, 1965), cognitive discrepancy (Stuebing et al., 2012), or response-to-intervention (RTI) or instruction (Fuchs and Fuchs, 2006) definitions. Hybrid models take a multi-component approach to RD measurement, which makes them comparatively more stable than other RD classification techniques (Spencer et al., 2014; Schatschneider et al., 2016). Fundamentally, the hybrid model approach to RD classification is based on the idea that a construct is more precisely captured by measuring it in many ways. As such, the hybrid model employed in the present study defined RD as a latent construct made up of four measured symptoms that are described in more detail below.

A recent publication by Spencer et al. (2014) examined the 1- and 2-year stability of a hybrid model approach to RD classification utilizing four indicators of RD that were all chosen based on traditional RD definitions. The RD indicators included low achievement in word reading, unexpected low achievement in word reading, poorer reading comprehension compared to listening comprehension, and a dual-discrepancy RTI model that necessitated both low achievement and low growth in word reading. In this version of the hybrid model, RD was characterized using a symptom approach, similar to the method employed in the Diagnostic and Statistical Manual of Mental Disorders (5th ed.; DSM-V; American Psychiatric Association, 2013). The symptoms were calculated utilizing cutoff points ranging in severity from the 3rd to the 25th percentile, and regardless of which symptom was examined, results revealed that severity and stability were inversely related, with the highest cutoff points (i.e., the 25th percentile), yielding the most stable results for RD classification. In the same vein, Schatschneider et al. (2016) conducted a simulation study that found that hybrid models that incorporated many symptoms of RD, instead of any one RD benchmark, provided the most stable classification scheme for RD, and that the 25th percentile was also the most stable cutoff point for each symptom in a constellation (i.e., multisymptom) model. The findings of these two investigations, as well as the results yielded by a similar study conducted by Waesche et al. (2011), provide clear evidence for the advantages of using a hybrid model that classifies RD as a latent construct made up of many measured symptoms of RD as the most reliable and state of the science approach to RD classification. 
Additionally, the methods and findings outlined by these papers (Waesche et al., 2011; Spencer et al., 2014; Schatschneider et al., 2016) clearly point to the 25th percentile as the best cutoff point for achieving the highest reliability in RD symptom identification within a hybrid model. Accordingly, a 25th percentile cut was utilized for the calculation of each RD symptom in the hybrid model utilized in the present study. Next, we will take a closer look at each of the four hybrid model symptoms. The first symptom, low achievement in word reading, represents the simplest conceptualization of RD, and is based on the fact that students who fail to reach a certain level of word reading performance at the end of the school year are likely to be reading disabled and require some form of additional intervention. Unexpected low achievement, an IQ-achievement discrepancy definition for RD, was operationalized as unexpectedly low word reading achievement based on verbal aptitude. This symptom was modeled after the idea that a student who demonstrates a certain capacity in his/her general intelligence should be able to translate that same capacity into all domains, including reading achievement. Our reason for focusing on word reading in these initial two symptoms was that phonological awareness, which underlies word reading ability, serves as a precursor to more advanced reading skills (e.g., Holloway et al., 2015), so it provides a useful early indicator of reading difficulties before reading demands become more advanced (Fletcher et al., 2007; Spencer et al., 2014).

In an effort to differentiate between reading-specific deficits and general insufficiencies in overall cognitive processing, the hybrid model also included a symptom based on a cognitive discrepancy definition of RD (Torgesen, 2002; Spencer et al., 2014). Cognitive discrepancy refers to a situation in which a student's achievement in one cognitive domain outperforms his/her achievement in another cognitive domain. More specifically, the symptom was defined as poorer reading comprehension compared to listening comprehension performance. The advantage of this symptom is that it picks up on students who may not be performing poorly in general, but are failing to achieve at the same level in reading as they are in other domains.

Finally, the dual-discrepancy RTI RD symptom requires two elements for qualification: low word reading growth over the school year in conjunction with low end-of-the-year word reading performance (Schatschneider et al., 2016). This symptom is based on the fact that when a student receives direct reading instruction in a classroom and fails to grow or reach a certain level of reading achievement it is an indication that the child has failed to respond to intervention or instruction (Fuchs et al., 2002; Spencer et al., 2014; Schatschneider et al., 2016).
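The symptom-count logic described above can be illustrated with a minimal sketch. It is not the authors' scoring procedure: the field names are hypothetical, each index is assumed to be already expressed as a percentile rank, and "at or below the 25th percentile" is assumed to count as low. Only the 25th percentile cut, the four symptoms, and the requirement that the dual-discrepancy symptom needs both low achievement and low growth come from the text.

```python
from dataclasses import dataclass
from typing import Dict

CUTOFF_PERCENTILE = 25.0  # the cut the text reports for every symptom


@dataclass
class ChildPercentiles:
    """Hypothetical percentile ranks (0-100); lower means weaker performance."""
    word_reading: float            # end-of-year word reading achievement
    word_reading_vs_verbal: float  # word reading relative to verbal aptitude
    reading_vs_listening: float    # reading relative to listening comprehension
    word_reading_growth: float     # growth in word reading over the school year


def is_low(percentile: float) -> bool:
    # Assumption: a score at or below the cutoff counts as low.
    return percentile <= CUTOFF_PERCENTILE


def symptom_flags(child: ChildPercentiles) -> Dict[str, bool]:
    low_achievement = is_low(child.word_reading)
    return {
        "low_achievement": low_achievement,
        "unexpected_low_achievement": is_low(child.word_reading_vs_verbal),
        "reading_below_listening": is_low(child.reading_vs_listening),
        # Dual discrepancy requires BOTH a low end-of-year level and low growth.
        "dual_discrepancy_rti": low_achievement and is_low(child.word_reading_growth),
    }


def symptom_count(child: ChildPercentiles) -> int:
    # Ordinal outcome: a child can show 0, 1, 2, 3, or 4 symptoms.
    return sum(symptom_flags(child).values())
```

Under these assumptions, a child who is low on word reading achievement but shows adequate growth would receive the low-achievement symptom without the dual-discrepancy symptom, which is the distinction the two definitions are meant to capture.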

## Specific EF Components and Reading Disability

Past research reveals mixed findings on the differential role of specific EF components associated with RD (e.g., Swanson, 2003; Swanson et al., 2006; Booth et al., 2010; Sáez et al., 2012). There is a general consensus that inhibition is a fundamental element of executive processing, which both allows for the development, and also constrains the performance, of all other executive functioning components (Miyake et al., 2000; Foy and Mann, 2013). On its own, inhibitory ability may be especially important for early reading skills, like processing or making judgments about phonemes (Foy and Mann, 2013). In the case of working memory, poor inhibitory skills are likely to lead to intrusion errors (DeBeni et al., 1998; Foy and Mann, 2013) and the expression of inappropriate responses or guesses (Stevens et al., 2009; Foy and Mann, 2013), and for shifting, poor inhibition will likely result in representational inflexibility, such as an over-reliance on sight word reading (Diamond, 2002). All of these scenarios may create circumstances in which reading difficulties and errors are more likely (Reiter et al., 2005; Altemeier et al., 2008). For example, performance on the Stroop task, a test of inhibitory ability, is diminished in children with reading difficulties (Everatt et al., 1997; Booth et al., 2010). However, as a task that requires reading, the Stroop task may be revealing reading difficulties unrelated to EF. Even though some work has failed to find a significant difference between typically-developing and RD readers on tests of inhibitory control (Bexkens et al., 2015), Reiter et al. (2005) found that children with RD were impaired on inhibitory tasks in their processing time and error correction abilities and were more likely to commit more errors overall. 
These results provide support for similar findings of specific inhibitory decrements in children with reading difficulties (DeBeni et al., 1998; Altemeier et al., 2008), even when controlling for age, short-term memory, and vocabulary (Foy and Mann, 2013). Further research into the specific relation between inhibition and RD would contribute to resolving these inconsistent findings.

Working memory is the most extensively explored of the EFs (e.g., Swanson, 2003; Pickering and Gathercole, 2004; Reiter et al., 2005; Gathercole et al., 2006; Cutting et al., 2009; Kieffer et al., 2013), but its relation with RD is still not fully understood. While the original conceptualization of EF by Baddeley and Hitch (1974) was divided into a three-part system comprised of the phonological loop, the visuospatial sketchpad, and the central executive, the results yielded by modern factor analytic work (e.g., Miyake et al., 2000) have transformed the contemporary operationalization of EF by creating a working memory component (along with inhibition and shifting components; Miyake et al., 2000). Most commonly, the phonological loop and visuospatial sketchpad are either collapsed into a singular working memory construct (e.g., Cutting et al., 2009), or re-conceptualized as verbal working memory and non-verbal working memory, respectively (e.g., Gathercole et al., 2006). Importantly, studies have found a vital link between working memory and literacy skills, whether working memory was operationalized singularly (Cutting et al., 2009) or divided into its verbal and non-verbal parts (Gathercole et al., 2006). There is evidence that verbal working memory may be especially important for reading-related skills (St Clair-Thompson and Gathercole, 2006; Foy and Mann, 2013), along with evidence that working memory, as a singular measure, also supports reading comprehension and reading fluency growth in school-aged children (Swanson and Jerman, 2007). Given the evidence for a significant relation between reading outcomes and working memory, regardless of which conceptualization was employed, we chose to utilize a single-component definition for working memory in the present study.

Looking specifically at reading difficulties, the relations found with working memory and RD are also mixed. In a study of 6- to 49-year-olds that examined the relation of working memory and reading difficulties, working memory deficits were present in individuals with reading difficulties across all ages (Chiappe et al., 2000). Even when accounting for potentially confounding variables by including a battery of tasks for related cognitive skills, RD readers have still demonstrated working memory impairments (Reiter et al., 2005). In some cases, the association of working memory and (word-level) reading difficulties has lacked a significant relation because EF deficits can be fully accounted for by shortcomings in decoding (Sesma et al., 2009) or phonological processing (Locascio et al., 2011), while other studies have shown that poor reading performance cannot be fully attributed to insufficiencies in the phonological system (Swanson, 2003; Swanson et al., 2006). In fact, reading success is most likely the product of both phonological processing skills and the supportive role played by updating working memory (Iglesias-Sarmiento et al., 2015). Beyond phonological processing, studies that account for additional cognitive abilities, like intelligence, still find suppressed working memory task performance in children with reading difficulties (e.g., Gathercole et al., 2006; Swanson et al., 2006). These findings may be explained by the fact that working memory capacity constrains an individual's ultimate level of proficiency in any academic realm, including reading, by serving as a limiting factor on the amount of knowledge and skill an individual can ultimately acquire (Gathercole et al., 2006). It stands to reason that individuals who demonstrate difficulties in reading may simply have a low working memory capacity that limits their capability for reading skill acquisition. By including Updating Working Memory in our model of EF we hope to further elucidate its role in reading disabilities.

Shifting, the third component of the EF model proposed by Miyake et al. (2000), is the most under-explored of the EFs, and findings about its role in reading performance are still conflicting (Stoet et al., 2007). There is some evidence that shifting may be a weaker predictor of reading skills deficits (Bierman et al., 2008) and early literacy skills (e.g., Foy and Mann, 2013) than inhibition and working memory. In fact, some investigators posit that shifting is simply an expansion of inhibitory control and its interaction with attention, and not a separable skill (Diamond, 2002; Diamond et al., 2005). Although others have found a specific role for shifting in processing linguistic information (Wolf et al., 1986), recent neurological work utilizing EEG technology has found that children with RD do not show impaired performance on shifting tasks when compared with typically developing controls (Horowitz-Kraus, 2014). On the contrary, Poljac et al. (2010) found a shifting-specific delay in RD children but not in autistic children. When considering these conflicting results, there is an obvious need for further exploration of the role played by shifting for children with reading difficulties.

## Present Study

Taken together, these findings suggest that there is a relation between executive functioning and RD. Overall, EF and its component skills contribute to reading by helping students organize, recall, and integrate new and existing information, but the details of the specific relation between EF and RD are still mixed. Furthermore, work examining the association of EF with RD has not previously used a hybrid model approach for defining RD, which is a more comprehensive and modern definition of RD than single-criterion models. In this paper, we will examine the relation of the three-component model of EF, which includes Inhibition, Updating Working Memory, and Shifting (Miyake et al., 2000; Booth et al., 2010; Nouwens et al., 2016), with the hybrid model of RD (Spencer et al., 2014; Schatschneider et al., 2016). Moreover, we will explore the predictive strength of each EF component skill in order to determine whether one EF is more important for RD identification or not. Our first research question was "How does EF predict RD classification in a hybrid model of RD?" Our second research question was "Is one EF component more important than the others for RD classification?"

## MATERIALS AND METHODS

## Participants

The participants in this study were 420 children (51.20% female) who participated in Project KIDS. Project KIDS had two components. The first component involved combining data, using integrative data analysis (IDA), from eight completed literacy and math randomized-control trial intervention projects that occurred in north Florida schools at some point during the 2005–2006 to the 2012–2013 school years (Connor et al., 2007, 2011a,b, 2013; Al Otaiba et al., 2011a,b, 2014a,b). This data integration resulted in a dataset of literacy, math, and related achievement tests of 3868 children, which then served as the population to draw from for the second component of Project KIDS. This second component involved an extensive parental questionnaire, including a parental report of EF. During the spring and summer of 2014, questionnaires were mailed to the last known addresses of the original intervention participants' families. The final sample size for the second component of Project KIDS was n = 445; however, only n = 420 had EF data available, so those 420 participants were included in all analyses.

Given the low response rate for the questionnaire portion of Project KIDS, the original population (n = 3868) and the sample of this current report (n = 420) were compared on the achievement measures used in this report, as well as on demographic information. There were significant differences between the groups for word reading [t(3315) = 3.46, p < 0.01; original population M = 35.52, SD = 11.88, n = 2946; report sample M = 37.77, SD = 11.21, n = 371], reading comprehension [t(3320) = 3.45, p < 0.01; original population M = 17.94, SD = 7.47, n = 2942; report sample M = 19.34, SD = 7.17, n = 380], and vocabulary [t(3348) = 4.11, p < 0.0001; original population M = 20.10, SD = 3.49, n = 2977; report sample M = 20.89, SD = 3.53, n = 373]. There were no significant differences between the original population and the current report sample for age [t(3864) = −1.36, p = 0.18], sex [χ²(1) = 1.42, p = 0.23], or race-ethnicity [χ²(1) = 0.89, p = 0.35], although there was a significant difference for free and reduced lunch status, with fewer students in the questionnaire sample qualifying for free or reduced lunch [χ²(1) = 4.67, p = 0.03].
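The group comparisons above can be checked from the reported summary statistics alone. The following is a minimal Python sketch, assuming a pooled-variance (Student's) two-sample t-test, which is consistent with the reported degrees of freedom (n1 + n2 − 2 = 3315 for word reading). The source does not state which variant was used, and the function name `pooled_t` is illustrative.

```python
import math

def pooled_t(m1, sd1, n1, m2, sd2, n2):
    """Two-sample t statistic with pooled variance, from summary statistics.

    Returns (t, df). Equal population variances are assumed here; the
    source does not say which t-test variant was actually run.
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m2 - m1) / se, df

# Word reading: original population vs. report sample (values from the text)
t_word, df_word = pooled_t(35.52, 11.88, 2946, 37.77, 11.21, 371)
print(f"t({df_word}) = {t_word:.2f}")
```

With the rounded means and standard deviations from the text, the statistic should land close to the reported t(3315) = 3.46.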

The original intervention projects occurred when the children were in kindergarten, first, second, or third grade (age M = 6.63 years, SD = 1.04 years, range = 4.79–10.40 years), although at the time of questionnaire completion, the participants had a mean age of 13.21 years (SD = 1.54 years; range = 10.47–16.63 years). The demographic distribution of the current sample included 56.56% White, 35.08% Black/African American, 5.73% Other or Mixed children. Parental informed consent in writing was obtained for all participants in Project KIDS. The Florida State University Institutional Review Board approved all aspects of Project KIDS.

## Measures

One caregiver of each child from the original intervention projects (the biological mother in 88% of cases) was asked to complete a questionnaire either by mail or online, using Qualtrics. This questionnaire asked about the parents' basic demographic information, such as age, education level, occupation, household income, ethnicity, and race and about the siblings of the child involved in the study, including their age, gender, and relationship to the participant. The questionnaire also included a section on family medical history that asked about learning difficulties and learning disability diagnoses, and a series of questionnaires concerning the home environment, child's behaviors (including the BRIEF), nutrition, and sleep habits. All children completed a large battery of cognitive ability and achievement measures during the original intervention projects' protocols, usually administered three times during the original intervention year: early fall, winter (early spring semester), and late spring semester.

#### Behavior Rating Inventory of Executive Function

The parent form of the BRIEF (Gioia et al., 2002) is an 86-item questionnaire that assesses the EFs of children. Parents were asked to read a list of statements that describe their child and report on whether their child had problems with the listed behaviors over the past 6 months using a 3-point scale (Never, Sometimes, Often). Each item loads onto one of eight scales (Inhibit, Shift, Emotional Control, Initiate, Working Memory, Plan/Organize, Organization of Materials, and Monitor), which combine into two summary measures and one composite score. The goal of these indices is to detect possible deficiency in one or more areas of EF based on child behavior (high BRIEF scores correspond to low executive functioning; McCauley et al., 2010). For the present report, the Working Memory, Shift, and Inhibit scales were used. Reliabilities in this sample for all three were good (Cronbach's alphas: Inhibition = 0.93, Updating Working Memory = 0.92, Shifting = 0.87).
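For readers unfamiliar with the reliability index reported above, Cronbach's alpha for a scale of k items is conventionally defined as follows, where σ²ᵢ is the variance of item i and σ²_X is the variance of the total scale score (the symbols are ours, not the source's):

```latex
\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k} \sigma_{i}^{2}}{\sigma_{X}^{2}}\right)
```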

#### Woodcock–Johnson III Tests of Achievement Letter–Word Identification

The Woodcock–Johnson III (WJ) Tests of Achievement Letter–Word Identification subtest (LWID; Woodcock et al., 2007) is a norm-referenced standardized measure. It comprises 75 items that measure reading decoding, or the ability to visually recognize word forms or use phonological ability to pronounce words associated with word forms. Published median split-half reliability for the LWID is 0.94 (Schrank et al., 2001).

#### Woodcock–Johnson III Tests of Achievement Picture Vocabulary

The WJ Picture Vocabulary subtest (PV; Woodcock et al., 2007), which measures expressive language through picture naming, was used to assess children's vocabulary. The test–retest reliability on this test falls in a range of 0.70–0.81 (Schrank et al., 2001), and the assessment includes 44 items.

#### Woodcock–Johnson III Tests of Achievement Passage Comprehension

The WJ Passage Comprehension subtest (PC; Woodcock et al., 2007) is used to measure written text comprehension through matching of pictures with words and phrases and fill-in-the-blank sentences and paragraphs of increasing complexity. For ages 5–19, its median reliability is 0.88 (Schrank et al., 2001), and the assessment includes 47 items.

## Data Analytic Plan

Prior to analyses specific to this paper, IDA (Curran et al., 2008, 2014) was used to combine all eight intervention projects' early fall (pre-intervention) and late spring (post-intervention) LWID data. At the heart of IDA lies measurement invariance modeling. Measurement invariance modeling in IDA is a disciplined approach to combining datasets from multiple projects. IDA involves using a moderated non-linear factor analysis (MNLFA), which allows for raw item-level data to be combined across projects, modeling potential sources of heterogeneity (e.g., sampling, age/grade) using differential item functioning (DIF). In this case, the MNLFA was the equivalent of a 2-PL model with project, and age (both linear and quadratic terms) DIF modeled. As recommended by Curran et al. (2014), we randomly selected one time point per student for a calibration sample, and also pruned any item that did not have at least 5% coverage of responses (resulting in LWID items 11–75 being included). Using the calibration sample, we first tested for DIF on the factor mean and variance. Second, we tested for DIF on each item intercept and loadings, accounting for factor DIF. Any nonsignificant DIF for a parameter was constrained to equality. After the final model was settled, the full data was run using code where the final beta weights were fixed, and the factor score was saved out as the new LWID score for both time points for each child. All analyses were conducted in Mplus 7.3 (Muthén and Muthén, 1998–2012). After conducting IDA on LWID, we found that the new IDA LWID factor scores and the previous simply combined LWID raw total scores were correlated at r = 0.97 (early fall) and r = 0.99 (late spring). We believe these high correlations are the result of the WJ tests being developed using Item Response Theory models for their scoring and having standardized administration. 
The original project staff for all projects were very experienced, and the children were relatively close in age and geographic region, meaning that chances for DIF were minimized. Given the computational time for doing the IDA was very large (weeks of run time on a dedicated server) and the correlations between raw score and ability score were so high, we decided to use the raw total scores for all the WJ measures.
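As a reminder of what a 2-PL model specifies, a common intercept and loading parameterization is sketched below. The symbols are illustrative and not taken from the source: θᵢ is child i's latent word-reading ability, and νⱼ and λⱼ are the intercept and loading of item j. In an MNLFA, DIF is modeled by letting these parameters, and the mean and variance of θ, depend on covariates such as project and age; the exact specification the authors fitted is not given here.

```latex
P(y_{ij} = 1 \mid \theta_i) = \frac{1}{1 + \exp\left[-(\nu_j + \lambda_j \theta_i)\right]}
```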

Since the hybrid model was based on the WJ LWID, PV, and PC assessments administered in the original project, and some assessment data were missing, we first conducted multiple imputation (Rubin, 1987) to avoid case-wise deletion and enable all analyses to be conducted on a full data set with no missing values. Multiple imputation requires that all variables be normally distributed, and inspection of descriptive statistics confirmed that this assumption was met (see **Table 1**; skewness and kurtosis between ±2; Tannenbaum et al., 2009). Prior to performing the imputation, students missing all data for the assessments needed for symptom calculation were dropped from the sample (n = 168, 4% of overall sample), since they had no achievement data on which to estimate replacement values, resulting in a drop in the sample size from 4036 to 3868. As a next step, we assessed the missingness of each of our variables of interest and found that no variable was missing more than 14.19% of data (see **Table 1**). Additionally, a Shifting score was missing for one of the Project KIDS questionnaire participants, so Shifting was also included in the imputation model in order to replace the missing value. Finally, Proc MI (multiple imputation) in SAS 9.4 was used to impute 20 datasets based on the covariance matrix of all available data. The resulting 20 data sets were combined, and a mean score of all 20 data points was calculated for each missing value to replace previously missing data points. Pre-imputation descriptive statistics are presented in **Table 1**, and post-imputation descriptives are presented in **Table 2**. The tables show that the means and standard deviations before and after imputation are comparable, and that the data moving forward after multiple imputation are complete, with no missing values for the EF subscales or the WJ assessments used to calculate the hybrid model symptoms.

All values reflect raw scores prior to standardizing. Nmiss, number of missing observations; Updating WM, Updating Working Memory; WJ assessments, Woodcock–Johnson III Tests of Achievement; LWID, Letter–Word Identification subtest; PV, Picture Vocabulary subtest; PC, Passage Comprehension subtest.

TABLE 2 | Post-imputation descriptive statistics.

All values reflect raw scores prior to standardizing. Nmiss, number of missing observations; Updating WM, Updating Working Memory; WJ assessments, Woodcock–Johnson III Tests of Achievement; LWID, Letter–Word Identification subtest; PV, Picture Vocabulary subtest; PC, Passage Comprehension subtest.
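The final averaging step of the imputation procedure can be illustrated with a short Python sketch (the original analysis used Proc MI in SAS 9.4). The data and the function name `average_imputations` are hypothetical: each imputed dataset is a list of scores for the same children, and `None` in the observed data marks a missing value. Observed values are kept, and each missing value is replaced by the mean of its imputed values across datasets.

```python
from statistics import mean

def average_imputations(observed, imputed_sets):
    """Replace each missing (None) value with the mean of its imputed values."""
    completed = []
    for idx, value in enumerate(observed):
        if value is None:
            # Average the imputed values for this child across all datasets
            completed.append(mean(ds[idx] for ds in imputed_sets))
        else:
            completed.append(value)
    return completed

# Toy example with 3 imputed datasets (the study used 20); values are made up.
observed = [41, None, 37, None]
imputed_sets = [
    [41, 30, 37, 44],
    [41, 33, 37, 46],
    [41, 36, 37, 42],
]
completed = average_imputations(observed, imputed_sets)
print(completed)
```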

The hybrid model of RD was operationalized following Spencer et al. (2014) and Schatschneider et al. (2016), where students were categorized as having any one, two, three, or four symptoms of RD (modeling the "ANYn" categorization in Spencer et al., 2014). The four symptoms of RD included low word reading achievement, unexpected low word reading achievement, poorer reading comprehension compared to listening comprehension, and a dual-discrepancy RTI model that required both low growth and low achievement in word reading. All symptoms were calculated using the full sample of achievement data (n = 3868) in SAS 9.4. Low achievement was operationalized as any score below the 25th percentile on spring word reading scores. Unexpected low achievement, an IQ-achievement discrepancy definition for RD, was operationalized as unexpectedly low word reading achievement based on verbal aptitude. It was calculated by residualizing spring word reading scores on spring vocabulary scores (a proxy for verbal aptitude) and implementing a 25th percentile cut. Poorer reading comprehension compared to listening comprehension captured the cognitive discrepancy definition of RD, and was calculated by residualizing spring PC scores on spring PV scores (a proxy for listening comprehension; Senechal et al., 2006; Spencer et al., 2014). The dual-discrepancy RTI symptom, as its name suggests, required two parts and was calculated with a 25th percentile cut on the slopes (i.e., growth) and intercepts (i.e., end-of-theyear score) of each child's residualized gains in word reading. After the calculation of these four symptoms, children were assigned a value of 0 (i.e., not showing any symptom), 1 (showing any one symptom), 2 (showing any two symptoms), 3 (showing any three symptoms), or 4 (showing all four symptoms).
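The symptom-counting logic can be sketched as follows. This is a simplified Python illustration with hypothetical data and helper names, not the authors' SAS code: it residualizes one score on another with ordinary least squares, applies a 25th percentile cut, and sums the resulting flags. For brevity only three of the four symptoms are shown, since the dual-discrepancy RTI symptom additionally requires growth estimates. The nearest-rank percentile rule used here is an assumption, and SAS may define the percentile differently.

```python
import math
from statistics import mean

def residualize(y, x):
    """Residuals of y after a simple OLS regression of y on x."""
    mx, my = mean(x), mean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    intercept = my - slope * mx
    return [b - (intercept + slope * a) for a, b in zip(x, y)]

def below_25th(values):
    """Flag values strictly below the 25th percentile (nearest-rank rule)."""
    ordered = sorted(values)
    cut = ordered[math.ceil(len(ordered) / 4) - 1]
    return [v < cut for v in values]

def count_symptoms(lwid, pv, pc):
    """Number of RD symptoms (0-3 in this reduced sketch) per child."""
    low = below_25th(lwid)                          # low word reading
    unexpected = below_25th(residualize(lwid, pv))  # word reading below what vocabulary predicts
    rc_lt_lc = below_25th(residualize(pc, pv))      # reading comprehension below what vocabulary predicts
    return [int(a) + int(b) + int(c)
            for a, b, c in zip(low, unexpected, rc_lt_lc)]

# Hypothetical spring raw scores for eight children
lwid = [20, 35, 28, 40, 15, 33, 38, 30]
pv = [18, 22, 19, 24, 17, 16, 23, 21]
pc = [12, 20, 15, 22, 9, 19, 18, 17]
counts = count_symptoms(lwid, pv, pc)
print(counts)
```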

#### Research Question 1

Using the hybrid model symptoms assignment described above, we determined if the EF measures predicted RD using three proportional odds models for ordinal logistic regression analyses in a hierarchical linear modeling (HLM) framework. EF data were only available for students that participated in the second part of Project KIDS, the questionnaire follow-up study, so the full sample size in all analyses utilizing EF was 420. Since many children from the same classroom were part of the original interventions (and therefore were in this sample), HLM was used to control for teacher-level variance and account for any teacher effects. The use of proportional odds models was necessary in order to extend the standard binary logistic model to account for a response variable, like the hybrid model of RD, that had ordered categories (i.e., having four symptoms is worse than having three symptoms; Brant, 1990). Although the hybrid model is not set up so that any one symptom is considered more important or severe than any other, having more than one RD symptom qualifies as more severe RD because the child would be demonstrating difficulties in multiple reading domains. In a proportional odds model, the event being modeled, which in this case was RD status in the hybrid model of RD, is the outcome of being classified in a particular category or any later category, and in our analyses, any later category represents one additional symptom of RD. For instance, when Inhibition was used to predict RD status in the case of the three-symptom group, the model predicted the likelihood of being classified as having any three or four symptoms of RD. In predicting the two-symptom group, Inhibition predicted the likelihood of classification into the two-, three-, or four-symptom group, and when predicting the one-symptom group, Inhibition predicted the likelihood of classification into the one-, two-, three-, or four-symptom group. 
Only when predicting the four-symptom group was the outcome independent from other groups. Three different proportional odds models were run, one for each EF component, predicting our composite measure of RD that included all four symptoms of the hybrid model. First, Inhibition was used to predict RD status, which could be defined as any one, any two, any three, or any four symptoms of RD from the hybrid model. Subsequently, Updating Working Memory was used to predict RD status, and finally, Shifting was used to predict RD status. This was done using Proc Glimmix in SAS 9.4.
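In equation form, a proportional odds model with a teacher-level random intercept can be written as below. The notation is ours and is offered as a sketch rather than the authors' exact specification: Y_ij is the number of RD symptoms for child i with teacher j, x_ij is the standardized EF score, α_k is the threshold for category k, β is the single slope shared across thresholds (the proportional odds assumption), and u_j is the teacher random effect.

```latex
\operatorname{logit} P(Y_{ij} \ge k) = \alpha_k + \beta x_{ij} + u_j,
\qquad k = 1, 2, 3, 4, \qquad u_j \sim N(0, \sigma_u^2)
```

Under this form, exp(β) is the odds ratio for a one-unit increase in the EF score, and it is the same at every threshold k.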

#### Research Question 2

To answer our second inquiry, we conducted a Profile Analysis, controlling for teacher-level variance, to examine whether there were significant differences in the association of each of the EF components with each RD symptom group (one, two, three, or four symptoms of RD). In other words, we were interested in not only determining to what extent EF predicted RD status, but also, when multiple EF components predicted RD status, which EF was the best predictor. Although Profile Analysis has a series of proposed models in the model building process, for this analysis we utilized the "flatness" test. The flatness test is used to establish whether one point on a line has a significantly different mean than any other point on the same line. If the points do not differ significantly, the line is considered statistically flat, indicating that no one point on that line is a better predictor than any other point on that line. In this case, there were four lines, each representing one of the four RD symptom groups (i.e., one-symptom RD group, two-symptom RD group, etc.), and each line had three points, one for each of the three EF components (one for Inhibition, one for Updating Working Memory, and one for Shifting). This analysis was conducted using Proc Mixed in SAS 9.4.

## RESULTS

#### Descriptives

Descriptive statistics for all EF components and WJ assessments used in subsequent analyses are displayed in **Table 2**. These values reflect unstandardized values and the BRIEF scores before reverse-scoring, so that high scores on any BRIEF subscale represented weaker executive functioning. **Table 3** shows the Pearson and Spearman correlation coefficients for the three components of EF, the WJ assessments, and the hybrid model symptoms and RD groups. Prior to calculating the correlations, all scores were standardized, and, for ease of interpretation, BRIEF scores were reversed so that high scores reflected high executive functioning. The correlations of the EF components indicated that they are separable indices (r = 0.59–0.65, p < 0.0001). Inhibition and Shifting were significantly correlated with all four hybrid model symptoms separately, but Updating Working Memory was only significantly associated with two of the four hybrid model symptoms, namely low word reading achievement and dual-discrepancy RTI in word reading. All three EF components were significantly negatively correlated with the hybrid model of RD variable, indicating that higher EF was associated with having fewer symptoms of RD. **Table 4** displays the frequency of students identified in the one-, two-, three-, and four-symptom RD groups of the hybrid model as well as the frequency of students exhibiting each specific symptom.

#### Primary Analyses

#### Research Question 1: How Does EF Predict RD Classification in a Hybrid Model of RD?

The proportional odds models for hierarchical ordinal logistic regression indicated that, while controlling for teacher-level variance, Inhibition (OR = 0.74, 95% CI = 0.58, 0.94), Updating Working Memory (OR = 0.78, 95% CI = 0.62, 0.99), and Shifting (OR = 0.78, 95% CI = 0.60, 0.96) were all significantly related to the hybrid model of RD for students that exhibited all four symptoms of RD, and students who demonstrated any three or four, any two, three, or four, or any one, two, three, or four symptoms of RD (see **Table 5**). According to the proportional probabilities, those students at the mean level of Inhibition ability have a 31% chance of being classified as having one, two, three, or four symptoms of RD, while those students at one standard deviation above the mean of Inhibition (i.e., higher EF than average) have a 26% chance of being classified as having one, two, three, or four symptoms of RD, and those students functioning one standard deviation below the mean of Inhibition (i.e., lower EF than average) have a 40% chance of being classified as having one, two, three, or four symptoms of RD. When predicting the two-symptom RD group, students at the mean level of Inhibition ability have a 16% chance of being classified as having two, three, or four RD symptoms, while students one standard deviation above the mean and one standard deviation below the mean of Inhibition have a 12% and 20% likelihood, respectively, of being classified as having two, three, or four symptoms of RD. When Inhibition is used to predict classification in the three-symptom RD group, students at the mean functioning of Inhibition have a 9% chance of being classified as having three or four symptoms of RD, while students one standard deviation above, and one standard deviation below the mean of Inhibition have a 7% and 11% chance, respectively, of being classified as having three or four symptoms of RD. Finally, when utilizing Inhibition to predict the four-symptom RD group, the likelihood that students will be classified as having all four symptoms of RD is 1% for all levels of Inhibition.

TABLE 4 | Frequency of students identified by the hybrid model of RD.

N = 420. RC < LC, poorer reading comprehension compared to listening comprehension.

In the second model, students at the mean, one standard deviation above the mean, and one standard deviation below the mean of Updating Working Memory have a 31%, 26%, and 37% chance, respectively, of being classified as having one, two, three, or four symptoms of RD. Students at the mean, one standard deviation above the mean, and one standard deviation below the mean of Updating Working Memory have a 16%, 13%, and 19% chance, respectively, of being classified as having any two, three, or four symptoms of RD. Students at the mean, one standard deviation above the mean, and one standard deviation below the mean of Updating Working Memory have a 9%, 7%, and 11% chance of being classified as having any three or four symptoms of RD. Finally, all students have a 1% chance of being classified as having four symptoms of RD, regardless of their level of Updating Working Memory ability.

In the third model, students at the mean, at one standard deviation above the mean, and those at one standard deviation below the mean of Shifting have a 31%, 26%, and 38% chance, respectively, of being classified as having one, two, three, or four symptoms of RD. Students at the mean, one standard deviation above the mean, and one standard deviation below the mean of Shifting have a 16%, 12%, and 20% chance, respectively, of being classified as having any two, three, or four symptoms of RD. Students at the mean, at one standard deviation above the mean, and at one standard deviation below the mean of Shifting have a 9%, 7%, and 11% chance, respectively, of being classified as having any three or four symptoms of RD. Finally, all students have a 1% chance of being classified as having four symptoms of RD, regardless of their Shifting performance.
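To see how an odds ratio maps onto predicted probabilities like those above, the Python sketch below shifts a baseline probability by one standard deviation of the predictor on the log-odds scale. It uses the rounded values reported in the text for Updating Working Memory (31% at the mean, OR = 0.78) and ignores the teacher random effect, so it is an approximation. The function name is illustrative, and results computed from rounded inputs need not match Table 5 for every model.

```python
def shift_probability(p_at_mean, odds_ratio, sd_units):
    """Probability after moving the predictor by sd_units standard deviations.

    Converts the baseline probability to odds, multiplies by the odds ratio
    once per standard deviation, and converts back to a probability.
    """
    odds = p_at_mean / (1 - p_at_mean) * odds_ratio ** sd_units
    return odds / (1 + odds)

# Updating Working Memory, outcome "one or more RD symptoms"
p_high = shift_probability(0.31, 0.78, +1)  # one SD above the mean (higher EF)
p_low = shift_probability(0.31, 0.78, -1)   # one SD below the mean (lower EF)
print(f"+1 SD: {p_high:.2f}, -1 SD: {p_low:.2f}")
```

For this model the rounded inputs give roughly 26% and 37%, in line with the values reported in the text.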

#### Research Question 2: Is One EF Component More Important Than the Others for RD Classification?

Results from the flatness test indicated that the interaction effect of EF component skill and RD group status was not significant (F = 0.68, p = 0.6652). This non-significant interaction effect between the three components of EF and the four RD symptom groups means that the association of EF and RD group status does not depend on which EF component is used as a predictor.


TABLE 5 | Hierarchical ordinal logistic regression results.


BRIEF scores were reversed so that high values corresponded with high executive functioning. Coeff, log odds coefficient; PP, predicted probability of being in that category or any later category; 1+ RD symptoms, classified as having one, two, three, or four symptoms of RD; 2+ RD symptoms, classified as having two, three, or four symptoms of RD; 3+ RD symptoms, classified as having three or four symptoms of RD; 4 RD symptoms, classified as having all four symptoms of RD; exact p-values were reported.

## DISCUSSION

In the present study, we sought to explore the association between EF and RD. Specifically, we examined the link between the three components of EF, consisting of Inhibition, Updating Working Memory, and Shifting (e.g., Miyake et al., 2000), and a hybrid model of RD (Waesche et al., 2011; Spencer et al., 2014; Schatschneider et al., 2016). Although the relation between Updating Working Memory and RD has been extensively explored in the literature (e.g., Sesma et al., 2009), less work has been done examining the relation of Inhibition and Shifting with RD. Additionally, the hybrid model of RD represents the state of the science in RD definition, and there has been no work thus far examining the association of EF and the hybrid model of RD. Our results showed that EF was a significant predictor of RD, and that the probability of RD classification changed based on EF performance. As a second research aim, we pursued the inquiry of whether any one EF more strongly predicted RD within the hybrid model of RD. In doing so, we hoped to determine the EF most likely implicated in deficient reading performance, so it could potentially be targeted in intervention efforts. Our results showed that the number of RD symptoms captured did not vary depending on which EF component was used, and as such, any EF had equal predictive value for RD classification in a hybrid model of RD.

With regard to our first research question, we found that there was a significant relation between all three components of EF (Inhibition, Updating Working Memory, and Shifting) and the hybrid model of RD, no matter how many symptoms of RD the student had. We also found that the chances of being classified as having RD (i.e., having at least one symptom of RD) increased as EF performance worsened, and the chances of RD classification decreased as EF performance improved. Given that reading is a skill that must be taught explicitly in order to be mastered (Gombert, 2003; Vaessen and Blomert, 2010), and that reading skill development has been associated with cognitive control over actions (Gombert, 2003; Shaywitz and Shaywitz, 2008; Bexkens et al., 2015), it is not surprising that EF, our cognitive control system, would play a role in reading acquisition and dysfunction. Previous work has commonly found that EF is associated with RD, although the effect size of this association is moderated by RD definition and EF task modality (Stuebing et al., 2002; Booth et al., 2010). Here, we used the hybrid model of RD and a parent-report of EF behaviors and found a significant negative association between EF and RD.

Previous work has suggested that the "ANY1PLUS" definition of RD, in which a student has one or more symptoms of RD, is the most stable operationalization of RD in predicting future RD symptoms (Spencer et al., 2014). Interestingly, we found that EF's predictive power for RD was highest when a student had one or more symptoms of RD. This was evidenced by the finding that the predicted probabilities for all three EF components were highest when the outcome being modeled included any combination of hybrid model symptoms (i.e., ANY1PLUS, or any one, any two, any three, or any four symptoms of RD) and were lowest for the outcome that included only the four-symptom RD group. Several explanations are possible: EF may be associated with poor reading performance no matter how it is defined, EF may be associated with RD when RD is operationalized in a reliable way, and/or the group with at least one symptom of RD was simply the largest. This pattern could also be attributable to reasons we cannot establish in the current study.

For our second research question, we conducted a profile analysis in order to explore the differential predictive power of each EF component. We found that Inhibition, Updating Working Memory, and Shifting were all equal predictors of RD. There is considerable conflicting research on the role of each given component of EF with achievement, and less research altogether examining the differential role of each component with RD. Given that no one EF component emerged as a superior predictor, our results point to the idea that EF as a whole (maybe represented as a unitary construct), or any one component of EF, is important in RD identification. We caution that this may be attributable to our use of a single parent-reported measure of EF that used subscales to represent the components. It is likely that the single-reporter measure meant that the correlations between the components of EF were higher than normal, and thus they acted more similarly to each other than task-based measures would demonstrate (e.g., Foy and Mann, 2013). Despite the limitation of the measurement of EF, the BRIEF is a relatively inexpensive, parent-report measure that could be conveniently completed by parents. Moreover, the BRIEF provides a behavioral, instead of a cognitive, index of EF. By virtue of its basis on observable behaviors, the BRIEF is less subject to the task impurity issues that plague most cognitive EF measures (e.g., Miyake et al., 2000) because the behaviors are easier to pinpoint than underlying cognitive processes. In addition, since most investigations of EF and RD use cognitive EF indices (e.g., Altemeier et al., 2008), which correlate poorly with behavioral EF measures (McCauley et al., 2010), our use of a behavioral EF index provides a novel contribution to the research on EF and RD.

Beyond our main research questions, we had other findings of interest. Based on our correlations, both Inhibition and Shifting were significantly correlated with all four hybrid model symptoms separately, but Updating Working Memory was only significantly associated with two of the four hybrid model symptoms, namely low word reading achievement and dual-discrepancy RTI in word reading. Therefore, while Inhibition and Shifting might still identify RD in children if single-criterion RD definitions were used, Updating Working Memory might not predict RD as well in a single-criterion framework based on cognitive discrepancy or IQ-achievement discrepancy definitions (the two symptoms with which it did not significantly correlate) rather than low word reading or RTI definitions. Accordingly, our current study provides evidence for all three EFs as predictors of RD in a hybrid model framework, but does not directly speak to the predictive power of Updating Working Memory when some less comprehensive operationalizations of RD are employed.

An important point to consider is that the current study's examination of the relation between EF and RD was conducted solely in English, and different relations may have emerged if a more transparent language were used. It is presumed that, as a process, reading acquisition is variable and language-specific (Ziegler and Goswami, 2006). This claim has been corroborated by evidence from neuroimaging studies showing differential brain activation in response to comparable stimuli among different language readers (Ziegler and Goswami, 2006; Holloway et al., 2015). It stands to reason that reading in different languages calls upon different cognitive abilities and their corresponding brain regions, in order to properly respond to cross-linguistic differences in reading demands, like differences in orthographic depth (Gombert, 2003; Ziegler and Goswami, 2006; Holloway et al., 2015). For example, as a language with a deep orthography that is characterized by unpredictable language and speech sound pairs, English may require increased demands on cognitive control to counteract the unpredictable connections between the audio and visual aspects of language when learning to read in English (Holloway et al., 2015). It is not surprising that the present study, which was conducted in English, found a significant relation between EF, the mechanism that enables cognitive control, and reading difficulties. In contrast, learning to read in more shallow orthographies that have transparent language and speech sound pairs, like those found in Dutch and Italian, results in the formation of easier to follow audiovisual rules that make processing more automatic (Holloway et al., 2015). Accordingly, the demands on EF for explicit monitoring and adaptation created by the incongruences in the English language may not exist in a shallow orthography like Dutch, and as a result, students with deficient EF may not display the same RD. 
To test this possibility, future work should replicate the methods used in the present study using a less orthographically complex language, like Dutch or Italian.

In general, our findings suggest the need for more research to examine the directionality and fundamental nature of the relation between EF and RD as a diagnostic mechanism and, potentially, a way to intervene effectively to reduce the sequelae of RD. We know that EF works as a regulatory system for higher order cognitive processing by enabling the acquisition of new knowledge through the setting, revision, and monitoring of learning-related goals and strategies (Lin et al., 2016), but we are still unsure how this cognitive regulation translates into reading ability and disability. One explanation for the significant association we found between EF and RD is that poor executive functioning overwhelms the cognitive processing system, making reading difficult (Swanson, 2003; Gathercole et al., 2006; Swanson et al., 2006). Without the cognitive resources necessary to choose appropriate strategies to overcome reading difficulties (i.e., setting a time to practice reading daily, choosing an appropriate location to allow concentration when reading, or taking breaks in between reading excerpts in order to mentally review main ideas), children with poor EF may not be able to overcome their reading struggles (Lin et al., 2016). As such, creating learning environments that support EF and self-regulated learning might contribute to stronger reading development (Connor et al., 2010).

Another possibility is that poor reading skills in children with RD result in poor EF through a common third variable that impacts both EF and reading. One such mechanism may be metacognition, whereby children who do not learn to read at an average level also fail to develop effective metacognitive skills, which are vital for the cognitive and self-regulatory processes utilized in EF (Cain et al., 2004; Connor et al., 2016). Recent work also supports the idea of a reciprocal relation between RD and self-regulatory processes, like EF (Connor et al., 2016). For example, children with higher EF abilities may be better able to engage with reading instruction, and together, repeated exposure to such instruction and repeated practice of self-regulation may lead to the enhancement of both reading ability and self-regulatory ability. In the negative direction, it is also possible that poor instruction in reading (i.e., instructionally-induced RD) may impede the development of EF (Vellutino et al., 1996). On the other hand, children with RD may simply also have poor EF skills. Our current study cannot discern this distinction, but future work can begin to differentiate the directionality of these relations.

As an interesting aside, only 2.17% of our sample fell in the four-symptom RD group. Although this group of children was small in absolute numbers, they may, in fact, represent a subset of "inadequate responders" (Toste et al., 2014) or "treatment resisters" (Torgesen, 2000). Torgesen (2000) coined the term "treatment resisters" to describe the 2–6% of children who are resistant to reading intervention and who, regardless of targeted preventative efforts, will never reach a "normal" word reading level. Coincidentally, our sample's four-symptom RD group falls within this 2–6% resister range, as do the 3% of children that were classified as having only the dual-discrepancy RTI in word reading symptom (see **Table 4**). As such, the treatment resisters may simply be the children that qualify for only the dual-discrepancy RTI in word reading symptom, rather than the children that have all four hybrid model symptoms. This possibility makes sense, as the dual-discrepancy RTI in word reading symptom group captured the children that do not respond to intervention and was also significantly associated with all three EFs. Whether the treatment resisters are the children that have all four RD symptoms or just the dual-discrepancy RTI symptom, these results provide evidence that resisters may not just have reading disabilities, but are likely to have multiple deficits, including poor EF. In fact, students with reading difficulties have been shown to suffer from EF deficits (Fuchs and Fuchs, 2015), and inadequate responders have even been shown to differ from adequate responders in working memory performance (Toste et al., 2014). The hybrid model of RD is not meant to be limited to just the four symptoms used in this study, and adding EF to the model may further reduce measurement error in identifying the students most likely to have RD. In doing so, the extended model that includes EF might support early identification efforts to improve outcomes for children with RD, and possibly, for the most severely reading impaired students as well. This idea was not explicitly tested here, so more work is needed before conclusions can be drawn.

Although these findings contribute to our understanding of the links between EF and RD, this study is not without limitation. First, our measurement of executive functioning was based on a single parent-report questionnaire, whereas the use of multiple EF measures, including cognitive indices, like the Wisconsin Card Sorting task (Horowitz-Kraus, 2014) or the Stroop task (Miyake et al., 2000), may have increased the reliability of EF scores. Parents may have a skewed concept of their child's EF abilities, and direct measurement techniques that employ an outside observer could reduce potential biases. On the other hand, the fact that the BRIEF indexes EF based on behavioral manifestations of EF may also be an advantage because it helps avoid the task impurity inherent in verbal EF measures. Second, the questionnaire portion of Project KIDS had an 11.50% recruitment rate from the original intervention projects' population. Different relationships than those shown in this study may have been revealed by a sample with a greater response rate, as the parents who responded to our questionnaire had children with slightly higher reading and language performance and who were slightly less likely to qualify for free and reduced lunch (indicating higher socioeconomic status). Third, the parent report of EF was measured at a different time point than the language and literacy variables that made up the RD hybrid model. RD classification is relatively stable with the hybrid model (one of the benefits of this model; Schatschneider et al., 2016), but EF undergoes significant developmental changes during these ages (e.g., Anderson, 2002; Bitan et al., 2009), with the component skills of EF continuing to develop along different trajectories until adolescence, when executive control emerges (Anderson, 2002). 
Therefore, when generalizing our results, we must remain cautious and take into account the fact that parents may have reported on EF abilities that were more developed than they had been at the time of intervention. In order to more closely and accurately test the relation between the EF components and RD classification in a hybrid model, future work should replicate our methods with concurrent EF and achievement data to see if the associations found hold. Another possible consequence of the time elapsed between the original assessment testing and the parent EF ratings is a buildup of frustration due to years of reading difficulties for the students, which they acted out in the form of behaviors measured by the BRIEF (e.g., "talks at the wrong times" or "gets out of control more than friends"; Mahone et al., 2002). Finally, unlike Spencer et al. (2014), no reading fluency measure was included in the calculation of the hybrid model symptoms, and different relations may have been found if a timed reading measure were used.

Not only is RD hard to identify, but it is also one of the most pervasive learning disabilities present in our school systems (Spencer et al., 2014). Children with reading deficiencies encounter a myriad of disadvantages, including less practice in developing their reading comprehension skills, a potential for the acquisition of negative views toward reading, and an inability to acquire important knowledge available through print resources (Torgesen, 2000). Although no specific EF component emerged as a superior predictor, this study provides evidence for the overall negative association between all three EF components and RD. As we learn more about the causal mechanisms that underlie RD, including EF, we will be able to contribute to emerging models and theoretical frameworks and design more effective methods for early identification and intervention to help all children, and especially children with RD, succeed in school and throughout their lives.

## AUTHOR CONTRIBUTIONS

Conceptualization: MD, SH, and CS. Data curation: SH. Formal analysis: MD, SH, and CS. Funding acquisition: SH, CS, CC, and SA. Investigation: MD and SH. Methodology: MD, SH, and CS. Project administration: SH. Resources: SH, CS, CC, and SA. Software: MD and SH. Supervision: SH. Validation: MD and SH. Visualization: MD. Writing original draft: MD, SH, and CS. Writing review and editing: MD, SH, CS, CC, and SA. Final approval: MD, SH, CS, CC, and SA.

### FUNDING

The research reported in this publication was supported by the Eunice Kennedy Shriver National Institute of Child Health and Human Development of the National Institutes of Health under award number R21HD072286. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Daucourt, Schatschneider, Connor, Al Otaiba and Hart. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

## Cognitive, Parent and Teacher Rating Measures of Executive Functioning: Shared and Unique Influences on School Achievement

Marielle C. Dekker<sup>1</sup>\*, Tim B. Ziermans<sup>1,2</sup>, Andrea M. Spruijt<sup>1,2</sup> and Hanna Swaab<sup>1,2</sup>

*<sup>1</sup> Department of Clinical Child and Adolescent Studies, Faculty of Social and Behavioural Sciences, Leiden University, Leiden, Netherlands, <sup>2</sup> Leiden Institute for Brain and Cognition, Leiden University, Leiden, Netherlands*

Very little is known about the relative influence of cognitive performance-based executive functioning (EF) measures and behavioral EF ratings in explaining differences in children's school achievement. This study examined the shared and unique influence of these different EF measures on math and spelling outcome for a sample of 84 first and second graders. Parents and teachers completed the Behavior Rating Inventory of Executive Function (BRIEF), and children were tested with computer-based performance tests from the Amsterdam Neuropsychological Tasks (ANT). Mixed-model hierarchical regression analyses, including intelligence level and age, showed that cognitive performance and teacher's ratings of working memory and shifting concurrently explained differences in spelling. However, teacher's behavioral EF ratings did not explain any additional variance in math outcome above cognitive EF performance. Parent's behavioral EF ratings did not add any unique information for either outcome measure. This study provides support for the ecological validity of performance- and teacher rating-based EF measures, and shows that both measures could have a complementary role in identifying EF processes underlying spelling achievement problems. The early identification of strengths and weaknesses in a child's working memory and shifting capabilities might help teachers to broaden their range of remedial intervention options to optimize school achievement.

#### Keywords: working memory, inhibition, shift, math, spelling

## INTRODUCTION

Executive functions (EFs) are generally defined as effortful cognitive abilities that help plan, guide and control goal-directed mental processes and behavior. Executive control is assumed to be involved in both math and spelling performance. Math calls for executive control to select and manipulate relevant numbers, to disregard irrelevant information, to choose the right computational methods, to temporarily store and manipulate numbers and other information, and to be able to switch between various procedures or operations (e.g., Raghubar et al., 2010; Friso-van den Bos et al., 2013; Yeniad et al., 2013; Cragg and Gilmore, 2014). Written spelling requires an understanding of language forms (i.e., morphology), sound structures, word meanings, and origins. Written spelling is also assumed to require executive control in order to efficiently integrate phonological, orthographical, and morphological information, and motor planning (Berninger et al., 2006; Garcia et al., 2010; Preßler et al., 2013).

#### Edited by:

*Mariette Huizinga, VU University Amsterdam, Netherlands*

#### Reviewed by:

*Peter K. Isquith, Dartmouth College, USA Michelle Ellefson, University of Cambridge, UK*

\*Correspondence: *Marielle C. Dekker dekkerm@fsw.leidenuniv.nl*

#### Specialty section:

*This article was submitted to Developmental Psychology, a section of the journal Frontiers in Psychology*

Received: *31 October 2016* Accepted: *09 January 2017* Published: *30 January 2017*

#### Citation:

*Dekker MC, Ziermans TB, Spruijt AM and Swaab H (2017) Cognitive, Parent and Teacher Rating Measures of Executive Functioning: Shared and Unique Influences on School Achievement. Front. Psychol. 8:48. doi: 10.3389/fpsyg.2017.00048*

The observation that EF abilities mature at different rates over time and peak at different ages suggests that EF incorporates separable abilities (e.g., Klenberg, 2001; Davidson et al., 2006; Simonds et al., 2007; Best et al., 2009; Best and Miller, 2010). Many studies of school-aged children agree that there are at least three fundamental EF abilities that are interrelated, but distinguishable: working memory, inhibitory control, and cognitive shifting or cognitive flexibility (e.g., Miyake et al., 2000; Jacob and Parkinson, 2015).

Working memory (WM) refers to the ability to temporarily store, manipulate and control incoming information at the same time. WM improves gradually during childhood and adolescence in a linear fashion (Best et al., 2009; Best and Miller, 2010). Inhibitory control allows for the suppression of actions and resistance to interference from irrelevant stimuli entering the WM and is considered to be a precondition for other EFs. During the preschool years, inhibition skills improve rapidly, and around age four children show basic inhibitory control. These skills gradually and linearly improve between ages five and eight, and further refinements in accuracy and speed occur in middle childhood and in adolescence (Best et al., 2009; Best and Miller, 2010; Clark et al., 2010). Shifting or cognitive flexibility refers to the ability to flexibly switch between strategies, rules, tasks or mental states. Both WM and inhibition skills are needed to shift effectively and efficiently (Garon et al., 2008; Best and Miller, 2010). Shifting ability develops from preschool years through adolescence (Best et al., 2009; Best and Miller, 2010).

Most research on the influence of EF on school achievement focuses on performance-based measures of EF (e.g., Allan et al., 2014). Cognitive performance-based EF tasks tend to measure the efficiency of information processing mechanisms of the brain. WM capacities in children have been clearly linked to math skills (e.g., DeStefano and LeFevre, 2004; Raghubar et al., 2010; Friso-van den Bos et al., 2013; Gerst et al., 2015). In two meta-analyses, inhibitory control has also been positively linked to various math skills in preschoolers and kindergartners (Allan et al., 2014) and in primary school-aged children (Friso-van den Bos et al., 2013), and recent studies have also found a significant association between inhibition and math performance (e.g., Gerst et al., 2015; Ten Eycke and Dewey, 2016). In two meta-analyses, shifting was associated with math skills in primary school-aged children (Friso-van den Bos et al., 2013; Yeniad et al., 2013). A recent study by Gerst et al. (2015) also reported a significant and positive relation between math and shifting.

A varying amount of research has been performed on the relation between cognitive measures of EF and spelling outcome, with most studies on WM, and only a few on inhibition or shifting. Studies on WM in relation to spelling skills show a positive association (e.g., Jongejan et al., 2007; Malstädt et al., 2012; Cardoso et al., 2013; Fischbach et al., 2013; Preßler et al., 2013; Becker et al., 2014; Re et al., 2014; Bexkens et al., 2015). Both inhibition and shifting have also been positively linked to spelling in first to fourth graders (Altemeier et al., 2008). Although cognitive EF performance is associated with cognitive performance in math and spelling, it remains unclear whether cognitive measures of EF are the best option to explain the more complicated, more demanding, and less structured performance situations at school where factors like fear and motivation also play an important role. Cognitive EF measures tend to neglect the effects of motivation, goals, and beliefs on EF, and their use in predicting quality of cognitive learning is complicated by task impurity problems (Salthouse et al., 2003). EF is thought to be visible in everyday life whenever planning, problem solving, inhibition or troubleshooting is challenged. One might ask whether daily executive functioning at school or at home is also related to math and spelling performance. This would indicate the pervasive influence of EF on school performance on several levels of control.

Behavioral ratings of EF were developed to assess the application of EF skills in typical performance situations at home or at school and are assumed to be more ecologically valid. However, studies relating behavioral measures of EF to school achievement are limited. A significant association between behavioral WM problems and poorer math outcome has been reported by some (Clark et al., 2010; Gerst et al., 2015), but not by others (Ten Eycke and Dewey, 2016). Behavioral inhibitory problems have been found to show either a significant association (Clark et al., 2010; Gerst et al., 2015) or no association with math (Ten Eycke and Dewey, 2016). Behavioral problems with shifting have also been related to poorer math outcome (Gerst et al., 2015; Ten Eycke and Dewey, 2016). To our knowledge, only one study reported on the association between spelling and behavioral EF (teacher report) and showed that behavioral aspects of memory, shifting, and inhibitory control were related to children's spelling outcome in kindergarten and first grade (Kent et al., 2014). Nevertheless, behavioral ratings are challenged by rater bias (e.g., the halo effect, central tendency bias, leniency bias) and situational specificity of behavior, resulting in low cross-informant agreement (Achenbach et al., 1987). Furthermore, the high correlations between the different subscales also point to scale-impurity problems, questioning whether general behavioral impairment is being measured rather than different aspects of executive dysfunctioning (McAuley et al., 2010).

Both cognitive performance-based EF measures and behavioral EF rating measures clearly have their pros and cons. Results from a recent review study on the association between these EF measures in 13 studies using the Behavior Rating Inventory of Executive Function (BRIEF; Gioia et al., 2000) showed that only 19% of the reported correlations were significant, with a median correlation of 0.18 (Toplak et al., 2013). It is evident that measures assessing cognitive and behavioral EF across informants tap into different aspects of EF. Meta-analytical evidence on inhibitory control in preschoolers and kindergartners (Allan et al., 2014) showed that the mean association between math achievement and inhibition was stronger for performance tasks (r = 0.35) than for other-reports (r = 0.22). However, it is not yet clear how these different EF measures concurrently relate to real world external criteria like school achievement. Understanding to what extent different EF measures share variance and add unique variance in relation to school achievement could verify their validity and could provide us with a more balanced view of relevant EF aspects.

Thus far, only the studies of Gerst et al. (2015) and Miranda et al. (2015) provide some insight into the relative impact of these different types of EF measures on school outcome, although math outcome was only studied by Gerst et al. (2015) and neither of these two studies looked at spelling. Gerst et al. (2015) examined both cognitive EF measures and teacher behavioral EF rating measures of WM, inhibition and shifting and found moderate correlations for all measures with math and reading comprehension outcome. Analyzing the shared and unique influence of these cognitive and behavioral measures for each EF in a full model with relevant covariates showed that both types of WM measures were complementary in the prediction of math and reading comprehension outcome. However, for inhibition and shifting, the behavioral EF rating did not add any unique variance to the prediction of math by the performance measure. In contrast, for reading comprehension, the cognitive measures for inhibition and shifting did not add any unique variance to the teacher rating. Miranda et al. (2015) concluded that teacher's global EF rating was more strongly related to reading accuracy and speed than parent's global EF rating.

A key issue when examining the impact of EF on school achievement is to what extent it is independent from intelligence (IQ). There is some evidence that IQ has associations with WM (Mahone et al., 2002; Friedman et al., 2006; Alloway and Alloway, 2010), inhibition (Mahone et al., 2002) and shifting (Ardila et al., 2000; van der Sluis et al., 2007), and that this relationship is partially attributable to shared executive or nonexecutive processing demands (e.g., processing speed) underlying both EF and IQ assessment (van der Sluis et al., 2007), as well as to shared method variance reflected in the ability to take tests in the case of performance-based EF tasks. Some studies did indeed show that EF shared a lot of variance with IQ in predicting school achievement (e.g., Bull and Scerif, 2001; Espy, 2004). However, other studies have shown that both performance-based and rating-based EF measures were uniquely related to school achievement after taking into account the possible confounding effects of intelligence (e.g., George and Greenfield, 2005; Alloway and Alloway, 2010; Preßler et al., 2013; Yeniad et al., 2013; Gerst et al., 2015; Dekker et al., 2016). These latter findings suggest that traditional intelligence tests might not assess abilities that are considered important from a neurocognitive perspective, and that IQ cannot be considered a proxy of EF or vice versa. However, the mixed findings point to the need to study the possible confounding effect of intelligence level.

The aim of this study was to examine the shared and unique influence of three different types of EF measures, i.e., performance-based, teacher's rating-based, and parent's rating-based, on math and spelling outcome in first and second graders, while taking level of intelligence into account. Based on the presented evidence we expected cognitive measures of WM, inhibition and shifting to be related to math and spelling. Because there are only a couple of studies, with contradicting results, concerning behavioral EF measures as markers for math and spelling differences, our expectations were tentative. Nevertheless, we assumed that behavioral executive dysfunctioning had a negative association with math and spelling outcome. Based on the findings of Gerst et al. (2015), we expected cognitive measures of EF to have the biggest impact on math outcome, except for WM where we predicted the behavioral rating-based measure would add unique variance. Based on the findings of Gerst et al. (2015) on reading comprehension, we tentatively assumed that behavioral EF ratings would have the biggest impact on our language-related spelling outcome, except for WM for which the cognitive measure was also expected to add unique variance. We further assumed that teacher's ratings of EF would have a bigger association with school achievement than parent's EF ratings (Miranda et al., 2015), as EF demands at home are different from EF demands at school, with the latter being more likely to be related to school readiness, attitude toward learning and testing, and thus to school achievement.

## METHODS

## Procedure

The current study is part of an ongoing pretest-posttest intervention study called "Curious Minds" that focuses on neurocognitive, social, and environmental factors affecting children's learning at school and at home. Children were recruited from two primary schools in the Dutch province of Zuid-Holland during November 2013 (school 1) and March 2014 (school 2). The Ethical Board of the department of Education and Child studies at Leiden University has given ethical approval for this study (ECPW-2010016).

Only children in grade 1 or 2, all aged 6–8 years, were included in this study. All parents of students from grade 1 or 2 (N = 172) received written information about the study from their child's school and were invited to attend an informational meeting. Written informed consent was obtained from all 105 parents who participated (response = 61.0%). Chi-square tests with a continuity correction showed no significant differences between participants and non-participants in gender, grade, or school (all p > 0.05), neither did a t-test for age (p > 0.05).

All parents and teachers were asked to complete a questionnaire on their child's or student's behavioral EF. Cognitive EF data was collected during school visits. Each child completed several computer-based EF performance tasks. Each assessment period lasted about an hour and a half and took place in a quiet room to minimize distraction. All assessments were done by the researchers or by Master's students who completed an extensive training in test administration, including video-feedback sessions. Pretest data was collected in the period between November 2013 and February 2014 (school 1), and May and June 2014 (school 2). Intelligence level was assessed during the post-test data collection phase. As IQ is considered to be quite stable over time, we expected that the time between this study's pre- and post-test of about half a year would be of negligible influence (Canivez and Watkins, 1998). Dutch standardized paper-and-pencil achievement test scores used to monitor math and spelling progress were retrieved from each school's records at pretest. We obtained full achievement test score information, full cognitive EF data and teacher EF ratings for 104 out of the 105 participating children; for 103 children we were able to estimate intelligence level, and we received 86 EF ratings from parents. Complete data for this study was available for 84 children (80.0% of all participating children; 48.8% of all eligible children) from 7 different classes. Children with complete data did not significantly differ from children without complete data (N = 21) on age, grade, school or gender (all p > 0.05).
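The sample-flow percentages reported in this section can be retraced with a few lines of arithmetic. The sketch below is only a bookkeeping aid; the variable names and the helper `pct` are illustrative and not part of the study's materials.

```python
# Counts as reported in the Procedure section.
eligible = 172    # parents of all grade 1 and 2 students who were invited
consented = 105   # parents who gave written informed consent
complete = 84     # children with complete data on all measures


def pct(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


response_rate = pct(consented, eligible)            # share of eligible who took part
complete_of_participants = pct(complete, consented)
complete_of_eligible = pct(complete, eligible)
incomplete = consented - complete                   # children without complete data

print(response_rate, complete_of_participants, complete_of_eligible, incomplete)
```

The values agree with the reported response of 61.0%, the 80.0% and 48.8% completeness figures, and the N = 21 children without complete data.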

## Measures

#### Cognitive EF

Cognitive EF was measured with three neuropsychological tasks from the Amsterdam Neuropsychological Tasks (ANT, version 2.0; De Sonneville, 1999, 2011). The ANT has been used extensively to examine EF and related cognitive processes in various clinical and non-clinical populations and has high sensitivity for neuropsychological problems as well as good reliability and appropriate validity (De Sonneville, 2005, 2014; Rowbotham et al., 2009). All computer tasks were preceded by instructions from the test leader and practice trials. All test stimuli were presented on a computer screen and the child had to respond by pressing a mouse key.

#### **Working memory**

Visuospatial working memory was measured with the ANT Spatial Temporal Span (STS–part 2), backward span. In this task, nine squares are presented on the computer screen in a three-by-three matrix. During each trial, a sequence of these squares, increasing in length from two up to a maximum of nine, is pointed out by a hand animation. Each sequence length is presented in two successive trials. The participant is instructed to repeat the sequence by clicking the same squares in reverse order. In each trial the sequence is preceded by an auditory cue (a beep). The task aborts automatically whenever two successive trials of the same sequence length are incorrect. The number of correctly identified targets in the correct backward order was used as a measure of visuospatial working memory.
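As an illustration of the stop rule and the outcome measure described above, the following sketch scores a hypothetical backward span session. It is not the ANT's own scoring code: the data layout (a list of presented and clicked sequences) and the position-by-position matching rule are assumptions made for the example.

```python
def score_backward_span(trials):
    """Score a backward span session.

    trials: list of (presented, response) pairs in administration order,
    with two consecutive trials per sequence length (assumed layout).
    Returns the number of targets clicked at the correct position of the
    reversed sequence, summed over all administered trials.
    """
    total_hits = 0
    for start in range(0, len(trials), 2):
        pair = trials[start:start + 2]
        any_correct = False
        for presented, response in pair:
            target = list(reversed(presented))
            # Assumed scoring rule: position-by-position match with the
            # reversed sequence.
            total_hits += sum(1 for t, r in zip(target, response) if t == r)
            if list(response) == target:
                any_correct = True
        if not any_correct:
            break  # stop rule: both trials of this length were incorrect
    return total_hits


# Hypothetical session; squares are numbered 1-9.
session = [
    ([1, 2], [2, 1]), ([3, 4], [4, 3]),                    # length 2: both correct
    ([1, 2, 3], [3, 2, 1]), ([4, 5, 6], [6, 4, 5]),        # length 3: one correct
    ([1, 2, 3, 4], [1, 2, 3, 4]), ([5, 6, 7, 8], [5, 6, 7, 8]),  # both wrong
    ([1, 2, 3, 4, 5], [5, 4, 3, 2, 1]),                    # never administered
]
span_score = score_backward_span(session)
```

In this made-up session the task stops after the two failed length-four trials, so the final length-five trial does not contribute to the score.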

#### **Inhibition**

Inhibition of a prepotent ongoing motor response was assessed with the ANT Go-NoGo (GNG–biased) task. In the GNG task the mouse button has to be clicked whenever a yellow square with a hole at the bottom is displayed (the Go signal; 75% of the trials). Whenever a full yellow square is displayed (the NoGo signal; 25% of the trials) the child has to withhold the prepotent motor response and do nothing. The number of false alarms on the 18 NoGo trials was used as a measure of level of inhibition. A higher number of false alarms (i.e., the participant clicks when the NoGo signal is presented) indicates that a child is less able to stop an ongoing response.
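The false-alarm measure can be sketched as a simple count over trials. The `Trial` record and the example session are hypothetical; the total of 72 trials is inferred from the statement that the 18 NoGo trials make up 25% of the task, and is not stated directly in the text.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    is_nogo: bool   # True for a NoGo signal (full yellow square)
    clicked: bool   # True if the child pressed the mouse button


def count_false_alarms(trials):
    """Count NoGo trials on which a response was made anyway."""
    return sum(1 for t in trials if t.is_nogo and t.clicked)


# Hypothetical session: 54 Go trials (75%) and 18 NoGo trials (25%).
trials = (
    [Trial(is_nogo=False, clicked=True)] * 54
    + [Trial(is_nogo=True, clicked=False)] * 15
    + [Trial(is_nogo=True, clicked=True)] * 3
)
false_alarms = count_false_alarms(trials)
```

Here the child responds on 3 of the 18 NoGo trials, so the inhibition score would be 3; missed Go trials are deliberately not counted, since the measure described above concerns false alarms only.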

#### **Shifting**

Shifting was assessed with the ANT Response Organization Objects (ROO–part 3), mixed compatible and incompatible. During the third part of the ROO task, the color of the ball alternates randomly between green and red and the child has to shift between response sets. Whenever the green ball appears a compatible dominant response is required (click the mouse button that corresponds to the side where the green ball is presented) and when the red ball appears an incompatible subdominant response is required (click the mouse button on the opposite side of where the red ball is presented). This part consists of 80 trials: 40 trials requiring a compatible response and 40 trials requiring an incompatible response. The overall number of errors in part 3 was used to measure level of visuospatial shifting.
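The response rule for part 3 and the resulting error count can be written out as follows. This is a minimal sketch under an assumed trial representation (color, side of the ball, side of the response); the function names are illustrative and do not come from the ANT software.

```python
OPPOSITE = {"left": "right", "right": "left"}


def expected_side(ball_color, ball_side):
    """Return the correct response side for one trial in part 3."""
    if ball_color == "green":   # compatible: respond on the same side
        return ball_side
    if ball_color == "red":     # incompatible: respond on the opposite side
        return OPPOSITE[ball_side]
    raise ValueError(f"unexpected ball color: {ball_color!r}")


def count_errors(trials):
    """Count trials whose response side differs from the required side.

    trials: iterable of (ball_color, ball_side, response_side) tuples.
    """
    return sum(
        1 for color, side, response in trials
        if response != expected_side(color, side)
    )


# Four hypothetical trials: the last two responses are wrong.
example_trials = [
    ("green", "left", "left"),
    ("red", "left", "right"),
    ("red", "right", "right"),
    ("green", "right", "left"),
]
shifting_errors = count_errors(example_trials)
```

Applied to all 80 trials of part 3, the same count would give the shifting measure used in this study.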

#### Behavioral EF

Behavioral EF was measured with BRIEF (Gioia et al., 2000; Huizinga and Smidts, 2009, 2010). Both the teacher's form (BRIEF-teacher) and the parent's form (BRIEF-parent) were used. The BRIEF teacher's form assesses everyday behavioral EF problems in the classroom and the BRIEF parent's version does the same for the home situation. Fifteen different classroom teachers filled out 5–9 BRIEF-teacher questionnaires (mean = 5.6; mode = 4; SD = 1.6). The BRIEF has satisfactory internal consistency, test-retest reliability, moderate interrater agreement and appropriate evidence of predictive and discriminant validity and is used for children from 5 to 18 years old. The BRIEF contains 86 items that make up eight scales that form a Behavior Regulation Index. In this study we used the raw scale score of the Working Memory, the Inhibit, and the Shift scale. A higher BRIEF scale score indicates a higher level of executive dysfunction.

#### **Problems with working memory**

The Working Memory scale (WM) of the BRIEF assesses the capability to hold information when completing a task, when encoding information, or when generating goals/plans in a sequential manner (e.g., forgets what he/she was doing, trouble remembering things, losing track of what they are doing).

#### **Problems with inhibitory control**

The Inhibit scale of the BRIEF assesses how much trouble a child has controlling impulses and stopping a behavior (e.g., gets out of control more than friends, has difficulty staying seated in the classroom, often interrupts others in class, requires more adult supervision).

#### **Problems with shifting**

The Shift scale of the BRIEF assesses the problems a child has with moving freely from one activity or situation to another, alternating attention or changing strategies (e.g., difficulty solving problems flexibly, making transitions, tolerating change, or shifting attention).

#### Intelligence Level

Level of intelligence (IQ) was estimated using the Vocabulary (V) and Block Design (BD) subtests of the Dutch Wechsler Intelligence Scale for Children 6–17 years old (WISC-III-NL) at post-test, about half a year later (Kort et al., 2005). The short-form estimates of full scale IQ (FSIQ) for the WISC-III were obtained according to the algorithm 2.9 × (sum of normed scores) + 42, which is based on Tellegen and Briggs's linear scaling technique (Tellegen and Briggs, 1967; Campbell, 1998). The WISC-III V-BD estimate has been found valid for the estimation of full scale IQ, given a sufficient corrected FSIQ validity (r = 0.82) and split-half reliability (r = 0.91) (Campbell, 1998). The 2.8-year stability of the WISC-III Vocabulary subtest has been found to be 0.75, and that of the Block Design subtest 0.78 (Canivez and Watkins, 1998).
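As a worked illustration of the short-form algorithm quoted above, the following minimal sketch applies 2.9 × (sum of normed scores) + 42 to two subtest scores. The function and argument names are hypothetical. Because Wechsler subtest scaled scores have a mean of 10, a child scoring exactly average on both subtests receives an estimate of 100.

```python
def estimate_fsiq(vocabulary_scaled: float, block_design_scaled: float) -> float:
    """Short-form FSIQ estimate: 2.9 * (sum of the two normed scores) + 42."""
    return 2.9 * (vocabulary_scaled + block_design_scaled) + 42


# Hypothetical child with average scaled scores (10) on both subtests.
print(round(estimate_fsiq(10, 10), 1))  # 100.0
```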

#### School Achievement

To assess math and spelling ability we used the Dutch standard CITO Mathematics Test (CMT; Janssen et al., 2010) and CITO Spelling Test (CST; de Wijs et al., 2010). The CMT and the CST are both composite national curriculum paper-and-pencil achievement tests that are standardized and norm-referenced. They have good psychometric properties and are commonly used in Dutch schools to monitor the progress of students in primary education (de Wijs et al., 2010; Jansen et al., 2013). There are two different tests for each grade, one regularly administered halfway through the year (January) and one around June. We collected the CMT and CST scores through the schools at the time of the pretest. Therefore, in this study we used the January 2014 CITO test scores from school 1, and the June 2014 CITO test scores from school 2. To allow for comparison between the students' math and spelling scores we used the age equivalent score (AES) for math and for spelling and subtracted the number of months of education the student had received up to that point (10 months per year, starting from grade 1). A positive score of 5 means that a student is about 5 months ahead in mathematical or spelling skills relative to the amount of education received up to that point in time (the general population AES mean is 0 months).
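The difference score described above can be sketched as follows. The function names are hypothetical, and the way months within the current school year are counted (`months_into_grade`) is an assumption, since the text only states that a school year counts as 10 months starting from grade 1.

```python
MONTHS_PER_SCHOOL_YEAR = 10  # stated in the text


def months_of_education(grade: int, months_into_grade: int) -> int:
    """Months of formal education received so far, counting from the start of grade 1.

    Assumption: completed grades count 10 months each, plus the months
    already spent in the current grade.
    """
    if grade < 1:
        raise ValueError("grade must be 1 or higher")
    if not 0 <= months_into_grade <= MONTHS_PER_SCHOOL_YEAR:
        raise ValueError("months_into_grade must be between 0 and 10")
    return MONTHS_PER_SCHOOL_YEAR * (grade - 1) + months_into_grade


def relative_achievement(age_equivalent_months: float, grade: int, months_into_grade: int) -> float:
    """Age equivalent score minus months of education; positive means ahead."""
    return age_equivalent_months - months_of_education(grade, months_into_grade)


# Hypothetical grade 2 student tested 5 months into the year with an AES of 20 months.
print(relative_achievement(20, grade=2, months_into_grade=5))  # 5
```

A result of 5 corresponds to the interpretation given in the text: the student is about 5 months ahead of the education received so far.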

#### **Mathematical abilities**

The Dutch standard CITO Mathematics test (CMT) was used to assess various mathematical abilities (Janssen et al., 2010). In the current study's grades the following math skills are covered: (a) number and number relations; (b) addition and subtraction; (c) multiplication and division; and (d) measuring (e.g., weights, length, surface, time).

#### **Spelling abilities**

The Dutch standard CITO Spelling test (CST) was used to assess implicit spelling abilities (de Wijs et al., 2010). Spelling ability for the current study's age group is tested by having children write 50 words (January grade 1) or sentences (June grade 1) dictated by their teacher. Starting from grade 2 there are two parts: (1) 25 dictated sentences; and (2) 25 questions where children have to pick out the sentence with the wrongly spelled word (in bold) out of four different sentences. All CST scores are rescaled to make the CST comparable across children.

#### Statistical Analysis

Data were analyzed using simple correlations and linear mixed-effects modeling in IBM SPSS version 23. All variables that were significantly skewed (SE > 3.0) were first log transformed (BRIEF Inhibit and Shift scale for both parent and teacher rating) or square root transformed (GNG number of false alarms, ROO number of errors part 3, BRIEF WM scale for both parent and teacher rating). A hierarchical mixed-model regression analysis, based on our hypotheses, with maximum likelihood estimation was used to test each hypothesized model explaining math or spelling achievement outcome. Analyses were performed for each type of EF (WM, inhibition, shifting) using all three methods (cognitive, teacher rating, and parent rating), and including IQ. A random intercept for class (n = 7) was included to control for the slight non-independence of our data due to students being nested in classes (multi-level data). The intra class correlation (ICC = Variance(intercept)/[Variance(intercept) + Variance(error)]) for the null model (intercept-only model) of math was 0.03 (3% of the variance was attributed to class level) and for spelling the ICC was 0.08. The difference in −2Log Likelihood between two adjacent nested models, which follows a χ<sup>2</sup> distribution with the difference in degrees of freedom between the two models as its degrees of freedom, was calculated, as was the difference in Schwarz's Bayesian Information Criterion (BIC). A BIC difference between two nested models can be considered a weak (0–2), a positive (2–6) or a strong (>6) indication for a better model (Raftery, 1995). A model was considered an improved model whenever the −2LL difference was significant (p < 0.05) and the BIC difference was bigger than 0. In each hierarchical model, IQ was entered first (Model 1). For math outcome, the next model included the cognitive EF measure (Model 2).
If this model was a significant improvement over the IQ-only model, a model adding the corresponding teacher's EF rating was estimated (Model 3). The matching parent's EF rating was entered after the teacher's rating (Model 4). For spelling outcome, Model 2 included the teacher's EF rating. If this model was a significant improvement over the IQ-only model, a model adding the corresponding cognitive EF measure was estimated (Model 3). The matching parent's EF rating was entered after the cognitive EF measure (Model 4). Whenever an EF measure did not significantly improve a previous model, we replaced this measure with the next EF counterpart measure (adding b or c to the model name). As only a small pool of not substantially correlated independent variables (see **Table 2**) was included in this study, we also ran mixed-model stepwise backward regression analyses. As similar results were found when using this method of model selection, we only report the hierarchical approach estimates in this paper, including fixed effects (intercept, regression weights) and random effect estimates (variance around the intercept and random error). Effect sizes were interpreted as: I. a small 'practically' significant effect (r or β ≥ 0.2 and <0.5); II. a moderate effect (r or β ≥ 0.5 and <0.8) or III. a strong effect (r or β ≥ 0.8) (Ferguson, 2009).
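The ICC and the model-comparison rule described in this section can be illustrated with a short sketch. It is not the SPSS procedure used in the study: the function names are hypothetical, all numbers in the example are made up, and the χ<sup>2</sup> p-value is computed only for the case where adjacent models differ by one parameter (df = 1), which is an assumption that fits a sequence in which each step adds a single EF measure.

```python
import math


def icc(var_intercept: float, var_error: float) -> float:
    """Share of total variance attributed to the class level."""
    return var_intercept / (var_intercept + var_error)


def lrt_p_value_df1(delta_neg2ll: float) -> float:
    """Upper-tail chi-square probability for df = 1 (assumes one added parameter)."""
    if delta_neg2ll <= 0:
        return 1.0  # the fuller model does not fit better
    return math.erfc(math.sqrt(delta_neg2ll / 2.0))


def bic_evidence(delta_bic: float) -> str:
    """Label a BIC difference with the cut-offs quoted in the text."""
    if delta_bic <= 0:
        return "none"
    if delta_bic <= 2:
        return "weak"
    if delta_bic <= 6:
        return "positive"
    return "strong"


def is_improved(neg2ll_simple: float, neg2ll_full: float,
                bic_simple: float, bic_full: float, alpha: float = 0.05) -> bool:
    """Improved model: significant -2LL difference and a BIC difference above 0."""
    delta_neg2ll = neg2ll_simple - neg2ll_full
    delta_bic = bic_simple - bic_full
    return lrt_p_value_df1(delta_neg2ll) < alpha and delta_bic > 0


# Made-up variance components and fit statistics, for illustration only.
print(round(icc(3.0, 97.0), 2))                 # 0.03
print(is_improved(500.0, 490.0, 520.0, 514.4))  # True
print(bic_evidence(520.0 - 514.4))              # positive
```

Both differences are taken as the simpler model minus the fuller model, so positive values favor the fuller model.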

## RESULTS

## Sample Description

Sample characteristics are shown in **Table 1**. Age (range 6–8 years) and gender (51.1% male) distributions were as expected. Children in this study were on average around 2 months ahead in math and spelling compared to a norm sample of Dutch peers, and had a somewhat higher estimated mean IQ score of 106. Comparing the educational level of the 164 parents in our sample to the educational level of the general Dutch population of 25- to 45-year-olds (N = 4,267,000), showed that the parents in our study were less likely to have a low educational level (11.6 vs. 33.6%; z = −5.96, p < 0.001), were more likely to have a medium educational level (48.8 vs. 28.3%; z = 5.83; p < 0.001), and equally likely to have a high



*† At time of Standardized CITO Math and Spelling test. ‡ % based on N = 164 parents using the Standard Classification of Education (SOI) 2006, edition 2014/15: "Low educational level (1)," including Primary and Lower secondary education (level 1 and 2 of the SOI); "Medium educational level (2)," including Upper secondary and Post-secondary non-tertiary education (level 3 and 4 of the SOI); "High educational level (3)," including Short cycle tertiary education and Bachelor's, Master's and Doctoral level (level 5–8 of the SOI; Dutch Central Bureau for Statistics [CBS], 2006, 2011). ¶ Difference between achievement level (expressed as equivalent to number of months of education) and number of months of education (10 months per grade). § The short form (Vocabulary and Block Design) estimates of full scale IQ for the Wechsler Intelligence Scale for Children for children aged 6–8 years old (WISC-III-NL; Kort et al., 2005) were obtained according to the algorithm: 2.9 × (sum of normed scores) + 42 (Campbell, 1998). STS, Spatial Temporal Span (raw score number of identified targets in correct order backwards); GNG, raw score number of false alarms–biased; ROO-3, raw score number of errors compatible and incompatible part 3; BRIEF, Behavior Rating Inventory of Executive Function.*

educational level (39.6 vs. 38.1%; z = 0.40; p = 0.689) (CBS, 2013). Around 12% of the children were referred to mental health care in the past year (95% Confidence Interval = 5.0–18.8%) for the assessment and/or treatment of various developmental, emotional and behavioral problems (e.g., problems with attention and hyperactivity, anxiety, conduct related problems, pervasive developmental problems). This percentage is significantly higher than the 5.9% referral rate found in a large (N = 1710) Dutch general population study of 6- to 18-year-olds (z = −2.23, p = 0.026) (Tick et al., 2008). Teachers in our sample scored their students significantly more often in the clinical range of WM problems (T-score ≥ 65 = 20.2%) compared to 7% of the BRIEF-teacher Dutch norm sample of 5- to 8-year-olds (N = 55) (z = −2.138, p = 0.032). No significant difference with the Dutch norm sample on the percentage of reported students in the clinical range was found for inhibition and shifting. Parents in our sample reported a similar percentage of children in the clinical range on all three BRIEF-parent scales compared to the Dutch BRIEF-parent norm sample of 5- to 8-year-olds (N = 311; all p > 0.05).
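The text does not state which test variant produced the z values for the comparisons with population percentages. As a hedged illustration, the sketch below assumes a one-sample z-test in which the population percentage is treated as the known null value; the function names are hypothetical. Under that assumption the parental education comparisons (N = 164) can be recomputed from the rounded percentages given in the text.

```python
import math


def one_sample_proportion_z(p_sample: float, p_population: float, n: int) -> float:
    """z statistic for a sample proportion against a known population proportion."""
    standard_error = math.sqrt(p_population * (1.0 - p_population) / n)
    return (p_sample - p_population) / standard_error


def two_sided_p(z: float) -> float:
    """Two-sided p-value under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Percentages from the text: sample of 164 parents vs. the general Dutch population.
for label, p_sample, p_population in [
    ("low", 0.116, 0.336),
    ("medium", 0.488, 0.283),
    ("high", 0.396, 0.381),
]:
    z = one_sample_proportion_z(p_sample, p_population, n=164)
    print(f"{label}: z = {z:.2f}, p = {two_sided_p(z):.3f}")
```

Because the inputs are rounded percentages, the recomputed values can differ slightly from the reported ones in the last decimal.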

## Correlations between EF, IQ, and School Achievement

Correlations between all measures are reported in **Table 2**. Both standardized measures of math and spelling were significantly correlated with all three types of WM measures (|r| range = 0.28–0.43), which were significantly interrelated amongst themselves as well (|r| range = 0.25–0.31). Math and spelling were also significantly associated with the cognitive shifting measure, as was spelling with the teacher shifting problems rating. All effects were within the small range. None of the inhibition measures were related to school achievement. Parent-teacher cross-informant agreement on similar EFs was significant in all cases and within the small range, while the cross-informant correlations between different types of EF were higher and in the moderate range. Intelligence level was significantly associated with math achievement (r = 0.41) and with the teacher's rating of WM problems (r = −0.31), but not with spelling achievement or any of the other EF measures. Furthermore, no significant correlation between age and any of the EF variables was found in this sample of 6- to 8-year-olds.

## Math Achievement: Shared and Unique Influence of EF Measures

In the best mixed models explaining math achievement (see **Table 3**), standardized math achievement was uniquely associated with intelligence level (b<sup>∗</sup> ranging from 0.34 to 0.38), the cognitive measure of WM (b<sup>∗</sup> (number correct) = 0.35), and the cognitive measure of shifting (b<sup>∗</sup> (number of errors) = −0.22), all with an effect size within the small range. None of the inhibition measures had a direct impact on math achievement. None of the teacher's or the parent's EF ratings added any unique variance to their cognitive EF counterpart in relation to math achievement. As age was uncorrelated with any of the outcome or the EF measures (see **Table 2**), including age in the analysis did not make a difference to the final results. Similar



*\*p < 0.05; \*\*p < 0.001. WM, working memory; BRIEF, Behavior Rating Inventory of Executive Function; STS, Spatial Temporal Span (number of identified targets in correct order backwards); GNG, Go-NoGo; ROO-3, Response Organization Objects-part 3; Bold, monotrait–heteromethod correlations; Italic, heterotrait–monomethod correlations; regular, heterotrait–heteromethod correlations.*

results for EF on math were found when IQ was excluded from the analysis, showing somewhat higher standardized regression weights for WM (b<sup>∗</sup> = 0.43) and shifting (b<sup>∗</sup> = −0.29), as shared variance with IQ was not corrected for.

## Spelling Achievement: Shared and Unique Influence of EF Measures

The best mixed models for spelling outcome (see **Table 4**) showed that both teacher-rated WM problems (b<sup>∗</sup> = −0.34) and the cognitive WM measure (b<sup>∗</sup> (number correct) = 0.29) uniquely explained differences in spelling achievement, while IQ did not. A similar result was found for shifting, with both teacher-rated problems with shifting (b<sup>∗</sup> = −0.24) and the cognitive shifting measure (b<sup>∗</sup> (number of errors) = −0.27) accounting for spelling differences. All effect sizes were within the small range. None of the inhibition measures were related to spelling achievement, nor were any of the parent EF ratings. As age was uncorrelated with any of the outcome or the EF measures (see **Table 2**), including age in the analysis did not make a difference to the final results. Excluding IQ from the model resulted in similar findings for EF with regard to spelling achievement.

## CONCLUSION AND DISCUSSION

The aim of the present study was to develop a better understanding of the interrelations between cognitive EF measures and behavioral EF ratings from both parents and teachers and to investigate their shared and unique influence on math and spelling achievement in first and second graders. A novel aspect of this study is the inclusion of EF ratings from multiple informants concurrently with cognitive EF performance measures to explain differences in school achievement. Furthermore, little research on the relation between EF and spelling has been published, especially in typically developing children using multiple modes of EF assessments. Analyses included IQ, a confounding factor for both school achievement and EF.

The main findings of this study were that the cognitive WM measure was correlated with its parent- and teacher-reported behavioral WM counterpart, and that all WM measures were significantly associated with school achievement. Furthermore, both the cognitive shifting and the teacher-reported behavioral shifting measure were also related to school achievement. None of the inhibition measures were significantly correlated with school outcome. Modest correspondence was observed between parents' and teachers' ratings of children's behavioral EF. Cognitive performance and teachers' ratings of WM and shifting concurrently explained differences in spelling achievement. However, teachers' behavioral EF ratings did not explain any additional variance in math outcome above IQ and cognitive EF performance. Parents' behavioral EF ratings did not add any unique information to either outcome measure.

In comparing similar cognitive and behavioral aspects of EF, a significant and modest monotrait-multimethod correlation was only found between cognitive and behavioral ratings of WM. Thus, visual spatial working memory performance was somewhat linked to real-life WM problems that were observed by others, like forgetting what one was doing and having trouble remembering things at school or at home. Furthermore, modest correlations between parent and teacher ratings across all comparable EFs were found. These modest relations were consistent with findings by Toplak et al. (2013) and cross-informant findings in the related field of child


TABLE 3 | Mixed model hierarchical regression analyses results of best model explaining MATH outcome (N = 84) for each type of EF using multiple methods, IQ, and with random intercept for class (n = 7).

*† Whenever the −2LL difference between the fuller model and the adjacent nested, more parsimonious model (lower number) is significant and the BIC difference > 0, fixed and random estimates of the best model are reported. ICC, intra class correlation; ∆−2RLL, −2Log Likelihood difference between two adjacent nested models (∆df = difference in degrees of freedom between two adjacent nested models) following χ<sup>2</sup> distribution; ∆BIC, difference in Schwarz's Bayesian Criterion between two adjacent nested models; p (∆ nested model), significance level improvement of adjacent more parsimonious model; b, regression weight; SE, Standard Error; b\*, standardized regression weight; Var(intercept), variance attributed to class; Var(error), random error; WM, Working Memory; STS, Spatial Temporal Span; GNG, Go-NoGo; ROO-3, Response Organization Objects-part 3; BRIEF-p, Behavior Rating Inventory of Executive Functioning–parent rating; BRIEF-t, BRIEF–teacher rating.*

psychopathology (Achenbach et al., 1987). Teachers perceived on average similar amounts of EF problem behavior compared to parents, but they only modestly agreed on which children had relatively more or less EF problems. This was also true for reporting the presence of a clinical level of EF problems (T-score ≥ 65). Teachers in our sample were, compared to a norm sample of peers, more likely to report a clinical level of EF problems than parents did; this was especially true for WM. The observed absent or modest monotrait-multimethod correlations suggest that each type of EF measure taps different aspects of EF across different situations and under variable conditions. Furthermore, the similar or even higher multitrait-monomethod correlations point to method variance caused by rater biases, e.g., halo and leniency bias, and test impurity problems.

#### Math Achievement

Based on the presented evidence we expected cognitive measures of WM, inhibition and shifting to be correlated to math


TABLE 4 | Mixed model hierarchical regression analyses results of best model explaining SPELLING outcome (N = 84) for each type of EF using multiple methods, IQ, and with random intercept for class (n = 7).

*† Whenever the −2LL difference between the fuller model and the adjacent nested, more parsimonious model (lower number) is significant and the BIC difference > 0, fixed and random estimates of the best model are reported. ‡ IQ is left in the model to control for confounding even though the BIC difference < 0. ICC, intra class correlation; ∆−2RLL, −2Log Likelihood difference between two adjacent nested models (∆df = difference in degrees of freedom between two adjacent nested models) following χ<sup>2</sup> distribution; ∆BIC, difference in Schwarz's Bayesian Criterion between two adjacent nested models; p (∆ nested model), significance level improvement of adjacent more parsimonious model; b, regression weight; SE, Standard Error; b\*, standardized regression weight; Var(intercept), variance attributed to class; Var(error), random error; WM, Working Memory; STS, Spatial Temporal Span; GNG, Go-NoGo; ROO-3, Response Organization Objects-part 3; BRIEF-p, Behavior Rating Inventory of Executive Functioning–parent rating; BRIEF-t, BRIEF–teacher rating.*

achievement (e.g., Yeniad et al., 2013; Friso-van den Bos et al., 2013; Gerst et al., 2015; Ten Eycke and Dewey, 2016). Our study confirmed these findings, except for inhibition. Our finding that inhibition did not have a direct relation with math was in contrast to findings from a meta-analysis of 4- to 12-year-old children (Friso-van den Bos et al., 2013), and from recent studies in 9- to 11-year-olds (Gerst et al., 2015), and in 5- to 18-year-olds (Ten Eycke and Dewey, 2016), although the meta-analysis of Friso-van den Bos et al. (2013) also showed that WM had the strongest relation to math, and that inhibition and shifting showed the weakest relation. Furthermore, our findings also differed from previous findings linking inhibition to emerging math skills in preschoolers and kindergartners (e.g., Espy et al., 2004; Blair and Razza, 2007; Allan et al., 2014). Perhaps only more extreme levels of inhibitory problems affect math outcome negatively, or inhibition is more likely to play a role in children with mathematical disorders or from economically disadvantaged families, which were included in the meta-analyses of Allan et al. (2014) and Friso-van den Bos et al. (2013). In fact, the meta-analysis of Friso-van den Bos et al. (2013) showed that children with math, psychological or physical problems have stronger associations between EF and math outcome. The children in our study were not at risk for mathematical problems nor inhibition problems, and predominantly came from families with medium to high socio-economic backgrounds. This study also showed that the influence of EF on math is in addition to the effect of IQ, which is in line with previous research (e.g., George and Greenfield, 2005; Alloway and Alloway, 2010; Preßler et al., 2013; Yeniad et al., 2013; Gerst et al., 2015; Dekker et al., 2016), and underscores the suggestion that IQ cannot be considered a proxy of EF or vice versa.

Based on the study of Gerst et al. (2015), we expected that only for WM a behavioral measure, most likely the teacher's rating, would add unique variance to the cognitive WM measure and IQ in explaining math performance. Unlike Gerst et al. (2015), we did not observe a similar impact for the teacher WM rating, nor for the parent rating of WM, although the latter measure was borderline significant. Nevertheless, comparable to Gerst et al. (2015), our results showed that none of the behavioral measures of inhibition or shifting added any unique variance explaining math outcome besides IQ. Thus, for math achievement we were able to confirm most of the findings of Gerst et al. (2015) in a younger age group, while also including parent EF ratings.

## Spelling Achievement

Based on research about the relation between EF and spelling, we expected the cognitive measure of WM to be related to spelling outcome (e.g., Fischbach et al., 2013; Preßler et al., 2013; Becker et al., 2014). We could confirm that WM was related to spelling performance. Our results also extend the previous finding by Altemeier et al. (2008) that in typically developing first to fourth graders shifting ability is related to spelling, although we could not confirm their finding of a significant relation between inhibition and spelling. Inhibition and emerging writing skills have also been linked in preschoolers (Blair and Razza, 2007; McClelland et al., 2007; Brock et al., 2009). Altemeier et al. (2008) used a verbal word-color naming task to assess inhibition and shifting, while in our study we used nonverbal tasks. Perhaps, measures of verbal inhibition have a stronger association with spelling skills than non-verbal measures. Research in math, for example, has shown that visual spatial WM is more strongly related to learning something new, while verbal working memory is more related to learned math skills, which are typically evaluated through standardized achievement tests that are also used in this study (Van de Weijer-Bergsma et al., 2015). Similar differences across different stages of spelling attainment might also be observed for inhibition. Future research is needed to address the relative impact of verbal vs. visual spatial performance based EF measures in relation to various school outcomes and taking into account different stages of the learning process (e.g., acquiring or mastering).

No previous publications have considered the joint impact of different EF measures on spelling. We based our expectations, i.e., teacher's EF ratings having the biggest influence, and the cognitive measure of WM also adding variance, on the findings by Gerst et al. (2015) on another language related outcome, i.e., reading comprehension. In our study we found that both teacher behavioral ratings and cognitive measures of WM and shifting were related to spelling outcome, partially confirming our tentative hypotheses. Thus, real life application of WM and shifting skills at school helps to explain differences in spelling outcome concurrently with their cognitive counterparts. Spelling in this study was assessed through a dictation test, which might ask for different EF skills compared to a general math achievement test, although in first grade the math questions were also read out loud by the teacher. Perhaps attentional processes play a bigger role during dictation tests. Indeed, parent and teacher ratings of inattention in children with emotional and behavioral problems have previously been associated with behavioral EF ratings on the BRIEF (McAuley et al., 2010), which might partially explain the contribution of behavioral EF ratings concurrently with cognitive EF measures in explaining differences in spelling outcome.

In sum, although the ecological validity of cognitive performance-based tasks has been questioned, this study confirmed that cognitive EF measures actually explained most unique variance in math outcome compared to behavioral EF measures. This study also provides support for the ecological validity of performance- and teacher rating-based EF tasks by showing that both measures have a complementary role in identifying spelling achievement problems. Furthermore, both WM and shifting abilities were related to school achievement in general rather than to a specific domain.

Several study limitations need to be acknowledged. First of all, children from only two Dutch schools in the same provincial region were included in this study: one school from a rural area and a second school from a town that is part of the metropolitan area of Rotterdam and The Hague. Although the distributions of our independent and outcome measures seem to represent levels of typically developing children, with the exception of teacher-reported level of clinical WM problems, it is clear that the children in our study are not representative as far as the educational level of their parents is concerned. Children of parents with a low educational level are underrepresented, and our results cannot be generalized to this group. Our low-risk sample might have resulted in weaker relations between EF and school achievement than those found in other studies comprising at-risk samples (e.g., Waber et al., 2006; Gerst et al., 2015). Stronger associations between EF and math outcome exist in children with relatively more math, psychological or physical problems, as was shown in the meta-analysis of Friso-van den Bos et al. (2013). Secondly, the inclusion of more classes from more schools would have given more reliable estimates of random variation around the intercept for class. Thirdly, this study used a cross-sectional design, so we could not study the differential predictive power of the various EF measures nor the development of EF in relation to school achievement over time, which precludes any causal inferences. Finally, it might be possible that the inclusion of teacher-based math and spelling grades could have resulted in a different pattern of the relative contribution of each type of EF measure, as grades might share more variance with behavioral measures.

Despite these limitations, the observation that WM and shifting were related to spelling and math outcome, regardless of the child's IQ level, points in the direction of possible benefits from stimulating EF skills in young children in addition to extra domain specific instruction, to optimize school performance. There is some evidence that school-based and computerized interventions aimed at improving EF skills have promising cognitive outcomes in young children (Thorell et al., 2009; Diamond and Lee, 2011; Diamond, 2012; Wass, 2015), although questions remain concerning the actual causal mechanisms involved in improving school achievement. For example: To what extent do these interventions directly train academic achievement? Or to what level do these interventions improve EF by reducing EF suppressors like anxiety, depressive feelings, sleep deprivation or low physical activity level? (Jacob and Parkinson, 2015; Diamond and Ling, 2016). Other remaining questions are the transfer of EF skills, the heterogeneity or homogeneity of the training regime, how long benefits last, and which children benefit the most. There is some indication that younger children and children from at risk groups (e.g., economically disadvantaged background, poor EF) benefit more from EF training (Diamond, 2012; Wass, 2015). Nevertheless, identifying and monitoring each child's EF strengths and weaknesses, especially in the WM and shifting domain might help teachers and other caregivers to broaden their range of remedial intervention options to optimize school achievement. This study's findings also show that both types of EF measures, cognitive performance tasks and teacher's behavioral rating scales, complement each other in explaining spelling achievement and suggest that both could be used to identify likely candidates for additional support.

Future research is needed to cross-validate our final models, and to compare the impact of each type of EF measure across a wider age range of students, preferably longitudinally, to detect developmental differences, and across more school achievement domains, using both verbal and non-verbal cognitive EF measures. Also, within certain domains, e.g., mathematics, it might be informative to study independent aspects of math (e.g., factual, procedural, conceptual; Raghubar et al., 2010).

## ETHICS STATEMENT

This study was carried out in accordance with the standards of the Ethical Committee of the Leiden Institute of Education and Child Studies. All parents of subjects (minors) gave written informed consent in accordance with the Declaration of Helsinki. The protocol was approved by the Ethics Committee of the Leiden Institute of Education and Child Studies at Leiden University (ECPW-2010/016).

## AUTHOR CONTRIBUTIONS

MD was involved in the conception and design of the work, data collection, data analysis and interpretation, drafting the article, critical revision of the article and gave her final approval of the version to be published. TZ was involved in data interpretation, critical revision of the article and gave his final approval of the version to be published. AS was involved in the design of the work, data collection, data interpretation, critical revision of the article and gave her final approval of the version to be published. HS was initiator of the Curious Minds study and was involved as project leader in the conception and design of this work, data interpretation, critical revision of the article and gave her final approval of the version to be published.

## FUNDING

This research is funded by the Curious Minds Program, which is supported by the Dutch Ministry of Education, Culture and Science and the National Platform Science & Technology.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Dekker, Ziermans, Spruijt and Swaab. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

## The Role of Attention Shifting in Orthographic Competencies: Cross-Sectional Findings from 1st, 3rd, and 8th Grade Students

Antje von Suchodoletz<sup>1,2</sup>\*, Anika Fäsche<sup>1</sup> and Irene T. Skuballa<sup>1,2</sup>

<sup>1</sup> Department of Psychology, University of Freiburg, Freiburg, Germany, <sup>2</sup> Department of Psychology, New York University Abu Dhabi, Abu Dhabi, United Arab Emirates

Attention shifting refers to one core component of executive functions, a set of higher-order cognitive processes that predict different aspects of academic achievement. To date, few studies have investigated the role of attention shifting in orthographic competencies during middle childhood and early adolescence. In the present study, 69 first-grade, 121 third-grade, and 85 eighth-grade students' attention shifting was tested with a computer version of the Dimensional Change Card Sort (DCCS; Zelazo, 2006). General spelling skills and specific writing and spelling strategies were assessed with the Hamburger Writing Test (May, 2002). Results suggested associations between attention shifting and various orthographic competencies that differ across age groups and by sex. Across all age groups, better attention shifting was associated with fewer errors in applying alphabetical strategies. In third graders, better attention shifting was furthermore related to better general spelling skills and fewer errors in using orthographical strategies. In this age group, associations did not differ by sex. Among first graders, attention shifting was negatively related to general spelling skills, but only for boys. In contrast, attention shifting was positively related to general spelling skills in eighth graders, but only for girls. Finally, better attention shifting was associated with fewer case-related errors in eighth graders, independent of students' sex. In sum, the data provide insight into both variability and consistency in the pattern of relations between attention shifting and various orthographic competencies among elementary and middle school students.

Keywords: attention shifting, spelling, cross-sectional study, elementary school children, secondary school children, gender differences, cohort study

#### Edited by:

Jacob A. Burack, McGill University, Canada

#### Reviewed by:

Thomas Lachmann, Kaiserslautern University of Technology, Germany

Natalie Ann Munro, The University of Sydney, Australia

> \*Correspondence: Antje von Suchodoletz avs5@nyu.edu

#### Specialty section:

This article was submitted to Developmental Psychology, a section of the journal Frontiers in Psychology

Received: 31 January 2017 Accepted: 11 September 2017 Published: 26 September 2017

#### Citation:

von Suchodoletz A, Fäsche A and Skuballa IT (2017) The Role of Attention Shifting in Orthographic Competencies: Cross-Sectional Findings from 1st, 3rd, and 8th Grade Students. Front. Psychol. 8:1665. doi: 10.3389/fpsyg.2017.01665

## INTRODUCTION

Attention shifting, one core component of executive functions, is defined as the ability to flexibly shift "back and forth between multiple tasks, operations, or mental sets" (Miyake et al., 2000, p. 55). Spelling mastery appears to require children to flexibly shift between multiple demands that are embedded in the process of transforming a spoken word into written symbols (Lubin et al., 2016). For example, recognizing smaller units of meaning and sound, retrieving the correct letter or letter combination for each sound, and finally writing the letter in the correct form requires the flexible shifting of attention (Aram et al., 2014; Blair and Raver, 2015). In addition, spelling requires one to shift between strategies, lexical and non-lexical strategies in particular, when decoding and spelling words (Sheriston et al., 2016).

The ability to voluntarily focus or shift attention as needed develops during the early elementary school years, between 7 and 9 years of age (Anderson, 2010). Attention shifting continues to improve throughout middle childhood and becomes relatively mature by the beginning of adolescence (Anderson, 2010). A variety of measures exist to assess attention shifting. However, only a few can be used at different stages of the lifespan and across age groups, one of which is the computer-based version of the Dimensional Change Card Sort (DCCS; Zelazo, 2006). The DCCS requires participants to sort objects by two dimensions, color and shape. Preschool children are able to switch tasks as long as the stimuli vary along only one dimension (Diamond et al., 2005). As they grow older, accuracy on the DCCS increases, as children are able to switch from sorting by either color or shape to sorting by the other (Diamond et al., 2005). Most children are able to complete the DCCS accurately around the age of school entry (Diamond and Kirkham, 2005). However, there is evidence that, despite high accuracy rates in sorting objects by switching dimensions, attentional inertia persists. Diamond and Kirkham (2005) used a computer-based version of the DCCS with young adults (i.e., undergraduate college students in their early twenties). While the participants were able to switch sorting dimensions, their reaction time pattern was similar to the accuracy pattern among young children, i.e., reaction time was significantly slower when the sorting criterion changed (Diamond and Kirkham, 2005). Evidence that the computer-based version of the DCCS captures individual differences in attention shifting among children and adults suggests that the DCCS is an appropriate measure for investigating attention shifting at different stages of the life span and across a wide age range.

## The Acquisition of Spelling

Spelling is an important prerequisite for competent writing and predicts a number of literacy outcomes at later ages (Temple et al., 1982; Aram et al., 2014). Spelling in alphabetic orthographies can be defined as the ability to transform a spoken word into written symbols on the page (Berninger et al., 1996). Learning to spell means being able to map phonemes (i.e., units of speech) onto letters (i.e., units of print), and to understand that letters primarily represent sounds in language rather than meaning (McBride-Chang, 2004; Aram et al., 2014). Three sequential schemes of early spelling development have been suggested: graphic, writing-like, and symbolic writing (Levin and Bus, 2003). Writing in the graphic phase is characterized by the spontaneous production of small graphic forms and shapes. As soon as children know that letters, not pictures or shapes, represent print units, they move to the next phase, although they might not yet understand the relation between letter names and their sounds (McBride-Chang, 2004). During the preschool period, children discover writing-like features (Temple et al., 1982). Once they reach the phase of symbolic writing children are able to use symbolic units and move from phonetic writing to conventional spelling (Temple et al., 1982; Levin and Bus, 2003).

Although letters typically map onto phonemes in alphabetic orthographies, the writing system also contains non-alphabetic aspects (Nagy et al., 2006). As children learn to spell, they acquire knowledge about morphology and orthographic patterns. Such knowledge is successively incorporated in children's attempts to spell as they learn to conform to the standard spelling rules of their language (McBride-Chang, 2004). Most research on spelling acquisition has focused on the early childhood years. However, spelling development continues after school entry across all years of schooling when children are increasingly confronted with words of irregular spelling patterns, abstract words, and complex clause types that require specialized knowledge of spelling rules (Temple et al., 1982; McBride-Chang, 2004; Christie and Derewianka, 2010). While children have some basic morphological knowledge as early as in first grade (Treiman and Cassar, 1996), their use of morphological strategies is still fragile and not reliably reflected in their spellings until after third grade (Nagy et al., 2006). Similarly, basic orthographic knowledge emerges early in spelling acquisition, i.e., in kindergarten and first grade (Treiman and Bourassa, 2000). It is not until the later school years, however, that children can reliably incorporate their knowledge about orthographical strategies in their spellings (Treiman and Bourassa, 2000). For example, knowledge of allowable consonant doublets (i.e., two-letter spellings that typically occur in the middle and at the end of words) emerges in first grade, but proficiency in applying knowledge of orthographic patterns is not reached until sixth grade and above (Cassar and Treiman, 1997; Treiman and Bourassa, 2000). However, spelling is typically studied from a word-level perspective, thus limiting conclusions about the role of morphology and knowledge of orthographic units in spelling.

## Attention Shifting and Spelling

Early attention shifting supports young children's acquisition of precursor skills to the development of later spelling, such as letter/alphabet knowledge and print awareness (Blair and Razza, 2007; Bierman et al., 2008). As children grow older and enter formal schooling, attention shifting helps them to develop adaptive learning strategies and apply them flexibly to changing task demands. During the early elementary school years, children are increasingly confronted with non-alphabetic aspects of the writing system that require them to flexibly shift between several mental tasks, including retrieving the spelling of words from memory, applying orthographic patterns, or using phoneme-grapheme correspondence rules (Lubin et al., 2016). Although various executive function components contribute to spelling acquisition, Lubin et al. (2016) found that attention shifting seems to be particularly predictive of spelling outcomes among elementary school children. Fourth-grade children were administered the Creature Counting subtest from the Test of Everyday Attention for Children (Manly et al., 1999) to measure attention shifting. The test requires children to use arrows as a cue to switch the direction of their counting. Spelling outcomes were assessed with a dictation test. After controlling for child age, sex, and nonverbal intelligence, executive function skills explained 19% of the variance in the spelling outcome. However, the findings indicated that only attention shifting significantly contributed to explaining the variance in children's spelling skills.

While the relation between attention shifting and literacy skills is well-established in samples of English-speaking children (Blair and Razza, 2007; Bierman et al., 2008), there is some evidence suggesting that associations might be different in other languages. Among French-speaking kindergarten children, for example, attention shifting and emergent literacy skills were not significantly associated (Monette et al., 2011). In another study with a sample of Dutch-speaking elementary school students, the relation between executive functions and reading was found to be negative (van der Sluis et al., 2007).

## Sex Differences in Spelling and Attention Shifting

The sex achievement gap suggests that boys may be at greater risk for school difficulties than girls (Matthews et al., 2009; Rimm-Kaufman et al., 2009; DiPrete and Jennings, 2012; Wanless et al., 2013). Girls frequently outperform boys on a wide range of measures of school achievement across different learning domains. For reading, for example, girls' advantage corresponds to approximately one school year's progress [PISA<sup>1</sup> 2009 study: Organization for Economic Co-operation and Development (OECD), 2010]. Girls also demonstrated higher writing competence (Pajares and Valiente, 1999) and written orthographic fluency (Berninger and Fuller, 1992) compared to boys. The pattern of sex differences in spelling skills among typically developing writers was also replicated in samples of children with dyslexia and their parent with dyslexia (Berninger et al., 2008a). In both samples, the male participants were consistently more impaired than their female counterparts in measures of spelling and orthographic skills. In search of an explanation, McGeown et al. (2013) argue that sex differences may be related to differences in girls' and boys' strategy preferences and their ability to use strategies effectively. Strategies may be specific to the learning domain (such as orthographical spelling strategies) or domain-general (such as executive functions). Individual differences in executive functions might be related to sex differences in orthographic competencies (Berninger et al., 2008b). Indeed, there is considerable research to suggest that girls may be more efficient at using executive functions (Wilson, 2003; Sabbagh et al., 2006; Matthews et al., 2009; Rimm-Kaufman et al., 2009; Wanless et al., 2013). Specifically, sex differences in attention shifting in favor of girls have been reported across age groups (Klenberg et al., 2001) and various measures of attention shifting (Klenberg et al., 2001; Wilson, 2003). 
Neuropsychological differences may underlie girls' advantage in attention shifting. Feng et al. (2011) found greater brain activation (interpreted as the use of more attention resources) for women compared to men when completing an attention shifting task among students in their early twenties (mean age in both groups was 21.9 years).

Together, these studies suggest that both spelling and attention shifting skills differ by sex. However, it remains unclear whether there are (a) sex differences in the relation between attention shifting and spelling, and whether (b) sex differences are similar or different across age groups. While some research suggests that the gap between boys and girls in executive functions increases from childhood to adolescence (Else-Quest et al., 2006; Matthews et al., 2009), other evidence indicates that boys could catch up in their executive function skills as they grow older (Gunzenhauser and von Suchodoletz, 2015).

## The Present Study

The goal of the present study was to investigate whether attention shifting, one core component of executive functions, is equally important for orthographic competencies at different ages and for boys and girls. By using the same measure of attention shifting (a computer version of the DCCS) and grade-appropriate versions of the same spelling task in a sample of first, third, and eighth graders, the study aimed to provide new information on individual differences in attention shifting in middle childhood and early adolescence and its associations with spelling competencies. By doing so, we addressed gaps of prior research regarding differences in task characteristics that limited strong conclusions (Best et al., 2011; Cuevas et al., 2014). The selection of age groups was based on previous literature suggesting that voluntary attention shifting starts to emerge during the early elementary school years and becomes relatively mature by the beginning of adolescence (Anderson, 2010). Moreover, the contribution of attention shifting might depend on the specific outcome being measured. We therefore investigated general spelling skills as well as specific spelling skills, including alphabetical, orthographical, and morphological strategies. Understanding these associations can help to determine the extent to which attention shifting relates to which particular aspect of spelling.

Two research questions guided the present study: (1) Does attention shifting relate to orthographic competencies across all age groups? Specifically, we tested for cross-group invariance between first, third, and eighth graders. (2) Are there sex differences in the relation between attention shifting and orthographic competencies within each age group? We tested for cross-group invariance regarding sex within the samples of first, third, and eighth graders.

## METHODS

## Participants and Procedure

The participants were 275 school-aged children (51% girls) from South-West Germany. Sixty-nine students were in Grade 1, 121 in Grade 3, and 85 in Grade 8. The students' mean age was 7.23 years in Grade 1 (SD = 0.39), 8.47 years in Grade 3 (SD = 0.45), and 13.99 years in Grade 8 (SD = 0.40). All of the children were typically developing insofar as they were not enrolled in special education or special needs programs at their school. The first- and third-grade students were recruited from public primary schools; the eighth-grade students were recruited from the highest track of the public German secondary school system, i.e., the Gymnasium. For 78% of the first graders, 68% of the third graders, and 87% of the eighth graders, German was the primary language spoken at home.

<sup>1</sup>Program for International Student Assessment (PISA).

To ensure that the protocol conformed to ethical standards, the study protocols in Grade 1 and Grade 3 were reviewed by the ethics committee of the German Psychological Association (DGPs). The study protocol in Grade 8 was reviewed by the Department of Psychology, Educational, and Developmental Psychology, at the University of Freiburg, Germany. All of the participants were recruited in their schools. Data were collected only from the children whose parents gave their informed written consent. Recruitment differences resulted in different sizes of the subsamples, in particular, the larger sample of the third-grade students. The third-grade students represent a randomly selected subsample drawn from a sample of over 700 students recruited from 56 classrooms in 34 schools (details can be found elsewhere: von Suchodoletz et al., 2015) who completed an additional data collection that included the measures used for the present analyses. The participants in Grade 1 were recruited from a database of families who participated in research when the children were younger (details can be found elsewhere: Gestsdottir et al., 2014) and had volunteered to be contacted for future research. The participants in Grade 8 were recruited specifically for this study.

The first-grade and third-grade students were tested at the local university, whereas the eighth-grade students were tested at their school. The children were administered the writing test first, followed by the computer version of the DCCS (presented on a laptop). It took between 30 and 45 min to complete the tasks. All of the participants received small incentives for their participation in the study.

## Measures

#### Attention Shifting

A computer-based version of the DCCS was used to measure attention shifting (DCCS; Diamond and Kirkham, 2005; Zelazo, 2006; Blankson and Blair, 2016). The DCCS has been shown to be a reliable and valid measure of attention shifting across a wide range of ages. For example, children's performance on the DCCS correlates with other measures of executive functions (Zelazo, 2006). The computer-based version of the DCCS has also been used with adult populations and has been proven to reliably capture individual differences in attention shifting (accuracy and reaction time) among adults (Diamond and Kirkham, 2005; von Suchodoletz et al., 2017).

Stimuli were presented on a laptop screen. The task required the participants to match a target stimulus presented at the top of the screen with two pictures that varied along two dimensions, i.e., color and shape, and appeared at the bottom corners of the screen. To match the pictures, the participants were instructed to press one of two yellow-marked keys on opposite sides of the laptop keyboard to indicate the location of their selection. In addition, a word (either color or shape) was presented at the top of the screen and spoken by a prerecorded voice to cue the participants to match the target picture with the correct corresponding picture on the bottom of the screen. Following a practice trial block, the participants were first asked to correctly sort the stimuli by one dimension (e.g., sort by shape; pre-switch block) and then to switch and sort the stimuli by the other dimension (e.g., color; post-switch block). The final block consisted of mixed trials.

The participants' accuracy (i.e., percent correct) and reaction times (averaged for all correct trials) were recorded across the pre-switch, post-switch, and mixed blocks. Trials in which the response was registered earlier than 200 ms or later than 3,000 ms after the onset of the stimulus were excluded from the analyses (Diamond and Kirkham, 2005). The mean percentage of correct responses, ranging between 85 and 95% across the participants (see **Table 1**), was similar to a study with undergraduate college students (Diamond and Kirkham, 2005). In the current study, attention shifting was measured in terms of inverse efficiency, i.e., average reaction time for all correct trials divided by accuracy (Spence et al., 2001; Schicke et al., 2009). Inverse efficiency scores provide a more psychometrically accurate representation of processing efficiency than using accuracy (i.e., proportion of correct responses) and reaction time as separate variables (Yang et al., 2014). The assumption underlying inverse efficiency scores is that "differences in reaction time performance would decrease if differences in accuracy [were] large but would remain the same if accuracy [were] identical" (Ding et al., 2014, p. 91). Inverse efficiency scores account for possible speed-accuracy trade-offs (i.e., slow responses but fewer errors, and vice versa; Spence et al., 2001; Kitagawa and Spence, 2005; Holmes et al., 2007; Schicke et al., 2009). In the present data, a lower inverse efficiency score reflected better attention shifting skills.
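The scoring rule described above can be illustrated with a minimal Python sketch. It is not the authors' scoring script: the function name `inverse_efficiency`, the trial format (reaction time in ms paired with a correct/incorrect flag), and the example trials are hypothetical. The sketch also assumes that accuracy is expressed as a proportion and computed over the trials that survive the 200 to 3,000 ms exclusion rule, which the text does not state explicitly.

```python
from statistics import mean


def inverse_efficiency(trials, min_rt=200, max_rt=3000):
    """Return mean correct reaction time (ms) divided by proportion correct.

    `trials` is a list of (reaction_time_ms, is_correct) tuples.
    Trials faster than `min_rt` or slower than `max_rt` are dropped first
    (assumption: accuracy is computed on the remaining trials).
    """
    valid = [(rt, ok) for rt, ok in trials if min_rt <= rt <= max_rt]
    correct_rts = [rt for rt, ok in valid if ok]
    if not correct_rts:
        raise ValueError("no correct trials within the allowed time window")
    accuracy = len(correct_rts) / len(valid)
    return mean(correct_rts) / accuracy


# Hypothetical trials: two are excluded (150 ms and 3500 ms),
# leaving four valid trials of which two are correct.
example_trials = [
    (150, True),
    (600, True),
    (800, True),
    (1000, False),
    (1200, False),
    (3500, True),
]
score = inverse_efficiency(example_trials)
print(score)  # 700 ms mean correct RT / 0.5 accuracy = 1400.0
```

A lower score means faster and/or more accurate responding, which is why lower values are read as better attention shifting.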

#### Orthographic Competencies

The Hamburg Writing Test for first to ninth graders was used to measure orthographic competencies (German: Hamburger Schreibprobe 1–9; May, 2002). Age-appropriate versions for first, third, and eighth grade were used. The test requires the participants to write words and short sentences that are read aloud to them. The version for the eighth graders also includes a text with mistakes to be corrected. The test has shown good re-test reliability (0.92–0.99) and high predictive validity (for example, correlations with school essay writing of r<sup>2</sup> = 0.78–0.82; May, 2002). The test provides a profile for each student of the general spelling skills and specific spelling strategies, including alphabetical, orthographical, and morphological strategies. General spelling skills were measured by the number of correctly written words and graphemes. The test comprises 14 words/61 graphemes (Grade 1), 38 words/191 graphemes (Grade 3), and 49 words/339 graphemes (Grade 8) to compute general spelling ability. In the present data, higher scores reflected higher general spelling skills (i.e., more correct words and graphemes).

The alphabetical strategy refers to all word positions that can be spelled correctly by applying phonological rules. The number of positions in the different versions is 15 (Grade 1), 20 (Grade 3), and 30 (Grade 8). The orthographical strategy was coded for all word positions for which knowledge of orthographic units, i.e., abstract letter-by-letter strings (Frith, 1985), is required. The orthographical strategy is distinguished from the alphabetical strategy as these word positions cannot be spelled correctly by applying alphabetical knowledge only (for example, a letter sound can be represented in several ways: the sound x can be written as x, chs, ks, cks, or gs). The number of positions in the different versions is 10 (Grade 1), 15 (Grade 3), and 41 (Grade 8). The morphological strategy is based on the number of correctly spelled critical morpheme positions. These are letter groups in a word that require morphological rather than phonological or orthographical knowledge, for example vowel mutations. Because it is the most advanced strategy, it was only coded for the students in Grade 8. The total number of critical morpheme positions in the test was 28. For analyses, the number of errors in applying each strategy was used as an indicator for the specific writing and spelling strategies. Additionally, redundant elements (i.e., additional letters that indicate overgeneralization of alphabetic principles; for example, "ie" (Bield) instead of "i" (correct: Bild, in English: image, picture)) were coded for the students in Grade 1 and Grade 3, and case-related errors (i.e., errors in capitalization of the first letter in nouns and names) for the students in Grade 8, reflecting grade-appropriate expectations regarding the students' spelling. In the present data, higher scores indicated lower proficiency in applying specific spelling strategies, and more redundant elements and case-related errors in the students' writing samples.

#### TABLE 1 | Descriptive statistics for variables of interest.

#### Analytic Strategy

All analyses were conducted using the Bayesian approach with non-informative priors in Mplus 7.3 (Muthén and Muthén, 2014). The Bayes estimator compares an obtained value with a posterior probability distribution of predicted values (Kruschke, 2011). It is able to account for relatively small sample sizes and is robust to distributional assumptions of the estimated parameters of interest. Thus, it provides more trustworthy results than a traditional maximum likelihood estimator (Lee and Song, 2004; Muthén, 2010). In order to ensure model fit, we checked for convergence. Four chains were used in the Markov chain Monte Carlo estimation with a thinning option (20 draws) in order to control for autocorrelation. Good convergence is indicated when the potential scale reduction factor stays close to 1.00 as the model is run with increasing numbers of iterations (Muthén, 2010). This was the case for all models analyzed and reported here. Estimates were considered meaningful when the Bayesian credibility intervals (BCIs) of the posterior distribution for the estimates did not include zero. The BCIs can be interpreted similarly to confidence intervals in traditional maximum likelihood estimation: A 90%-BCI refers to a significance level of p < 0.10, and a 95%-BCI to a significance level of p < 0.05.
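The two checks described above, convergence via the potential scale reduction factor and a credibility interval that excludes zero, can be sketched in Python. This is an illustration under stated assumptions, not the Mplus computation: the formula is the textbook Gelman-Rubin form, and Mplus may calculate the factor differently in detail. The function names and the simulated chains are hypothetical and are not data from the study.

```python
import random
from math import sqrt
from statistics import mean, variance


def potential_scale_reduction(chains):
    """Textbook Gelman-Rubin factor for equally long chains of one parameter."""
    n = len(chains[0])
    if len(chains) < 2 or n < 2 or any(len(c) != n for c in chains):
        raise ValueError("need at least two chains of equal length >= 2")
    within = mean([variance(c) for c in chains])        # average within-chain variance
    between = n * variance([mean(c) for c in chains])   # scaled variance of chain means
    pooled = (n - 1) / n * within + between / n         # pooled posterior variance estimate
    return sqrt(pooled / within)


def credible_interval(draws, level=0.95):
    """Equal-tailed interval taken from the sorted posterior draws."""
    ordered = sorted(draws)
    tail = (1.0 - level) / 2.0
    lower = ordered[round(tail * (len(ordered) - 1))]
    upper = ordered[round((1.0 - tail) * (len(ordered) - 1))]
    return lower, upper


# Simulated draws for one regression coefficient (hypothetical values):
# four chains, as in the text, each sampling around 0.3.
rng = random.Random(0)
chains = [[rng.gauss(0.3, 0.1) for _ in range(500)] for _ in range(4)]

rhat = potential_scale_reduction(chains)
lower, upper = credible_interval([d for c in chains for d in c], level=0.95)
excludes_zero = lower > 0 or upper < 0
print(round(rhat, 3), excludes_zero)
```

Values of the factor close to 1.00 suggest that the chains agree with one another; an interval that excludes zero corresponds to what the text calls a meaningful estimate.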

The central goal of the present study was to test differences in the structural patterns of associations between two (i.e., between the boys and girls in Grade 3) or three subsamples (i.e., between the first, third, and eighth graders). Therefore, a multigroup approach with cross-group invariance testing was used. It allowed us to consider whether attention shifting contributed unique variance to different levels of spelling skills in each group separately. The advantage of cross-group invariance testing is that various structural parameters of interest (including means of predictors and outcomes, regression coefficients) in more than two subsamples can be tested against one another in one model. Compared to correlational analyses, this approach enables the investigation of relationships between multiple predictors and outcomes, accounts for possible co-variances between more than one indicator of interest, and considers several control variables that may potentially have a confounding effect on the relationship.

To examine cross-group invariance regarding age and sex in the prediction of orthographic competencies by the students' attention shifting (i.e., inverse efficiency score<sup>2</sup>), a three-step procedure was used. Following the literature on measurement invariance analyses with Bayes (cf. Van de Schoot et al., 2012; Muthén and Asparouhov, 2013), we started with a full invariant model that set all parameters equal across groups. Second, a full non-invariant model was tested that allowed variation in all parameters across groups. At the same time, difference tests for structural parameters of interest were included in order to identify meaningful differences between the groups. If a parameter was found to be meaningfully different, then it was set free to vary between the groups in a third, partial invariant model. The models were compared to each other in order to identify the best fitting model. For each set of analyses, the results of the best fitting model are reported in the Results section.

In more detail, the described procedure contained the following steps: In step 1, multiple regression models with configural equality constraints were estimated (i.e., full invariant models), holding all structural parameters of interest equal across the groups (i.e., regression coefficients, variable means). In step 2, non-invariant parameters were identified in models completely freeing the previously used equality constraints for the same parameters (i.e., full non-invariant models). For the analysis of cross-group invariance regarding age, difference tests between each group's structural parameter and its average across the three age groups (i.e., grade 1, 3, and 8) were used for identification. For the analysis of structural invariance regarding sex within age groups, non-invariant parameters were identified with difference tests between the two sex groups (i.e., girls and boys). In step 3, models holding all but the non-invariant parameters identified in step 2 equal across groups (i.e., partial invariant models) were estimated and compared to the full invariant models of step 1 and the respective full non-invariant models of step 2. For comparison, the Deviance Information Criterion (DIC; Gelman et al., 2004) was used, a Bayesian model comparison criterion that is defined analogously to the Akaike Information Criterion (AIC) generated in ML estimation. The DIC takes the complexity of the model into account, estimated as the effective number of parameters. Models with the smallest DIC values were preferred (Muthén, 2010). Thereby, the Bayes approach detects cross-group non-invariance similar to Wald statistics with maximum-likelihood estimation (Muthén and Asparouhov, 2013).
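The model comparison in step 3 can be sketched as follows. The function `dic` uses the commonly cited definition (mean posterior deviance plus the effective number of parameters, where the latter is the mean deviance minus the deviance at the posterior mean); whether Mplus applies exactly this form is an assumption here. The deviance draws passed to `dic` are made-up numbers, while the three DIC values in the comparison are those reported for the age-invariance models in the Results section.

```python
from statistics import mean


def dic(deviance_draws, deviance_at_posterior_mean):
    """Return (DIC, effective number of parameters) from posterior deviances."""
    mean_deviance = mean(deviance_draws)
    p_d = mean_deviance - deviance_at_posterior_mean  # complexity penalty
    return mean_deviance + p_d, p_d


# Toy illustration with made-up deviance values.
toy_dic, toy_pd = dic([10.0, 12.0, 14.0], deviance_at_posterior_mean=11.0)

# DIC values reported for the age-invariance models in the Results section.
reported_dic = {
    "full invariant": 1066.07,
    "full non-invariant": 1045.07,
    "partial invariant": 1043.88,
}
preferred = min(reported_dic, key=reported_dic.get)  # smallest DIC is preferred
print(toy_dic, toy_pd, preferred)
```

Because the penalty term grows with model complexity, a model with freed parameters is preferred only if the improvement in fit outweighs the parameters added.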

All of the analyses controlled for child age, sex (0 = girls, 1 = boys) and home language (0 = child speaks German as home language, 1 = child speaks another language than German as primary home language). All variables were standardized (i.e., age, attention shifting, orthographic competence indicators) or dummy coded (i.e., sex, home language) before being entered into the analyses.

## RESULTS

## Descriptive Statistics

Descriptive statistics indicated that there was considerable variability in school-aged children's attention shifting and orthographic competencies, both within each grade and between grades (**Table 1**). The data showed that older students' attention shifting skills were higher than those of younger students (indicated by the smaller inverse efficiency scores among eighth graders compared to third graders and first graders). A similar pattern was found for students' word-level spelling and their proficiency to apply specific spelling strategies to their writing: Older students scored higher on the spelling test than younger students.

## The Role of Attention Shifting for Orthographic Competencies: Testing Age Invariance across Grade Levels

The first research question (i.e., Are there differences in the relation between attention shifting and orthographic competencies across the age groups?) tested for cross-group invariance of the relation between attention shifting and spelling across grade levels. The first model was specified as a full invariant model and resulted in a DIC of 1066.07. Next, a full non-invariant model was run to identify meaningful differences between structural parameters of interest. The full non-invariant model resulted in a DIC of 1045.07. The differences in several structural parameters between the grade levels are reported in **Table 2**.

To further investigate these differences, a partial invariant model was run. The model released all parameters that were found to be different in the previous model but set all remaining parameters to be equal across grade levels. The DIC of the partial invariant model was lowest (DIC = 1043.88). Therefore, this model was preferred over the full non-invariant model. The partial invariant model revealed a meaningful relation across all grade levels (i.e., invariant regression coefficient) between the students' attention shifting and their proficiency in using the alphabetical strategy in their writing (**Figure 1**, bottom). For all of the students, better attention shifting (indicated by a lower inverse efficiency score) was related to fewer errors in alphabetical spelling. Furthermore, differences across grade levels were identified (as indicated by variant regression coefficients; **Table 3** and **Figure 1**). Meaningful relations emerged between the third-grade students' attention shifting and word-level spelling skills, and between attention shifting and the use of the orthographic spelling strategy. Higher levels of the third graders' attention shifting skills were associated with more correctly written words and graphemes, and fewer orthographic errors. For the eighth graders, attention shifting was related to general spelling skills, with higher attention shifting skills being associated with more correctly written graphemes. The detailed model parameters (regression coefficients, posterior standard deviation, Bayesian credibility intervals) are reported in **Table 3**.

<sup>2</sup>All models were also estimated using reaction time as a measure of attention shifting. Results were found to be similar when using reaction time data.

TABLE 2 | Results of difference tests for structural age invariance analyses in the full non-invariant model.

IE, Inverse Efficiency. Structural invariance between age groups was tested using Bayes estimation. Analyses controlled for child age, sex, and migration background. Bold numbers indicate meaningful unstandardized coefficients (posterior standard deviation) with a Bayesian Credibility Interval (BCI) excluding zero. The coefficients display the difference estimates between the respective structural parameter and its average across the three age groups (i.e., difference value = group parameter – parameter average).
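The logic of the difference tests summarized in Table 2 (difference value = group parameter minus parameter average, judged meaningful when the credibility interval excludes zero) can be illustrated with posterior draws. This is a hedged sketch: the draws are invented, the interval is a simple equal-tailed interval based on order statistics, and the function names are hypothetical. It is not the estimation routine used in the study.

```python
def credibility_interval(draws, level=0.95):
    """Equal-tailed interval from posterior draws via simple order statistics."""
    ordered = sorted(draws)
    last = len(ordered) - 1
    lower = ordered[int((1 - level) / 2 * last)]
    upper = ordered[int(round((1 + level) / 2 * last))]
    return lower, upper


def difference_from_average(group_draws, target):
    """Per-draw difference: target group's parameter minus the group average."""
    groups = list(group_draws)
    n_draws = len(group_draws[target])
    differences = []
    for i in range(n_draws):
        average = sum(group_draws[g][i] for g in groups) / len(groups)
        differences.append(group_draws[target][i] - average)
    return differences


def is_meaningful(draws, level=0.95):
    """True when the credibility interval excludes zero."""
    lower, upper = credibility_interval(draws, level)
    return lower > 0 or upper < 0


# Invented posterior draws of one regression coefficient per grade.
draws = {
    "grade1": [0.10, 0.12, 0.11, 0.09, 0.10],
    "grade3": [0.50, 0.52, 0.49, 0.51, 0.50],
    "grade8": [0.11, 0.10, 0.12, 0.10, 0.09],
}
grade3_difference = difference_from_average(draws, "grade3")
print(is_meaningful(grade3_difference))
```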

## The Role of Attention Shifting for Orthographic Competencies: Testing Sex Invariance within Each Grade Level

To answer the second research question (i.e., Does attention shifting relate to orthographic competencies equally for boys and girls within each age group?), separate sets of models were run to test structural sex invariance within Grade 1, Grade 3, and Grade 8. For the first-grade students, the full invariant model across sex resulted in a DIC of 1164.21. The corresponding full non-invariant model revealed a DIC of 1137.34, showing a meaningful difference for the regression coefficient between attention shifting and general spelling skills for boys and girls (i.e., number of correct graphemes; **Table 4**). The partial invariant model releasing the corresponding parameter showed a DIC of 1138.13. Thus, the full non-invariant model was preferred. The model indicated a meaningful relation between attention shifting and general spelling skills indicating that higher attention shifting was associated with fewer correctly written words and graphemes (**Table 5** and **Figure 2A**). However, the relation was only found for the boys in Grade 1 but not for the girls.

For the third-grade group, compared to the full non-invariant model (DIC = 2087.60) the full invariant model across sex showed a lower DIC (DIC = 2079.98). No meaningful differences were found in the difference tests of the full non-invariant model and, thus, the full invariant model was preferred (**Table 4**). Among third-grade students of both sexes, attention shifting showed meaningful relations with general spelling skills (i.e., number of correctly written words and graphemes) and with students' proficiency in the use of specific spelling strategies (**Table 6** and **Figure 2B**). Higher levels of attention shifting were related to higher general spelling skills and to fewer alphabetical and orthographic errors in students' writing samples.

With regard to the eighth-grade students, the full invariant model across sex yielded a DIC of 1931.79. The full non-invariant model showed a DIC of 1943.89. The full non-invariant model, however, indicated that the boys compared to the girls had higher word-level spelling skills (i.e., produced more correct words and graphemes in their writing sample) and made fewer case-related errors. In addition, the regression coefficients of attention shifting on the number of correct graphemes differed for the boys and the girls. Therefore, a partial invariant model was run that released the respective parameters. The model showed the lowest DIC (1928.56) and was therefore considered the preferred model. In the partial invariance model, the girls' but not the boys' attention shifting was meaningfully related to the number of correct graphemes. Thus, among the eighth-grade students, higher attention shifting skills were related to higher general spelling skills at the level of the grapheme, but only for the girls. At the same time, a meaningful relation between attention shifting and case-related errors was found for both the boys and the girls (see **Table 7** and **Figure 2C**). Higher attention shifting was related to higher proficiency in capitalization for both sexes.
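The model comparisons reported in the two Results subsections above follow one rule: the model with the smallest DIC is preferred. The sketch below tabulates the DIC values stated in the text and applies that rule. The dictionary layout and function name are illustrative assumptions. No partial invariant model is reported for Grade 3, so none is listed.

```python
# DIC values as reported in the text for each invariance analysis.
dic_values = {
    "age (Grades 1, 3, 8)": {
        "full invariant": 1066.07,
        "full non-invariant": 1045.07,
        "partial invariant": 1043.88,
    },
    "sex, Grade 1": {
        "full invariant": 1164.21,
        "full non-invariant": 1137.34,
        "partial invariant": 1138.13,
    },
    "sex, Grade 3": {
        "full invariant": 2079.98,
        "full non-invariant": 2087.60,
    },
    "sex, Grade 8": {
        "full invariant": 1931.79,
        "full non-invariant": 1943.89,
        "partial invariant": 1928.56,
    },
}


def preferred_model(models):
    """Return the name of the model with the smallest DIC."""
    return min(models, key=models.get)


preferred = {analysis: preferred_model(m) for analysis, m in dic_values.items()}
for analysis, model in preferred.items():
    print(f"{analysis}: {model}")
```

Applied to the reported values, the rule selects the same models that the text names as preferred in each analysis.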

## DISCUSSION

The present study examined associations between attention shifting, word-level spelling skills and specific spelling strategies in a group of first-, third-, and eighth-grade students. In general, attention shifting was related to spelling outcomes for all of the students. The associations were particularly strong among the third-grade students. In this age group, there were no sex differences in the relations between attention shifting and spelling outcomes. Among the first- and eighth-grade students, however, findings suggest sex differences in the relationship between attention shifting and general (i.e., word-level) spelling. While for the eighth-grade girls, higher attention shifting skills were related to higher general spelling skills, the opposite was true for the first-grade boys, i.e., higher attention shifting skills were related to lower general spelling skills. Together, the findings add to the literature by suggesting that the pattern of associations between attention shifting and various orthographic competencies differs across age groups and by sex.

## Age-Related Similarities and Differences in the Pattern of Associations

The current study expanded on previous research by providing initial evidence of age-related similarities and differences in the pattern of associations between attention shifting, one core component of executive functions, and spelling that depended on whether general (i.e., word-level) spelling skills or specific spelling strategies were examined. One obvious hypothesis is that shifting abilities should be equally important for word-level spelling across different stages of spelling development. This is because word-level spelling requires shifting between several mental tasks, including "listening to the dictation, writing words either by retrieving their orthographic form from memory or by applying phoneme-grapheme correspondence rules [...], and verifying their production" (Lubin et al., 2016, p. 453) that should not differ between beginning and proficient spellers. In our study, attention shifting was related to general spelling among the third-grade and the eighth-grade students (though the associations among the eighth graders were only at the level of the grapheme). However, unlike in the other age groups, we did not find that attention shifting was related to general spelling skills among the first-grade students.


Our findings may reflect specifics of spelling instruction in schools. In Germany, instructional emphasis in the early elementary grades is on phonemic spelling with teachers predominantly using words that have consistent one-to-one grapheme-phoneme-correspondence (Valtin, 1997). This results in a high probability of correct spelling. Thus, first graders' spelling might not draw heavily on attention shifting. The relative contribution of attention shifting to spelling, however, might change when students enter Grade 3 and are expected to apply spelling rules in order to master the spelling of unfamiliar words that contain inconsistencies between sound and orthographic patterns (Valtin, 1997; Moll et al., 2009). That is the time when individual differences in spelling become more prominent as students wrongly apply specific orthographic regularities where they are not needed (Valtin, 1997; Moll et al., 2009). Our findings suggest that attention shifting skills might provide a potential explanation for individual differences in spelling among older students. This assumption is supported by previous work reporting that shifting abilities but not working memory and inhibition accounted for variance in fourth graders' spelling skills (Lubin et al., 2016). Although our study is among the first to investigate the relation between attention shifting and spelling outcomes at different stages of spelling acquisition, longitudinal research following children from middle childhood into adolescence is needed to better understand the (possibly changing) role that attention shifting plays in word-level spelling.

With regard to specific spelling strategies, attention shifting was related to the alphabetical strategy for all of the students across grades. That is, independent of the students' developmental level of spelling proficiency, faster but accurate performance on the DCCS was associated with fewer errors in applying the alphabetic principle to one's writing. A possible explanation could be that shifting abilities influence how spelling-relevant information is processed (Buchholz and Davies, 2005). For both beginning and proficient spellers, the ability to understand and apply the alphabetic principle has been linked to phonological processing (Dich and Cohn, 2013; Moll et al., 2014; Yeong et al., 2014). Deficits in attention were associated with impairments in phonological processing skills (Facoetti et al., 2010). Across different stages of spelling acquisition, shifting abilities may be important for the alphabetic spelling strategy because of the relationship with phonological processing skills. Future research should thus include this construct when studying associations between attention shifting and spelling skills.

An unexpected finding was that attention shifting was related to the orthographic strategy only among the third-grade students. In the German orthography, most spelling errors are caused by orthographic deficits (Moll et al., 2009). That is, "phoneme-grapheme conversion results in phonologically adequate but orthographically incorrect spellings" (Moll et al., 2009, p. 4). Consequently, attention shifting should be equally important for orthographic processing at all stages of spelling acquisition due to its importance for one's ability to differentiate between various representations of letter-sound correspondences. However, our findings could point to age-related changes in the association between attention shifting and orthographic spelling. Shifting abilities undergo rapid developmental changes from middle childhood to early adolescence (Anderson, 2010). During the same period, children learn to apply orthographic knowledge to their spelling (e.g., Cunningham et al., 2002; Sprenger-Charolles et al., 2003; Roman et al., 2009; Yeong et al., 2014). Shifting abilities may be particularly relevant for orthographic skills during the initial phase of building up proficiency in orthographic processing and less relevant later in development. As discussed above, from the beginning of Grade 3 students are increasingly exposed to inconsistencies of the German spelling system while they still lack adequate orthographic knowledge to cope with these inconsistencies (Moll et al., 2009). One possible explanation for our finding is that at this stage of spelling acquisition students with low attention shifting may have difficulties in building up orthographic proficiency. However, the directionality of the examined associations could not be fully identified due to the cross-sectional nature of the data. Alternatively, attention shifting may be indirectly related to orthographic skills through word-specific knowledge. Moll and colleagues (Moll et al., 2009, 2014) argue that the capacity of one's orthographic lexicon is an important predictor of orthographically correct spellings. Attention shifting may be relevant for more specific mechanisms underlying orthographic spelling. For example, correct spelling requires recall activity that is related to shifting abilities because students need to switch between different levels of analyzing words (Aram et al., 2014; Lubin et al., 2016). Further research is needed to investigate the mechanisms that may explain age-related differences in the relation between attention shifting and specific spelling strategies.

TABLE 3 | Favored structural partial invariant model regarding age for prediction of orthographic competences.

IE, Inverse Efficiency. Structural invariance between sex groups was tested using Bayes estimation. Analyses controlled for child age and migration background. Bold numbers indicate meaningful unstandardized coefficients (posterior standard deviation) with a Bayesian Credibility Interval (BCI) excluding zero. The coefficients display the difference estimates between girls' and boys' respective parameters (i.e., difference value = girls' parameter – boys' parameter).

## Sex Differences in the Associations between Attention Shifting and Spelling Outcomes

We found that the relation between attention shifting and general (i.e., word-level) spelling skills differed for boys and girls but only among first and eighth graders. In contrast, no sex differences were found in the association between attention shifting and specific spelling strategies. Among first-grade boys only, slower and less accurate performance on the DCCS was associated with higher spelling skills at the word level (i.e., more correctly written words and graphemes). Our data do not explain why this was the case. Children with lower shifting abilities may benefit from the emphasis on words with consistent one-to-one grapheme-phoneme-correspondence which might not draw heavily on attention shifting. The emphasis on one-to-one grapheme-phoneme-correspondence is typical of spelling instruction in the early elementary years in Germany (Valtin, 1997). The boys with lower shifting abilities in our first-grade group could thus still be able to produce correct spellings because errors in the words' phoneme-grapheme conversion are less likely. In the eighth-grade sample, higher attention shifting skills were related to the girls' (but not boys') higher general spelling skills. These findings speak to the well-documented achievement advantage for girls in secondary school (Bos et al., 2007; Quenzel and Hurrelmann, 2010).

The present findings could reflect sex differences in the strategies that boys and girls apply when directing their attention (Sobeh and Spijkers, 2012, 2013). There is initial evidence that boys perform faster in attention shifting tasks whereas girls demonstrate better accuracy (Sobeh and Spijkers, 2012, 2013). Influenced by biological factors, "accuracy of performance seems to develop earlier than the speed of performance" (Sobeh and Spijkers, 2013, p. 332). Sex differences in developmental trajectories of attention strategies may give girls an advantage to apply their shifting skills in a way that benefits their spelling, whereas for boys this might not be the case. It may even adversely affect their spelling, in particular, during the early elementary years. However, more research is needed to disentangle possible age-related changes in the mechanisms underlying sex differences in attention shifting and its relations to achievement outcomes, such as spelling.

An alternative explanation for the detected sex differences in the associations could be a measurement artifact. In the present study, spelling was measured using a conventional paper-and-pencil spelling test. Results of a study that compared spelling performance on paper-and-pencil tests and computerized tests suggest that boys perform better on computerized tests (Horne, 2007). Horne (2007) argued that using a computer enhances boys' motivation to engage with the test, which results in more accurate performance. In addition, the sex differences could be due to differences in children's handwriting abilities, which could not be tested in this study. Thus, our spelling measure may have underestimated boys' spelling level, which might have resulted in the negative relation between attention shifting and spelling for the first graders.

#### Practical Implications

The present results have several implications for educational practice. First, students' spelling proficiency may be improved by enhancing teachers' awareness of the importance of attention shifting for spelling skills. Several studies have shown that teachers' pedagogical knowledge influences classroom practices and the quality of instruction, which in turn has an effect on students' learning and performance (e.g., Metzler and Woessmann, 2012; Kunter et al., 2013; König and Pflanzl, 2016). Second, efforts to improve students' spelling might benefit from a focus on attention shifting. Intervention studies of school-based programs reported improvements in students' executive functions, with particularly strong effects on children with executive function difficulties (e.g., Diamond et al., 2007; Flook et al., 2010; Diamond and Lee, 2011). Reading-specific flexibility exercises (focusing on shifting attention between phonological and semantic dimensions) that were completed with students as part of regular classroom activities have been shown to improve elementary students' reading (Cartwright, 2006). Such programs might be particularly relevant for third-grade students, who have to master the transition from phonological to orthographic spelling. Thus, providing a learning environment with ample opportunities to learn and practice executive function skills may facilitate students' spelling acquisition. Finally, many educational tests use general (i.e., word-level) spelling scores to classify students into good and poor spellers. As a consequence, spelling instruction has centered around the spelling of words. However, students might benefit from a focus on various specific spelling strategies after they have acquired foundational knowledge of letter-sound correspondences (Keunig and Verhoeven, 2008).

TABLE 5 | Favored structural full non-invariant model regarding sex for prediction of orthographic competences in grade 1.

FIGURE 2 | (A) Favored structural full non-invariant model regarding gender for prediction of orthographic competences in grade 1. (B) Favored structural full invariant model regarding gender for prediction of orthographic competences in grade 3. (C) Favored structural partial invariant model regarding gender for prediction of orthographic competences in grade 8. (A–C) Structural gender invariance was tested using Bayes estimation. Analyses control for child age, gender, and migration background. Solid lines indicate meaningful relations (standardized coefficients), each with a Bayesian Credibility Interval excluding zero (see Tables 1–4). Gray boxes highlight gender invariant relations.

[...] a Bayesian Credibility Interval excluding zero. f<sup>2</sup> represents the effect size for attention shifting on the respective outcome, with f<sup>2</sup> = 0.02 rated as small, f<sup>2</sup> = 0.15 as medium, and f<sup>2</sup> = 0.35 as strong effect.
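The effect size benchmarks given in the table note (0.02 small, 0.15 medium, 0.35 strong) can be applied mechanically, as in the sketch below. The function name and the label "negligible" for values below 0.02 are assumptions added for illustration; the note itself defines only the three named benchmarks.

```python
def classify_effect_size(value):
    """Label an effect size using the benchmarks stated in the table note."""
    if value >= 0.35:
        return "strong"
    if value >= 0.15:
        return "medium"
    if value >= 0.02:
        return "small"
    return "negligible"  # assumed label; not defined in the table note


# Hypothetical effect sizes, for illustration only.
for value in (0.01, 0.02, 0.20, 0.35):
    print(value, classify_effect_size(value))
```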

#### Limitations and Future Directions

Although our study addresses several limitations of prior work by including a wide age range and using age-appropriate versions of the same tasks to measure attention shifting and spelling outcomes, some caveats should be noted when interpreting the results. Our results suggested age-related differences in the associations between attention shifting and spelling outcomes. Unobserved variables such as intelligence and socioeconomic status may have accounted for the associations between attention shifting and spelling outcomes. Limitations in the available data did not allow us to control for these variables in the present analyses. However, previous research suggests that executive functions predict academic outcomes above and beyond intelligence (Duckworth and Seligman, 2005; Lubin et al., 2016) and socioeconomic status (Moffitt et al., 2011).

A second limitation is that the cross-sectional design of our study did not allow us to follow individual students over time. Thus, the differences between age groups may be due to student characteristics specific to each age group that could not be controlled in the present analyses. Future research should use a longitudinal design to investigate developmental trajectories of the relation between attention shifting and spelling which could also give insights into possible bidirectional associations of these two developing skills from childhood to adolescence. There is emerging evidence of simultaneous growth and reciprocal relations between executive functions and literacy skills during the early childhood years (Bohlmann et al., 2015; Slot and von Suchodoletz, submitted), but research from middle childhood into adolescence is still scarce.

Additional limitations concern the small sample sizes for each age group, in particular, when analyzing sex differences within each sample. Further studies are needed to confirm the findings with a larger sample. Another limitation is the lack of correspondence between the characteristics captured by the measures used to assess spelling and attention shifting. While the latter was assessed with a process-related measure (inverse efficiency score produced by percentage of correct trials and their reaction times of the DCCS), we had no information on, for example, writing speed and error handling during the process of writing. A better congruence between measures should be a focus in future research. Finally, to get a more accurate picture of the relative contribution of attention shifting to academic outcomes, it would be beneficial to include other core executive functions (e.g., working memory and inhibition) as well as further outcome variables (e.g., reading skills).

## CONCLUSION

Together with previous research, the present cross-sectional findings emphasize the important role of attention shifting, one core component of executive functions, for German students' spelling skills in middle childhood and early adolescence. Efforts aimed at improving shifting abilities may help students to reach grade-level spelling proficiency. The findings are relevant for teacher education and professional development as they emphasize the necessity to enable teachers to tailor instruction to reinforce both students' academic skills and their executive functions in order to improve school achievement. Finally, the study goes beyond previous research by providing an age- and sex-specific approach to the relation between attention shifting and spelling. Similarities and differences in the pattern of associations were identified that depend on students' age, sex, and the specific spelling skill measured, thus identifying possible developmental processes that should be examined by future research.

## AUTHOR CONTRIBUTIONS

AvS conceptualized and designed the study. The data collection was conducted by AvS and IS. AF performed the statistical analyses. AvS and AF composed the paper. IS contributed to the writing of the manuscript.

#### FUNDING

The research reported here was supported by Grant SU 696/1-1 from the German Research Foundation (DFG) to AvS and by grants given to the Research Group "The Empirics of Education: Economic and Behavioral Perspectives" in the context of the German Initiative of Excellence at the University of Freiburg, Germany.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 von Suchodoletz, Fäsche and Skuballa. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

## Executive Function Mediates the Relations between Parental Behaviors and Children's Early Academic Ability

#### Rory T. Devine\*, Giacomo Bignardi and Claire Hughes

Centre for Family Research, Department of Psychology, University of Cambridge, Cambridge, UK

The past decade has witnessed a growth of interest in parental influences on individual differences in children's executive function (EF) on the one hand and in the academic consequences of variation in children's EF on the other hand. The primary aim of this longitudinal study was to examine whether children's EF mediated the relation between three distinct aspects of parental behavior (i.e., parental scaffolding, negative parent-child interactions, and the provision of informal learning opportunities) and children's academic ability (as measured by standard tests of literacy and numeracy skills). Data were collected from 117 parent-child dyads (60 boys) at two time points ∼1 year apart (M Age at Time 1 = 3.94 years, SD = 0.53; M Age at Time 2 = 5.11 years, SD = 0.54). At both time points children completed a battery of tasks designed to measure general cognitive ability (e.g., non-verbal reasoning) and EF (e.g., inhibition, cognitive flexibility, working memory). Our models revealed that children's EF (but not general cognitive ability) mediated the relations between parental scaffolding and negative parent-child interactions and children's early academic ability. In contrast, parental provision of opportunities for learning in the home environment was directly related to children's academic abilities. These results suggest that parental scaffolding and negative parent-child interactions influence children's academic ability by shaping children's emerging EF.

Keywords: executive function, academic ability, parenting, scaffolding, longitudinal study

## INTRODUCTION

Meta-analytic evidence from longitudinal research demonstrates that early academic abilities, such as a rudimentary understanding of mathematics and basic literacy, provide an important foundation for later academic achievement (e.g., Duncan et al., 2007). Attempts to understand the sources of individual differences in these foundational abilities have generated a substantial body of developmental research such that extensive data is now available on the relations between early language skills, general intelligence, and rudimentary academic skills (e.g., La Paro and Pianta, 2000; Roth et al., 2015). In parallel, recent decades have seen a growth of interest in how children's early academic abilities relate to parental behaviors, on the one hand, and children's emerging executive functions (EF–the suite of cognitive processes involved in the control of thoughts and actions) (Blair and Raver, 2015) on the other hand. Integrating these twin strands of research, the present study sought to examine whether variation in children's EF might play a mediating role in the association between preschool parent-child interactions and early academic ability.

#### Edited by:

Mariette Huizinga, VU University Amsterdam, Netherlands

#### Reviewed by:

Annie Bernier, Université de Montréal, Canada Loren Vandenbroucke, KU Leuven, Belgium Hilde Colpin, KU Leuven, Belgium

> \*Correspondence: Rory T. Devine rtd24@cam.ac.uk

#### Specialty section:

This article was submitted to Developmental Psychology, a section of the journal Frontiers in Psychology

Received: 09 September 2016 Accepted: 21 November 2016 Published: 15 December 2016

#### Citation:

Devine RT, Bignardi G and Hughes C (2016) Executive Function Mediates the Relations between Parental Behaviors and Children's Early Academic Ability. Front. Psychol. 7:1902. doi: 10.3389/fpsyg.2016.01902


## PARENTAL INFLUENCES ON CHILDREN'S ACADEMIC ABILITY

Variation in children's early academic ability is linked to both domain-general parental influences (e.g., the emotional quality and level of cognitive support that parents provide) and domain-specific parental influences (e.g., activities targeted at literacy and numeracy) (e.g., Kluczniok et al., 2013). Perhaps one of the most widely-studied of these different parental influences on children's academic ability has been the home learning environment (HLE), a term that refers to the extent to which resources and informal learning opportunities are available in the home. Children's HLEs are often studied using interviews and observer ratings of the home environment, such as the Home Observation for Measurement of the Environment (HOME) (Bradley et al., 2003; Totsika and Sylva, 2004) and, more recently, self-report questionnaires (Melhuish et al., 2008). Pioneering longitudinal studies demonstrated that aspects of the HLE (such as the provision of structured activities) were positively related to cognitive development in the early years (Bradley et al., 1979). Follow-up studies revealed that children's HLE at age 2 predicted children's academic performance in reading and languages at age 10 (Bradley et al., 1988). Subsequent studies have demonstrated that the HLE is positively related to children's language skills (e.g., Son and Morrison, 2010), literacy (e.g., Hindman and Morrison, 2012), and adjustment in the classroom (e.g., Lamb Parker et al., 1999). Longitudinal evidence also shows that children's HLE in the preschool years is positively correlated with mathematics ability in early and middle childhood (Melhuish et al., 2008).

Alongside the availability of informal learning opportunities in the home, researchers have examined how the quality of parent-child interactions, through specific attempts to provide cognitive support, might foster children's academic ability. In seminal work that applied socio-cultural theories of cognitive development (Vygotsky, 1978) to understand the contribution of parental tutoring practices, Wood et al. (1976) argued that parents (or other skilled adults or peers) who tailor their support can "scaffold" children's ability to solve problems independently (Wood and Wood, 1996). The most effective way to do this was through use of the "contingency rule" (Wood and Wood, 1996). That is, when children struggle to complete a task parents should increase the level of support they provide and when children succeed parents should decrease the level of support they provide (Wood and Wood, 1996). Parents' use of the contingency rule is typically measured through detailed observations of sequences of task-related behavior during parent-child interactions (e.g., Meins, 1997; Carr and Pike, 2012). Since the late 1980s, studies of the correlates and consequences of variation in parental use of the contingency rule have shown associations with children's success both on the shared task (Pratt et al., 1988) and on related tasks completed independently (Conner et al., 1997). Crucially, the effects of parents' use of the contingency rule appear to extend beyond the immediate task context. Cross-sectional studies show that parental use of the contingency rule is related, in early childhood, to children's observed persistence, self-control and help-seeking behavior in the classroom (Neitzel and Stright, 2003) and, in middle childhood, to children's mathematics performance (Pratt et al., 1992) and teacher-rated academic competence (Mattanah et al., 2005). These findings suggest that parental use of the contingency rule during problem-solving tasks might benefit children's academic ability. However, longitudinal relations with measures of academic ability in early childhood have yet to be examined.

Alongside parents' cognitive support, global measures of the affective quality (e.g., warmth, positivity, responsiveness) of parent-child interactions appear positively related to: (i) preschool children's early academic skills (as measured by tests of language ability and parent-rated school-readiness) (Leerkes et al., 2011); (ii) literacy, mathematics and teacher-rated academic competence in middle childhood (e.g., NICHD Early Child Care Research Network, 2008); and (iii) academic achievement in adolescence (Jimerson et al., 2000). Conversely, negative parent-child interactions characterized by harshness, negative control, and negative affect are associated with teacher-ratings of poor academic adjustment (e.g., Pettit et al., 1997; Culp et al., 2000) and poor performance on standard tests of achievement in middle childhood (Harold et al., 2007).

As outlined above, there is good evidence that individual differences in children's academic abilities are associated with a variety of measures of the family environment including the quantity and quality of cognitive support on the one hand and the affective quality of interactions on the other. What is not yet understood, however, is what mechanisms underpin these associations. At least three different pathways between these distinct aspects of parental behavior and variation in children's early academic ability deserve note. First, the HLE might be related to early academic ability for the simple reason that frequent exposure to basic literacy and numeracy activities provides children with opportunities to practice in these domains (e.g., Kluczniok et al., 2013). Second, with regard to the relations between parental use of the contingency rule and children's academic ability, it is conceivable that parents who provide appropriate support continually challenge their children's nascent cognitive abilities. Third, turning to the affective quality of parent-child interactions, Blair and Raver (2015) have proposed a psychobiological framework that emphasizes the interplay between stress, early cognition, and academic ability. According to this account stress physiology mediates the impact of early stressful experiences (such as negative parent-child interactions) on cognitive development and early academic ability (Blair et al., 2011). While these three pathways may each exert a specific influence on distinct aspects of children's developing cognition, they are not mutually exclusive and may operate in concert. Indeed, existing studies either aggregate these different aspects of parental behavior or focus on a single measure of parental behavior. One drawback of this approach is that both the intervening processes and the relative salience of each of these measures in predicting children's academic abilities remain poorly understood. 
Addressing these gaps, a key goal of the present study was to elucidate the mechanisms by which parental behaviors relate to individual differences in children's academic ability.

## EXECUTIVE FUNCTION, ACADEMIC ABILITY, AND PARENTAL INFLUENCES

One way to understand better the relations between parental behaviors and children's academic ability is to extend the focus of research onto other, more fundamental, cognitive abilities that are related to both children's academic ability and parental behavior. Research interest in the relations between children's executive function (EF) and academic abilities has grown dramatically in recent years (e.g., Blair and Raver, 2015; Ursache et al., 2012). EF encompasses skills such as overriding entrenched habitual responses (or "inhibition"), updating information held in mind (or "working memory") and switching between tasks (or "cognitive flexibility") (e.g., Diamond, 2013). In adolescence and adulthood, studies of the structure of EF support a "unity and diversity" model. That is, each aspect of EF is comprised of variance that is specific to that component of EF and variance that overlaps with other aspects of EF producing distinct but correlated factors representing inhibition, working memory and flexibility (Miyake and Friedman, 2012). In early childhood however, EF studies support a "unity" model in which a single latent EF factor explains individual differences in performance across a diverse range of tasks (e.g., Wiebe et al., 2008).

A substantial body of evidence shows that there are significant associations between diverse measures of EF and objective tests of mathematics and literacy in early and middle childhood (e.g., Willoughby et al., 2012). EF makes a unique contribution to academic ability above and beyond language ability or general cognitive ability indicating that correlations between EF and academic ability cannot be explained by these factors (e.g., Espy et al., 2004; Blair and Razza, 2007). Moreover, longitudinal studies also demonstrate that EF in early childhood predicts later academic ability even when earlier measures of academic performance are taken into account suggesting that EF is linked to gains in academic ability (e.g., Clark et al., 2013; Fuhs et al., 2014; Nesbitt et al., 2015). Underscoring this point, intervention studies indicate that gains in EF result in improved academic abilities, suggesting causal relations between these variables in early childhood (e.g., Raver et al., 2011).

Mirroring the growing interest in parental influences on children's academic ability, researchers have also begun to elucidate the ways in which early family experiences shape children's EF (e.g., Müller et al., 2013). Just as academic ability has been linked to the quality and quantity of cognitive and emotional support that parents provide, the development of EF has also been studied in relation to a range of parental behaviors. Factors such as household routines and chaotic family environments show concurrent and longitudinal negative associations with EF in early childhood (e.g., Hughes and Ensor, 2009; Vernon-Feagans et al., 2016). Early literacy and numeracy activities place considerable demands on children's EF (Blair and Raver, 2015). For example, reading activities require children to shift their attention between phonemes and whole words (Blair and Raver, 2015). It is therefore conceivable that through frequent exposure to informal literacy and numeracy activities in the home, the HLE might be correlated with individual differences in EF.

At the level of parent-child interactions, cognitive aspects of parent-child interactions such as parental verbal scaffolding during problem-solving tasks in early childhood show both concurrent and longitudinal associations with EF in early childhood (Hughes and Ensor, 2009; Bernier et al., 2010; Hammond et al., 2012). There is also evidence that the affective quality of parent-child interactions in early childhood is related to children's EF. There are moderate concurrent and longitudinal associations between maternal depression and variation in children's EF in early childhood indicating that exposure to negative parental affect may adversely affect children's early cognitive development (e.g., Blair et al., 2011; Hughes et al., 2013). Crucially, both cognitive and affective dimensions of parental behavior show unique associations with EF that are independent of children's language ability or, in the case of longitudinal studies, children's earlier performance on measures of EF (e.g., Hughes and Ensor, 2009).

## DOES EF MEDIATE THE RELATION BETWEEN PARENTAL BEHAVIOR AND ACADEMIC ABILITY?

One interpretation of the common associations between parental behavior and both EF and children's academic ability is that the quantity and quality of parental cognitive support and/or the affective quality of parent-child interactions could foster cognitive development in a range of domains (e.g., EF, early literacy and math ability). According to this Domain General Model, high levels of parental cognitive support and low levels of negative parent-child interactions might combine to exert a general influence on children's cognitive development (see **Figure 1A**). Alternatively, according to the Domain Specific Model, different aspects of parental behavior may show specific associations with distinct aspects of children's cognitive abilities. For example, the HLE might show direct associations with children's academic ability while negative parent-child interactions might show unique associations with children's EF (see **Figure 1B**). Another possibility is that child EF may play a mediating role in the associations between different dimensions of parental behavior and children's academic ability (see **Figure 1C**). Indirect support for this Mediation Model comes from two reports based on data from the NICHD Study of Early Child Care that have measured constructs that are closely related to core domains of EF.

First, children's sustained attention and impulsivity at age 4.5 years partially mediated the relation between parenting quality (as measured by a composite index of physical and social resources in the home, observer ratings of parental sensitivity and cognitive stimulation) at 4.5 years and children's academic achievement (as measured by performance on standardized reading and mathematics tests) at age 6 (NICHD Early Child Care Research Network, 2003). Second, in the same sample, children's performance on a test of planning ability (considered to assess multiple aspects of EF including inhibition, working memory and flexibility; Russell, 1996) at ages 6 and 8 mediated the relations between parenting quality at 4.5 years and children's later academic ability at 8 and 10 years (Friedman et al., 2014). Alongside these results, Fitzpatrick et al. (2014) found that more traditional measures of EF partially mediated the relation between socio-economic status (SES) and children's academic ability in a sample of children aged between 3 and 5 years of age. Together these findings suggest that aspects of children's home environments might encourage the development of EF which in turn enhances children's early academic abilities. That said, the available evidence does not specify which aspects of parental behavior (i.e., cognitive or affective) matter most for academic achievement. Moreover, it is unclear from existing work whether EF in particular (rather than general cognitive ability) accounts for the relations between parental behavior and children's academic ability.

## SUMMARY OF AIMS

The present longitudinal study had two primary aims. First, we sought to examine the independence and overlap in the relations between measures of parental behavior (i.e., the home learning environment, negative parent-child interactions, and parental scaffolding) and children's early academic ability. Our second aim was to examine the relations between parental behavior, children's EF and academic ability by testing the direct and indirect relations between these constructs (as shown in **Figure 1**). In each of our analyses we sought to examine the unique effects of parental behaviors on children's academic ability by controlling for individual differences in known correlates of academic ability such as early measures of verbal ability, general cognitive ability, and parental education.

## METHODS

## Participants

Parents and children were recruited from nurseries, shopping centers and playgroups in the East of England. To be included in the study children had to be aged either 3 or 4 years old, be native English speakers and have no reported history of developmental delay. One hundred twenty parent-child dyads took part in the first wave of laboratory visits (Time 1). Of this group 117 dyads (60 boys) agreed to be contacted for a follow-up study. Although socio-economically homogeneous (81% of parents had completed an undergraduate degree), the sample was ethnically diverse (66% White British). All 117 of these families were contacted at the second wave of visits. Two families were no longer eligible to participate as they had left the country. Of the eligible 115 families, 103 (90%) completed the second visit (Time 2) approximately 13 months later, SD = 1.65 months, range: 11–17 months. The average age of children was 3.94 years, SD = 0.53, range: 3–4.95 years, at Time 1 and 5.11 years, SD = 0.54, range: 4–6.10 years, at Time 2. Binary logistic regression revealed that although non-returners did not differ from those who returned for the second visit in age, gender, or general cognitive ability (as measured by the Object Assembly task), non-returners were marginally more likely to have low levels of parental education, OR = 3.05, B = 1.12, SE = 0.64, p = 0.08.

## Procedures

All procedures were approved by the local University Research Ethics Committee. Parents and children were invited to participate in two laboratory visits lasting up to 75 min in length (including time for information and consent, rest breaks and debriefing) approximately 1 year apart. Following written parental consent, children completed a battery of tasks designed to measure EF, general cognitive ability and early academic ability. Individual child testing lasted approximately 30 min. The children completed the task battery in a fixed counterbalanced format such that no two tasks from any domain were completed alongside one another. Children were provided with rest breaks and rewarded with stickers for the completion of each task. Parents completed a short questionnaire booklet in an adjoining room while children completed the task battery. Upon completion of cognitive testing, parents were observed interacting with their child during 5 min of structured play with a set of jigsaw puzzles. At the end of each session parents were debriefed and provided with £15 and a small gift for their child.

#### Measures

#### Early Academic Ability

Children completed two subtests from the Wechsler Individual Achievement Test (WIAT-II-UK) (Rust and Golombok, 2005) at Time 2 to provide a measure of early academic ability. The Word Reading subtest was designed to measure a range of early reading skills including phonological awareness, letter–sound awareness, and letter reading skills. The Mathematics Reasoning subtest was designed to measure children's ability to count, identify numbers and shapes and solve simple mathematical problems. For both tasks the items were presented on a color flipbook and 1 point was awarded for each correctly answered question. Children completed up to 47 items on the Word Reading subtest and up to 35 items on the Mathematics Reasoning subtest. Scores on the two WIAT-II-UK subtests were strongly correlated, r(101) = 0.73, p < 0.001, and so were standardized and averaged to create a single "Academic Ability" variable (α = 0.85).
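The composite described above can be sketched in a few lines. This is a minimal illustration and not the authors' scoring script: the function names and the example scores are hypothetical, and the sketch assumes that standardization used the sample standard deviation (n − 1), which the text does not specify.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardize raw scores to mean 0 and SD 1 (sample SD, n - 1)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def composite(reading, maths):
    """Average the two standardized subtest scores for each child."""
    return [(r + m) / 2 for r, m in zip(z_scores(reading), z_scores(maths))]


# Hypothetical raw scores for four children (not data from the study).
word_reading = [10, 20, 30, 40]
maths_reasoning = [5, 10, 15, 20]
academic_ability = composite(word_reading, maths_reasoning)
```

Because each subtest is standardized before averaging, the 47-item and 35-item subtests contribute equally to the composite despite their different raw score ranges.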

#### Executive Function

Children completed a short battery of tasks designed to measure EF at Time 1 and 2. To index conflict inhibition the children completed the Happy/Sad Task (Lagattuta et al., 2011) at both time points. In this task children were shown two cards depicting either a yellow "happy face" or a yellow "sad face." First the children were asked to point to the happy face and then to the sad face. Following this the experimenter told the children that they would play a "silly game" so that when the experimenter said "happy" the child had to point to the sad face and when the experimenter said "sad" the child had to point to the happy face. The children received 4 training trials with feedback from the examiner. If the child made an error on one of these training trials, up to two further sets of 4 training trials were provided and the rules were re-stated. If children failed these training trials they were assigned a score of 0 and testing was discontinued. Children completed 20 test trials in a fixed order with no feedback. The total number of correct items was summed together.

Children also completed the Dimension Change Card Sort (DCCS) Task (Zelazo, 2006) at both time points. This task was designed to measure children's ability to switch between rules and was administered according to Zelazo's (2006) protocol. The children completed the pre-switch and post-switch phases at Time 1 and 2 and the border game at Time 2 only. In each phase the children were shown two laminated cards (one depicting a blue rabbit and the other depicting a red boat) attached to two sorting boxes and were required to sort six cards depicting either a blue boat or red rabbit. Following a demonstration of how the cards should be sorted the children completed either six trials of the "color game" or six trials of the "shape game" (counter-balanced across participants). In the color game the children had to place up to three cards depicting the red rabbit next to the target card showing the red boat and up to three cards showing the blue boat next to the target card showing the blue rabbit. In the shape game the children had to place the sorting cards depicting the red rabbit next to the blue rabbit target card and the cards depicting the blue boat next to the red boat target card. This first game served as the pre-switch phase. All children passed the pre-switch phase (i.e., sorted 5 or more cards correctly). Following the pre-switch phase, the children playing the color game proceeded to the shape game and vice versa. Before this post-switch phase the children were told that the rules had changed (and the new rule was repeated before each sort). Children were awarded 1 point for each correctly sorted card in the post-switch phase. At Time 2 those children who passed the post-switch phase (i.e., sorted 5 or more cards correctly) proceeded to the border game. In the border game the children completed a further 12 sorting trials using a third set of cards containing 6 normal sorting cards and 6 cards with a thick black border. Cards without a border were sorted according to one rule (e.g., shape game) and cards with a black border were sorted according to another rule (e.g., color game). Children were awarded 1 point for each correctly sorted card. Children who failed the post-switch phase were scored 0 on the border game.
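The DCCS scoring rules described above can be expressed as a short sketch. The function and constant names are hypothetical and chosen only for illustration. The sketch assumes that each trial is recorded as a boolean (correct or incorrect).

```python
PASS_THRESHOLD = 5  # "sorted 5 or more cards correctly" out of 6 trials


def post_switch_score(trials):
    """One point for each correctly sorted card in the post-switch phase."""
    return sum(1 for correct in trials if correct)


def border_game_score(post_switch, border_trials):
    """Score the 12-trial border game (administered at Time 2 only).

    Children who failed the post-switch phase did not proceed to the
    border game and are scored 0.
    """
    if post_switch < PASS_THRESHOLD:
        return 0
    return sum(1 for correct in border_trials if correct)


# Hypothetical child: 5 of 6 post-switch cards correct, 9 of 12 border trials correct.
post = post_switch_score([True, True, True, False, True, True])
border = border_game_score(post, [True] * 9 + [False] * 3)
```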

To measure working memory the children completed the Self-Ordered Pointing Task (SOPT) (Cragg and Nation, 2007) at both time points. In this task the children were shown a color flipbook depicting an increasing number of colored pictures of single syllable concrete objects (ranging from 2 objects to 7 objects with two sets in each number) in one of 16 locations on the page. In each set care was taken to ensure that no two objects were taken from the same class of objects (e.g., fruits, toys, pets). Children were asked to point to a new picture on each page and were told that they could not select the same picture twice. For example the first page depicted two images (e.g., bowl, flag) and the second page depicted the same images but in different spatial locations. The number of repetition errors (i.e., repeated points to the same picture) were recorded and used as an index of working memory. These error scores were reflected to be consistent with the other EF measures.

At Time 2 the children also completed the Day/Night Task (Gerstadt et al., 1994) to measure conflict inhibition. This task was not completed at Time 1 because during pilot testing we found that the youngest children in our sample became too fatigued. The Day/Night task followed the same general procedure as the Happy/Sad task but instead of cards depicting happy and sad faces, the experimenter presented the children with two laminated cards depicting either the sun or the moon. Children were required to point to the picture of the sun when the experimenter said "night" and to the picture of the moon when the experimenter said "day." The children completed 20 trials and were awarded 1 point for each correct trial.

#### General Cognitive Ability and Verbal Ability

Children completed three subtests from the Wechsler Preschool and Primary Scale of Intelligence (WPPSI-III-UK) (Rust, 2003). To obtain an index of general cognitive ability the children completed the Object Assembly task at Time 1. In this task participants were required to assemble a set of puzzles showing cartoon images of objects (e.g., clock, bird, hotdog). Children received marks for each correctly aligned juncture in the first 90 s of each trial. The children completed up to 14 trials and the scores from each trial were summed together. At Time 2 children completed the Matrix Reasoning task. In this task the children had to complete a matrix by identifying the missing portion from a choice of 4 or 5 options presented in a color flipbook. Children completed up to 29 trials and scores from each trial were summed together. The Matrix Reasoning task could not be used at Time 1 as it is only suitable for use with children aged over 4 years (Rust, 2003). To measure verbal ability the children completed the Receptive Vocabulary task at Time 1. The children were shown a color flipbook depicting 4 images on each page and asked to point to the picture that matched the word uttered by the experimenter. Children completed up to 38 trials and were awarded 1 point for each correctly identified picture.

#### Parental Behavior

At Time 1 parents and children were recorded for 5 min playing together using wall-mounted unobtrusive digital cameras while the experimenters were in another room. Parents and children were provided with three jigsaw puzzles (a 6, 8, and 12 piece puzzle) from the Galt Velvet Puzzles Jigsaw set. The parents were instructed to work together with their child to complete as many of the three puzzles as possible within 5 min. The data from these videos were then coded off-line using two different coding schemes by different trained researchers naive to the participants' identities and test scores.

Negative Parent-Child Interaction was measured using items from the Parent-Child Interaction System (PARCHISY) coding scheme (Deater-Deckard et al., 1997). Raters scored parental behavior during the task on three 7-point rating scales (ranging from "none" to "exclusive/constant"): Negative control (i.e., use of physical control, use of criticism), negative affect (i.e., frowning, harsh tone of voice) and conflict (i.e., disagreement, arguing or tussling). Following training from an experienced rater, 25 video clips were randomly selected for double coding. Intra-class correlations for each item were acceptable: Negative control, ICC = 0.89, negative affect, ICC = 0.75, and conflict, ICC = 0.74. The remaining clips were double-coded and scores were averaged across raters.

Parental Scaffolding was measured using a coding scheme developed by Wood and Middleton (1975) and refined by Meins (1997). This approach required coding each of the verbal and non-verbal task-related behaviors of parents and children during the 5-min observation. Parental interventions were assigned into one of five mutually exclusive categories ranging from more open-ended verbal suggestions to more specific physical demonstrations: Level 1 Orienting Verbal Suggestions (e.g., "Let's start with the corners"); Level 2 Suggestions about Specific Pieces or Locations or Actions (e.g., "Try turn that piece around"); Level 3 Verbal Solutions (e.g., "This piece goes here"); Level 4 Direct Physical Solutions (e.g., Caregiver hands child a piece for a specific location); Level 5 Physical Demonstrations (e.g., Caregiver assembles or dismantles parts of the puzzle). Children's responses were coded as either "success" (i.e., correct placement of the puzzle piece) or "failure" (i.e., incorrect placement of the piece). Following training, 25 clips were randomly selected and double coded. ICCs were acceptable for all codes: Level 1 interventions, ICC = 0.64, Level 2 interventions, ICC = 0.85, Level 3 interventions, ICC = 0.97, Level 4 interventions, ICC = 0.98, Level 5 interventions, ICC = 0.96, frequency of child successes, ICC = 0.99, and frequency of child failures, ICC = 0.94.

The sequences of parent-child codes were parsed into three-turn chains of parent interventions, child actions and parent responses. If multiple interventions preceded a child action only the highest level of intervention was selected (Wood and Middleton, 1975; Carr and Pike, 2012). These three-turn chains were used to analyse the contingency between parents and children during the task. We tallied the number of times that parents shifted "up" (i.e., moving from a less specific to more directive intervention level), shifted "down" (i.e., moving from directive to less specific intervention level) and remained at the same level of intervention ("no shift") after each child success and failure. Variation in parental scaffolding reflected parents' use of the contingency rule (Wood and Middleton, 1975; Meins, 1997; Carr and Pike, 2012), that is, the successful placement of a piece by a child should be followed by an intervention at the same or at a lower level of specificity and failure to place a piece correctly should be followed by an intervention that is one or two levels higher than the previous level of intervention. Contingency or "scaffolding" scores were calculated by summing the total number of times that parents shifted appropriately after success or failure and dividing this by the total number of parental interventions after each success or failure. Scores ranged from 0 (no evidence) to 1 (exclusive use of the contingency rule).
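The contingency score described above can be illustrated with a minimal sketch. It is not the coding software used in the study: the function names, the tuple representation of a three-turn chain, and the example chains are all hypothetical. The text does not say how a failure after a Level 5 intervention was treated (no higher level exists), so this sketch simply applies the stated rule and counts that case as non-contingent.

```python
def collapse(interventions):
    """If several interventions precede one child action, keep the highest level."""
    return max(interventions)


def is_contingent(previous_level, outcome, next_level):
    """Apply the contingency rule to one three-turn chain.

    After a success the next intervention should be at the same or a lower
    level; after a failure it should be one or two levels higher.
    """
    shift = next_level - previous_level
    if outcome == "success":
        return shift <= 0
    if outcome == "failure":
        return shift in (1, 2)
    raise ValueError(f"unknown outcome: {outcome!r}")


def contingency_score(chains):
    """Proportion of parental responses that follow the contingency rule."""
    if not chains:
        return None  # no codable chains, so the score is undefined
    hits = sum(is_contingent(prev, outcome, nxt) for prev, outcome, nxt in chains)
    return hits / len(chains)


# Hypothetical chains: (intervention level, child outcome, next intervention level).
chains = [
    (collapse([1, 2]), "failure", 3),  # up one level after failure: contingent
    (3, "failure", 3),                 # no shift after failure: not contingent
    (3, "success", 2),                 # down one level after success: contingent
    (2, "success", 4),                 # up after success: not contingent
]
score = contingency_score(chains)
```

In this example two of the four parental responses follow the rule, giving a score of 0.5 on the 0 to 1 scale.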

The Home Learning Environment (HLE) was measured at Time 2 using the Home Learning Environment Index (Melhuish et al., 2008). This seven-item self-report questionnaire records the frequency with which parents and children engage in informal learning activities. Parents were asked whether or not they engaged in seven activities with their children (e.g., reading at home, teaching numbers, and counting) and then how often they engaged in each activity on a 7-point scale (ranging from "occasionally or less than once a week" to "7 times a week/constantly"). Parents indicating that they did not engage in the learning activity with their child received a score of 0 for that item. The internal consistency of the measure was acceptable (α = 0.73) and so the scores from each item were summed together. While there was insufficient time to administer this test at Time 1, longitudinal findings demonstrate that individual differences on measures of the HLE show remarkable stability across early childhood (e.g., Lehrl et al., 2016).

## RESULTS

## Analytic Strategy

We conducted our primary data analyses using MPlus Version 7 (Muthén and Muthén, 2012) using a robust maximum likelihood estimator which is suitable for non-normally distributed data and small sample sizes (Brown, 2015). For each of the 103 participants returning there were no missing data points for EF, general cognitive ability or academic ability at Time 2. To avoid loss of data we used a full information approach to analyzing the data so that all cases (N = 117) with data at Time 1 could be included in the analyses. Missing values were estimated in MPlus using the robust maximum likelihood estimator (Muthén and Muthén, 2012). MPlus does not impute data but instead estimates missing model parameters and standard errors using all of the available data (Enders, 2001; Asparouhov and Muthén, 2010). The full information approach can be used in regression models and is preferable to traditional approaches to handling missing data (e.g., list-wise deletion, mean substitution) because it produces less biased estimates and does not require that data are missing completely at random (i.e., missingness is unrelated to any other variable in the dataset or performance on the variable itself) (Enders, 2001; Acock, 2005).

Since the WIAT-II-UK was not age-appropriate for all the children at Time 1 we controlled for individual differences in early cognitive ability by regressing academic ability scores onto earlier measures of verbal ability, general cognitive ability (as measured by the Object Assembly task and Matrix Reasoning task) and EF as well as concurrent age. Structural equation modeling in MPlus allowed us to examine simultaneously the direct and indirect effects (via EF and general cognitive ability at Time 2) of each of the parental variables on academic ability (Cole and Maxwell, 2003; Preacher, 2015). We have provided a more detailed analysis of the longitudinal relations between parental behaviors and children's EF elsewhere (Hughes and Devine, under review). We evaluated the fit of our models using Brown's (2015) four criteria: A non-significant χ² test of model fit, comparative fit index (CFI) ≥ 0.95, Tucker Lewis index (TLI) ≥ 0.95, and root mean square error of approximation (RMSEA) ≤ 0.08. We evaluated the strength of correlations using Cohen's (1988) criteria: Small/weak (0.10), medium/moderate (0.30), and large/strong (0.50).
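The fit criteria and effect-size conventions listed above can be written as simple checks. This is a sketch under stated assumptions rather than part of the authors' analysis: the function names are hypothetical, the 0.05 significance threshold for the χ² test is assumed (the text says only "non-significant"), and treating Cohen's values as lower bounds of bands, with a "negligible" label below 0.10, is one common reading rather than something the text specifies.

```python
def fit_checks(chi2_p, cfi, tli, rmsea, alpha=0.05):
    """Check a model against the four fit criteria listed in the text.

    alpha is an assumed significance threshold for the chi-square test.
    """
    return {
        "chi2_nonsignificant": chi2_p > alpha,
        "cfi": cfi >= 0.95,
        "tli": tli >= 0.95,
        "rmsea": rmsea <= 0.08,
    }


def cohen_label(r):
    """Label the strength of a correlation using Cohen's (1988) benchmarks."""
    size = abs(r)
    if size >= 0.50:
        return "large"
    if size >= 0.30:
        return "medium"
    if size >= 0.10:
        return "small"
    return "negligible"  # label assumed here; not part of the text


# Fit indices reported in this article for the two-factor EF measurement model.
ef_model = fit_checks(chi2_p=0.17, cfi=0.97, tli=0.96, rmsea=0.06)
acceptable = all(ef_model.values())
```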

## Descriptive Statistics and Data Reduction

**Table 1** presents descriptive statistics for the key study variables. Our first step was to create composite scores in order to increase reliability (Rushton et al., 1983) and simplify our analyses. We conducted a series of CFAs to inform the creation of composite scores for different variables in our study. The PARCHISY items were significantly inter-correlated, 0.36 < r < 0.54, all ps < 0.01. We tested a one-factor model in which each of the PARCHISY items loaded onto a single "negative parent-child interaction" latent factor. This model was "just-identified" (i.e., there were an equal number of model parameters and variances/co-variances in the sample matrix) and while model fit indices could not be calculated, parameter estimates could still be calculated and interpreted (Brown, 2015). We set the metric of the latent factor by fixing the loading of the first indicator to 1. The latent factor exhibited significant variance, unstandardized estimate = 0.45, p = 0.007. All item loadings were significant; Conflict Standardized Estimate = 0.80, p < 0.001, Negative Affect Standardized Estimate = 0.68, p < 0.001, Negative Control Standardized Estimate = 0.52, p < 0.001. Factor determinacy co-efficient values range from 0 to 1 and higher values (≥0.80) indicate higher internal consistency (Brown, 2015). The negative parent-child interaction latent factor had a factor determinacy co-efficient of 0.87.

Consistent with previous studies (e.g., Hughes and Ensor, 2005) the correlations between measures of EF were moderate at Time 1, 0.29 < r < 0.48, Mean r = 0.40, and weak to moderate at Time 2, 0.08 < r < 0.73, Mean r = 0.33. The SOPT error score at Time 2 was not correlated with any other measure of EF at Time 2 and so was not included in any further analyses. Drawing on the "unity" model of individual differences in EF (described earlier), we tested a model in which each of the three EF indicators at Time 1 loaded onto one latent factor and each of the four EF indicators at Time 2 loaded onto another latent factor. The error terms for the two DCCS indicators at Time 2 were permitted to correlate. This model provided an acceptable fit to the data, χ²(12) = 16.47, p = 0.17, CFI = 0.97, TLI = 0.96, RMSEA = 0.06. All indicators loaded significantly onto the Time 1 EF latent factor; Happy/Sad Task Standardized Estimate = 0.78, p < 0.001, DCCS Standardized Estimate = 0.61, p < 0.001, SOPT Standardized Estimate = 0.50, p < 0.001. All but one of the Time 2 indicators loaded significantly onto the Time 2 EF latent factor; Happy/Sad Task Standardized Estimate = 0.67, p < 0.001, DCCS Border Game Standardized Estimate = 0.38, p < 0.01, DCCS Standardized Estimate = 0.24, p = 0.07, Day/Night Task Standardized Estimate = 0.60, p < 0.001. Both latent factors exhibited significant variance at Time 1, Unstandardized Estimate = 23.73, p < 0.001, and at Time 2, Unstandardized Estimate = 2.60, p < 0.01. The factor determinacy co-efficient was 0.87 for the Time 1 latent factor and 0.84 for the Time 2 latent factor.
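The degrees of freedom of the two measurement models can be checked by counting. The sketch below is illustrative only: the function name is hypothetical and the parameter counts are reconstructed from the model descriptions in the text. For the EF model it assumes that the two latent factors were allowed to covary, which the text does not state explicitly. Mean-structure parameters are ignored because, with an intercept estimated for every indicator, they add as many knowns as unknowns.

```python
def cfa_degrees_of_freedom(n_indicators, n_free_parameters):
    """Unique variances/covariances in the sample matrix minus free parameters."""
    knowns = n_indicators * (n_indicators + 1) // 2
    return knowns - n_free_parameters


# One-factor model with three PARCHISY items (first loading fixed to 1):
# 2 free loadings + 1 factor variance + 3 residual variances.
negative_interaction_df = cfa_degrees_of_freedom(3, 2 + 1 + 3)

# Two-factor EF model with 3 + 4 indicators (first loading per factor fixed to 1):
# 5 free loadings + 2 factor variances + 1 factor covariance (assumed)
# + 7 residual variances + 1 correlated error between the two DCCS indicators.
ef_df = cfa_degrees_of_freedom(7, 5 + 2 + 1 + 7 + 1)
```

Under these assumptions the one-factor model has 0 degrees of freedom, which is why it is just-identified, and the EF model has 12, which agrees with the reported χ²(12).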

## Relations between Parental Behavior, EF, and Academic Ability

#### TABLE 1 | Descriptive statistics.

**Table 2** shows the sample correlations between each measure of parental behavior. These show that negative parent-child interaction was weakly positively correlated with the HLE. Parental scaffolding and the HLE were unrelated. Each parental measure showed weak correlations with academic ability in the expected directions. We calculated partial correlations controlling for individual differences in age and general cognitive ability (as measured by the Matrix Reasoning task) at Time 2. Academic ability was weakly correlated with each aspect of parental behavior: negative parent-child interaction, pr(100) = −0.19, p = 0.05; parental scaffolding, pr(100) = 0.17, p = 0.09; the HLE, pr(100) = 0.27, p = 0.005. **Table 2** also shows the correlations between each measure of parental behavior and individual differences in EF at Time 2. Once again we examined these relations further using partial correlations controlling for individual differences in age at Time 2. EF remained significantly correlated with both negative parent-child interaction, pr(100) = −0.29, p = 0.003, and parental scaffolding, pr(100) = 0.29, p = 0.003, but showed a weak and non-significant correlation with the HLE, pr(100) = 0.13, p = 0.19.
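The logic of a partial correlation can be sketched in a few lines. The example below is illustrative only and uses toy data rather than the study data; the function names (`pearson`, `partial_corr`, `t_from_partial_r`) are hypothetical, and the recursive first-order formula is one standard way of computing the statistic, not necessarily the routine used in the original analysis.

```python
import math


def pearson(x, y):
    """Zero-order Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, controls):
    """Correlation between x and y after partialling out each control variable.

    Applies the first-order formula recursively, removing one control at a time.
    """
    if not controls:
        return pearson(x, y)
    *rest, z = controls
    r_xy = partial_corr(x, y, rest)
    r_xz = partial_corr(x, z, rest)
    r_yz = partial_corr(y, z, rest)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def t_from_partial_r(pr, df):
    """t statistic for a partial correlation with df degrees of freedom."""
    return pr * math.sqrt(df / (1 - pr ** 2))


# Toy data: x and y are related only through the shared covariate z.
z = [-1, -1, 1, 1]
x = [-2, 0, 0, 2]
y = [-2, 0, 2, 0]
print(pearson(x, y))            # zero-order correlation of 0.5
print(partial_corr(x, y, [z]))  # vanishes (up to floating-point error) once z is controlled
```

Under these assumptions, a partial correlation of 0.27 with 100 degrees of freedom corresponds to a t statistic of roughly 2.80, which is in line with the significance level reported for the HLE.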

## Direct and Indirect Effects of Parental Behavior on Academic Ability

We specified two longitudinal models to examine the direct and indirect effects (via EF and performance on the Matrix Reasoning task) of parental behavior on children's early academic ability. In the first model, academic ability was regressed onto measures of EF (at Time 1 and Time 2) and each measure of parental behavior. Note that, by regressing academic ability onto EF at Time 1 and Time 2 and regressing EF at Time 2 onto EF at Time 1, we were able to disentangle whether early EF made a unique contribution to later academic ability controlling for concurrent EF (at Time 2). In addition we controlled statistically for the influence of verbal ability and general cognitive ability (as measured by performance on the Object Assembly and Matrix Reasoning tasks), parental education (as measured by a dummy variable with 0 indicating no degree and 1 indicating achievement of an undergraduate degree), gender (using a dummy code of 0 for girls and 1 for boys), whether the child had started formal schooling at Time 2 (using a dummy code with 0 indicating no and 1 indicating yes), the interval between Time 1 and Time 2 (in months) and child age by regressing both academic ability and EF at Time 2 on these variables. All predictor variables in our model were free to co-vary. This first model provided an acceptable fit to the data: χ²(3) = 0.89, p = 0.83, CFI = 1.00, TLI = 1.09, RMSEA = 0. Standardized path estimates for this model are shown in **Figure 2**. Unstandardized estimates and 95% confidence intervals for all model parameters are presented in **Table 3**. The overall model accounted for 76% of the variance in children's academic ability. EF at Time 1 and Time 2 were moderately and significantly related to academic ability, uniquely accounting for 2 and 4% of the variance respectively.

Parental scaffolding and negative parent-child interaction uniquely accounted for 5 and 4% of the variance in Time 2 EF but only 0.1 and 0.2% of the variance in academic ability. Statistical tests of indirect effects revealed that EF at Time 2 mediated the relations between negative parent-child interactions and academic ability, B = −0.07, SE = 0.03, Z = −2.25, p = 0.024, and between parental scaffolding and academic ability, B = 2.68, SE = 1.13, Z = 2.38, p = 0.017. These findings were confirmed by the non-significant direct path between negative parent-child interaction and academic ability, B = −0.06, SE = 0.05, Z = −1.18, p = 0.24, β = −0.05, and between parental scaffolding and academic ability, B = 0.17, SE = 2.34, Z = 0.07, p = 0.94, β = 0.01. EF did not mediate the link between the HLE and academic ability, B = 0.03, SE = 0.02, Z = 1.36, p = 0.17. Instead there was a significant direct relation between the HLE and academic ability, B = 0.16, SE = 0.06, Z = 2.76, p = 0.005, β = 0.16. The HLE uniquely accounted for 1% of the variance in academic ability.
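The arithmetic behind a test of an indirect effect can be illustrated with the product-of-coefficients (Sobel, or delta-method) approach. This is a sketch under stated assumptions: the text does not specify which test of indirect effects was used, the path values passed to `sobel_test` below are hypothetical and not taken from the study, and the function names are illustrative. The final line simply converts a reported Z value into a two-sided p value.

```python
import math


def sobel_test(a, se_a, b, se_b):
    """Product-of-coefficients test of an indirect effect.

    a, se_a: path from predictor to mediator and its standard error
    b, se_b: path from mediator to outcome and its standard error
    Returns the indirect effect, its first-order standard error, and Z.
    """
    indirect = a * b
    se = math.sqrt(a ** 2 * se_b ** 2 + b ** 2 * se_a ** 2)
    return indirect, se, indirect / se


def two_sided_p(z):
    """Two-sided p value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical paths (NOT study estimates): a = 0.5 (SE 0.1), b = 0.4 (SE 0.1).
indirect, se, z = sobel_test(0.5, 0.1, 0.4, 0.1)
print(indirect, se, z)

# Converting a Z value reported in the text into a two-sided p value.
print(round(two_sided_p(-2.25), 3))  # 0.024
```

Note that dividing the rounded B by the rounded SE will only approximate the reported Z values, because the published estimates are given to two decimal places.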

To examine the specificity of EF as a mediator of the effects of negative parent-child interaction and parental scaffolding on academic ability, we tested a second longitudinal model in which general cognitive ability (as measured by the Matrix Reasoning task) was entered as a mediator between negative parent-child interaction, parental scaffolding and academic ability instead of EF. As before, we controlled statistically for the influence of general cognitive ability at Time 1, EF at Time 1 and Time 2, parental education, formal schooling, gender, and age by regressing the dependent variable and mediator on each of these covariates. This second model provided an acceptable fit to the data on three out of four indices: χ²(7) = 11.64, p = 0.11, CFI = 0.98, TLI = 0.90, RMSEA = 0.07. Examination of the tests of indirect effects revealed that general cognitive ability at Time 2 (as measured by the Matrix Reasoning task) did not mediate the relation between negative parent-child interaction and academic achievement, B = −0.01, SE = 0.02, Z = −0.63, p = 0.53, or the link between parental scaffolding and academic achievement, B = −0.83, SE = 0.73, Z = −1.13, p = 0.26.

To summarize, our models revealed three key sets of findings. Firstly, the three different aspects of parental behavior were not significantly correlated with each other. Secondly, individual differences in EF (measured at both Time 1 and Time 2) showed unique relations with children's academic ability. Thirdly, EF mediated the links between negative parent-child interaction and academic ability on the one hand and between parental scaffolding and academic ability on the other hand. Variation in the HLE, however, was directly related to early academic ability.

\*\*p < 0.01. \*p < 0.05. <sup>+</sup>p < 0.10. Vocab, Vocabulary; Negative Interaction, Negative Parent-Child Interaction; Contingency Rule, Parental use of Contingency Rule; HLE, Home Learning Environment; T1, Time 1; T2, Time 2.

## DISCUSSION

This longitudinal study of 117 parent-child dyads makes at least three contributions to the existing literature. First, supporting a differentiated model of parenting (e.g., Carr and Pike, 2012), different aspects of parental behavior were unrelated to each other and showed unique contributions to children's early academic ability. Second, our analyses showed that children's EF mediated the relations between parental scaffolding and negative parent-child interaction and children's early academic ability. Third, our results revealed that EF and not general cognitive ability (as measured by the Matrix Reasoning task) mediated the relations between these two aspects of parental behavior and children's academic ability.

With some notable exceptions (e.g., Hughes and Ensor, 2009; Bernier et al., 2010), existing studies of parental influences on children's academic ability and on children's EF have typically either focused on a single aspect of parenting or adopted a global approach by aggregating several domains of parental behavior into a single measure. While these studies have been valuable in highlighting the influence of parental behaviors on children's cognitive and academic abilities, progress in understanding the mechanisms underpinning these associations has been limited by the scarcity of studies seeking to disentangle the relations between different aspects of parental behavior and child outcomes.

In response to this challenge, we followed calls for fine-grained analyses (e.g., Davidov and Grusec, 2006; Carr and Pike, 2012) by distinguishing three aspects of parental behavior (i.e., parental scaffolding, negative parent-child interaction and provision of opportunities for learning) that have been studied in relation to children's academic ability and EF. Our results showed that these three dimensions of parental behavior were unrelated, but each dimension exhibited weak associations with individual differences in children's academic ability (even when age and general cognitive ability were taken into account). It is possible that our measure of the HLE was unrelated to our measures of parental scaffolding and negative parent-child interaction because these constructs were measured in very different ways (i.e., observation vs. questionnaire). That said, our two observational measures were also unrelated to each other. It would therefore be valuable in future studies to include multiple indicators of each aspect of parental behavior to understand the structure of this differentiated model of parenting more fully.

The main goal of our study was to elucidate the mechanisms by which parental behaviors are related to children's early academic abilities. In doing so, we outlined three theoretical models linking parental behavior, children's EF and academic ability. The first of these models, the Domain General Model, suggests that a range of parental behaviors will exhibit direct associations with a range of cognitive outcomes. That is, parents who provide high levels of cognitive support, frequent opportunities for engagement in informal learning activities and low levels of negative parent-child interaction, will have children who perform better across the board. The second of these models, the Domain Specific Model, proposes that specific parental behaviors will be directly linked with specific cognitive outcomes. For example, frequent engagement in informal literacy and numeracy activities will be associated with better academic performance and children exposed to parent-child interactions characterized by contingency and low levels of negativity will exhibit superior EF. The third model, the Mediation Model, suggests that parental behaviors indirectly influence children's academic ability via more specific cognitive mechanisms (e.g., EF or general cognitive ability). Our findings show that these different models are not mutually exclusive. The relations between two aspects of parental behavior (i.e., parental scaffolding and negative parent-child interaction) and children's academic ability were mediated by children's EF. In contrast, informal opportunities for learning (as measured by the HLE questionnaire) exhibited direct effects on children's academic ability. Importantly, for the first time, our findings showed that EF and not general cognitive ability played a specific role in the relation between parental scaffolding, negative parent-child interaction and children's academic ability.

#### TABLE 3 | Unstandardized and standardized robust maximum likelihood parameter estimates for longitudinal mediation model 1.

\*\*p < 0.01. \*p < 0.05. <sup>+</sup>p < 0.10. On, Regressed onto; With, Correlated with.

Before discussing these findings, a number of limitations in our study deserve note. First, our longitudinal study involved just two time points. Numerous theorists have argued that two-wave or "half longitudinal" designs (in which the mediator is measured at the same time point as either the predictor or outcome variable) are a cost-effective way to examine mediation and are preferable to more widely-used cross-sectional designs (Cole and Maxwell, 2003; Little et al., 2007; Newsom, 2015; Preacher, 2015). Although the existing findings on the relations between parental behavior, EF and academic ability reported earlier involved multiple time points, the presumed mediator was either measured alongside the predictor (e.g., NICHD Early Child Care Research Network, 2003) or the outcome (Friedman et al., 2014). Future studies involving three (or more) time points in which the parental behaviors, EF and academic outcomes were measured at different time points would permit the underlying assumptions of stationarity and equilibrium to be tested formally (Cole and Maxwell, 2003). Second, our longitudinal study involved assessment of parental behavior at just one time point (i.e., parent-child interactions were studied at Time 1 only and parental reports of the HLE were gathered at Time 2 only) and so cross-lagged analyses to determine the direction of the association between parental behavior, EF and academic outcomes were not possible (Menard, 2002). Third, academic ability was measured at just one time point. Ideally, auto-regressive models require that the dependent variable should be measured on at least two occasions so that stability in the dependent variable can be accounted for (Hertzog and Nesselroade, 2003). However, we took steps to reduce potential confounds by including a range of covariates in our models and controlled for individual differences in earlier verbal ability, general cognitive ability and EF (as well as parental education, child age, and formal schooling) in each of our models.

Notwithstanding these limitations, our results complement those based on the NICHD study demonstrating that individual differences in EF mediate the relation between parental behavior and children's later academic achievement (NICHD Early Child Care Research Network, 2003; Friedman et al., 2014) and extend that work by disentangling the relative influence of different dimensions of parental behavior and demonstrating the specificity of EF as a mediator. Moreover, our findings are also consistent with a growing body of research showing that children's EF mediates the relations between harsh or insensitive parental behavior, maternal depressive symptoms and children's externalizing problems (Sulik et al., 2015; Roman et al., 2016). While not focused on academic outcomes these studies provide a template for future longitudinal research on parental behavior, EF and children's academic ability by: (1) spanning more than two time points so that formal tests of mediation can be carried out; (2) incorporating measures of each construct at every time point to unpack the temporal dynamics of these associations (Sulik et al., 2015); and (3) testing alternative mediators to determine the specificity of EF as a mediator (Roman et al., 2016).

Causal claims about the purported developmental relations between parental behavior, EF and children's early academic ability will be bolstered by intervention and genetically sensitive studies. There is now considerable evidence that parental behaviors can be modified through a range of interventions (e.g., Kaminski et al., 2008; Belsky and de Haan, 2011). Moreover, studies of the impact of school-based interventions to improve children's academic outcomes suggest that the effects of these programs are mediated by EF (Raver et al., 2011). Whether or not parent-focused interventions exert effects on child outcomes via EF remains to be seen but such evidence would provide support for any causal claims about the relations between parental behavior, children's EF and early academic ability. When parents and children are biologically related, longitudinal studies of parental effects on children's cognition are potentially confounded by genetic effects (Dale et al., 2015). Indeed a number of twin studies suggest that individual differences in EF show substantial heritability in middle childhood, adolescence, and adulthood (e.g., Polderman et al., 2007; Friedman et al., 2008). Moreover, a large-scale study using Genome-Wide Complex Trait Analysis (GCTA) has shown that genetic factors accounted for the relations between family socio-economic status (SES) and children's IQ at ages 7 and 12 (Trzaskowski et al., 2014) and between SES and children's educational achievement (Krapohl and Plomin, 2016). Genetically sensitive research designs (e.g., adoption studies) will help to disentangle genetic and environmental effects on children's EF and early academic ability. In addition to this work, investigations of potential moderating variables will also elucidate the mechanisms by which parental behaviors shape early academic abilities. 
For example, researchers have identified specific DNA polymorphisms related to the signaling of dopamine that moderate children's susceptibility to parental influences on a variety of cognitive and behavioral outcomes (Bakermans-Kranenburg and Van Ijzendoorn, 2011). It is conceivable that genetic factors might act to attenuate or strengthen the developmental relations between parental behaviors, children's EF and academic abilities.

#### SUMMARY AND CONCLUSION

We have shown that individual differences in children's EF (but not general cognitive ability) mediate the relations between each of two aspects of parental behavior (that is, "parental scaffolding" or the proclivity to modify instructions and support in response to children's behavior on the one hand, and "negative parent-child interaction" or the extent to which parents are critical, controlling and display negative affect on the other) and children's early academic ability. That is, parental scaffolding and negative parent-child interaction appear to influence children's academic abilities by helping or hindering children's emerging EF. In contrast, parental provision of opportunities for learning in the home environment is directly related to children's academic abilities. Future studies on the relations between parental behaviors, children's EF and early academic abilities will benefit from adopting multi-wave longitudinal and training designs as well as a fine-grained approach to studying the relative salience of different aspects of parental behavior.

### ETHICS STATEMENT

University of Cambridge Psychology Research Ethics Committee. Participants' parents/caregivers were issued with leaflets explaining the nature of the study in detail. We provided parents/caregivers with our contact information so they did not have to make a choice to participate immediately. Parents had the opportunity to ask questions before consenting to participate. Upon agreeing to participate, parents provided written consent before taking part in a testing session. Children were closely monitored for signs of distress and discomfort during the testing procedures. Breaks were provided and children were issued with small rewards (e.g., stickers) and praise regardless of their task performance.

## AUTHOR CONTRIBUTIONS

RD and CH shared responsibility for the conception, design, interpretation, and write up of this study. In addition RD undertook all data collection and data analysis. GB shared responsibility with RD for coding parent-child interactions.

## FUNDING

This study was funded by the UK Economic and Social Research Council (ES/J021180/1) and the Isaac Newton Trust, Cambridge.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Devine, Bignardi and Hughes. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

## Keeping the Spirits Up: The Effect of Teachers' and Parents' Emotional Support on Children's Working Memory Performance

Loren Vandenbroucke<sup>1</sup> \*, Jantine Spilt<sup>2</sup> , Karine Verschueren<sup>2</sup> and Dieter Baeyens<sup>1</sup>

<sup>1</sup> Parenting and Special Education Unit, Faculty of Psychology and Educational Sciences, KU Leuven, Leuven, Belgium, <sup>2</sup> School Psychology and Child and Adolescent Development, Faculty of Psychology and Educational Sciences, KU Leuven, Leuven, Belgium

Working memory, used to temporarily store and mentally manipulate information, is important for children's learning. It is therefore valuable to understand which (contextual) factors promote or hinder working memory performance. Recent research shows positive associations between positive parent–child and teacher–student interactions and working memory performance and development. However, no study has yet experimentally investigated how parents and teachers affect working memory performance. Based on attachment theory, the current study investigated the role of parent and teacher emotional support in promoting working memory performance by buffering the negative effect of social stress. Questionnaires and an experimental session were completed by 170 children from grade 1 to 2 (Mage = 7 years 6 months, SD = 7 months). Questionnaires were used to assess children's perceptions of the teacher–student and parent–child relationship. During an experimental session, working memory was measured with the Corsi task backward (Milner, 1971) in a pre- and post-test design. In-between the tests stress was induced in the children using the Cyberball paradigm (Williams et al., 2000). Emotional support was manipulated (between-subjects) through an audio message (either a weather report, a supportive message of a stranger, a supportive message of a parent, or a supportive message of a teacher). Results of repeated measures ANOVA showed no clear effect of the stress induction. Nevertheless, an effect of parent and teacher support was found and depended on the quality of the parent–child relationship. When children had a positive relationship with their parent, support of parents and teachers had little effect on working memory performance. When children had a negative relationship with their parent, a supportive message of that parent decreased working memory performance, while a supportive message from the teacher increased performance. 
In sum, the current study suggests that parents and teachers can support working memory performance by being supportive for the child. Teacher support is most effective when the child has a negative relationship with the parent. These insights can give direction to specific measures aimed at preventing and resolving working memory problems and related issues.

Keywords: working memory, executive functioning, parent–child interaction, teacher–child interaction, emotional support

#### Edited by:

Barbara McCombs, University of Denver, USA

#### Reviewed by:

Claudio Longobardi, University of Turin, Italy Emily Grossnickle Peterson, Georgetown University, USA

\*Correspondence: Loren Vandenbroucke loren.vandenbroucke@kuleuven.be

#### Specialty section:

This article was submitted to Educational Psychology, a section of the journal Frontiers in Psychology

Received: 28 October 2016 Accepted: 20 March 2017 Published: 04 April 2017

#### Citation:

Vandenbroucke L, Spilt J, Verschueren K and Baeyens D (2017) Keeping the Spirits Up: The Effect of Teachers' and Parents' Emotional Support on Children's Working Memory Performance. Front. Psychol. 8:512. doi: 10.3389/fpsyg.2017.00512

## INTRODUCTION


The ability to regulate and control one's behavior, thoughts and emotions, also referred to as executive functioning (EF), is essential in making goal-directed behavior possible (Best and Miller, 2010; Zelazo and Carlson, 2012; Diamond, 2013). Three cognitive processes are considered to form the base of EF, namely working memory, inhibition and cognitive flexibility (Miyake et al., 2000; Huizinga et al., 2006; Best and Miller, 2010; Blair et al., 2011; Zelazo and Carlson, 2012; Diamond, 2013). Previous research has shown the importance of EF in a variety of life domains, including education (Diamond, 2013). For example, children with well-developed EF have more positive work habits, higher engagement in learning, lower levels of inattention, positive relationships with classmates and higher academic achievement (Brock et al., 2009; Welsh et al., 2010; Best et al., 2011; Vuontela et al., 2013). Because of the importance of EF, understanding which factors influence EF performance can provide useful insights for the prevention and intervention of EF difficulties and related problems. Recent research indicates that positive interactions with both parents (Blair et al., 2011; Hughes, 2011) and teachers (Berry, 2012; Hamre et al., 2014; de Wilde et al., 2015) can promote EF quality. However, little is known about why this is the case. This study examines the role of parents and teachers as external stress regulators by means of offering emotional support to children in a stressful situation, as one particular mechanism through which positive parent–child and teacher–student interactions can promote children's EF performance. The study focusses on a particular aspect of EF, namely working memory. This component of EF starts to develop very early and forms an important base for other EFs, such as cognitive flexibility or planning (Diamond, 2013). Additionally, of the three core EFs, working memory has been most consistently linked to children's general development and learning (Bull and Lee, 2014; Vandenbroucke et al., 2017).

## Working Memory and Its Development

Working memory is a limited capacity, multicomponent memory system that is capable of holding and processing information over a short period of time (Baddeley, 1986). For example, working memory is used when trying to follow multi-step instructions, which requires remembering and updating information while completing the task. Working memory is essential in a large number of activities and has often been linked to learning and learning-related behavior (e.g., Gathercole et al., 2007; De Smedt et al., 2009; Alloway and Alloway, 2010; Zheng et al., 2011; Fitzpatrick and Pagani, 2012; Desoete and De Weerdt, 2013).

Working memory starts to develop in the first year of life and continues to develop at least until adolescence (Gathercole et al., 2004; Reznick et al., 2004; Conklin et al., 2007; Diamond, 2013). The development is characterized by alternating periods of rapid and more continuous growth, with a first important developmental spurt occurring between the ages of 2–8 (Hongwanishkul et al., 2005; Ganea and Harris, 2013; Kibbe and Leslie, 2013; Moher and Feigenson, 2013). This developmental pattern clearly shows parallels with the development of the prefrontal regions of the brain (Anderson, 2002). However, despite the clear importance of biological maturation processes in working memory development, the frontal brain regions and their related cognitive processes are characterized by plasticity and are sensitive to environmental stimulation, especially during periods of rapid growth (Anderson, 2002; Huttenlocher, 2002). The current study focusses on children at the beginning of primary school (ages 6–8), an age group that falls within the first period of strong development.

## Adult–Child Interactions at Home and at School as Developmental Contexts

The role of environmental factors for working memory performance and development has been far less researched compared to biological aspects (Hughes, 2011). Most studies available to date focus on the home environment and parent–child interactions. These studies show that positive factors in the home environment can promote working memory development, while negative factors can hinder the development of this core EF (see Hughes, 2011 for a short overview). The quality of the interaction between parents and their children is one such important promoting factor within the home environment. For example, the affective quality of parent–child interactions has an influence on working memory as indicated by studies showing that higher levels of parental support (Schroeder and Kelley, 2009), maternal sensitivity and autonomy support (Bernier et al., 2010) and maternal positive engagement (Rhoades et al., 2011) predict higher working memory performance. On the other hand, more negative intrusiveness by the mother predicts lower working memory performance (Rhoades et al., 2011). In sum, parents who interact with their children in a positive and supportive way can promote their children's working memory development, while negative interactions can hinder this development.

More recently, researchers started focusing on the role of the school and classroom environment as an important developmental context for EF and working memory. Particularly, the affective quality of teacher–student interactions is an important influencing factor for working memory in children. The quality of the teacher–student relationship has mainly been viewed from an attachment perspective, which focusses on the importance of closeness, conflict and dependency in the relationship for children's development (Verschueren and Koomen, 2012; Settanni et al., 2015). A study by Hamre et al. (2014) showed, for example, that in classes with more sensitive teachers, children performed better on a working memory task. Another study suggests that the affective quality of the dyadic teacher–student relationship, rather than classroom level interactions, is important for later performance on an EF task including a working memory component (Cadima et al., 2016). Teacher–student closeness appears to be positively related to children's working memory (Cadima et al., 2016), while conflict has a negative association with working memory performance (de Wilde et al., 2015). Overall, the higher the levels of positive affect between a child and its teacher, the better the child's working memory performance, and the higher the levels of negative affect between a child and its teacher, the worse the child's working memory performance.

Despite the increasing evidence for the importance of parent–child and teacher–student interactions for working memory performance, our understanding is still limited. First, previous studies examining how parent–child and teacher–child interactions relate to working memory are correlational in nature. As a consequence, it is unclear whether this relationship is causal or whether additional variables confound it. The current study attempts to address this gap by experimentally manipulating emotional support and examining the effect of this manipulation on children's working memory. Second, little is known about the mechanisms underlying this relationship. The current study therefore explores the role of one plausible mechanism, offered by attachment theory, namely the buffering effect of parents' and teachers' emotional support when the child experiences distress.

## The Buffering Role of Adult–Child Attachment Relationships in Stressful Situations

Attachment refers to the deep and enduring affectionate bond between a child and a significant adult (Bowlby, 1969). In the early years of life children form an attachment bond with their primary caregivers (Bowlby, 1969). Evidence now suggests that other significant adults, such as teachers, can also function as an attachment figure (Commodari, 2013). Verschueren and Koomen (2012) argue that the bond between a child and its teacher cannot be considered fully equal to the bond between a child and its primary caregiver as it is (in most cases) not enduring and exclusive and the teacher's role is primarily instructional rather than focused on emotional investment. Yet, there are similarities between the parent–child and teacher–child bond, including the importance of sensitivity in predicting the quality of this bond (Ahnert et al., 2006; Verschueren and Koomen, 2012), the display of attachment-related behaviors of the child toward the adult, and the occurrence of similar classifications of attachment-related behaviors (Ahnert et al., 2012). Teachers can thus be seen as ad hoc attachment figures (Verschueren and Koomen, 2012).

When children form a positive bond with significant adults, characterized by high levels of warmth and low levels of conflict, they will display two types of attachment behaviors. Both may enhance working memory performance and development. First, as children feel confident and have trust in their caregivers, they will explore their environment independently and engage more in stimulating and challenging activities at home or in the classroom (O'Connor and McCartney, 2007; Roorda et al., 2011; Commodari, 2013). The caregiver functions as a secure base. This is likely to provide children with more frequent and more challenging opportunities to practice their working memory skills. Second, during moments of distress the child will return to the caregiver and look for comfort, which will reduce the child's levels of stress (Verschueren and Koomen, 2012; Commodari, 2013). The caregiver functions as a safe haven. The quality of both parent–child and teacher–student relationships has previously been linked to stress and stress regulation (Blair et al., 2011; Ahnert et al., 2012), while other studies have shown a negative impact of stress on working memory performance and development (e.g., Evans and Schamberg, 2009; Blair et al., 2011; Hanson et al., 2012). Parents and teachers can thus function as external stress regulators and as such provide children with a more appropriate environment for working memory development.

Although these attachment mechanisms are plausible and some studies provide partial support for them, no study has, to our knowledge, directly tested such mechanisms for EF. The current study therefore attempts to broaden our understanding of these underlying processes by directly examining one potential mechanism, namely parents and teachers as external stress regulators (safe haven mechanism).

## Current Study

The aim of the current study is to enhance our understanding of the association between parent–child and teacher–student relationships, on the one hand, and working memory performance, on the other. In an experimental design, the effect of parents' and teachers' emotional support on children's working memory performance is investigated, while examining the buffering of stress as a potential underlying mechanism. Specifically, after stress is induced through an experimental manipulation, children will hear a neutral message (weather report) or a supportive message from an unfamiliar person, a parent or the teacher. It is expected that stress will result in decreased working memory performance when children hear a neutral message (Hawes et al., 2012). A supportive message from parents and teachers is hypothesized to decrease the induced stress and therefore a stable working memory performance is expected in these conditions (Blair et al., 2011; Ahnert et al., 2012). Such a buffering effect is not expected when children hear a supportive message from a stranger, as the effect is expected to result from the interpersonal bond, rather than the positive nature of the message. Additionally, it can be expected that the positive effects of parent and teacher support will be more pronounced when children have a positive relationship with the parent or teacher, as children then rely more on the parent or teacher for comfort when distressed (a safe haven; Roorda et al., 2011; Verschueren and Koomen, 2012).

## MATERIALS AND METHODS

## Participants

Seven regular schools for primary education, located in three provinces in Belgium, agreed to participate in the current study. In these schools, the teachers of all first and second grade classrooms were asked for their collaboration in the current study. This resulted in 18 participating classrooms (66.7%). Fifteen classrooms (83.3%) had a female teacher. Teachers handed out information letters and informed consent forms to the parents. Written informed consent was obtained from 205 parents (56.6% participation rate). Consent was provided by the primary caregiver. If parents were divorced and had a co-parenting arrangement, both parents gave their consent for participation. Due to time constraints, data could not be fully collected for all children. Therefore, the experiment was conducted in a randomly selected subsample of children. In the end, 170 children participated in the experiment. There was no drop-out during the experiment: children who started the experimental session always finished it.

The sample consisted of 43 first grade children (6 classrooms), 100 second grade children (10 classrooms) and 24 children in mixed grade classrooms (2 classrooms). Children were between 6 years 3 months and 9 years 1 month (M = 7 years 6 months, SD = 7 months) when the experiment was conducted. Background characteristics of the sample were reported by the parents (cf. 2.3.1) and an overview can be found in **Table 1**. The sample is representative of the average population in Flanders with regard to the parents' employment status (5.1% unemployment, 73.3% employment; Eurostat 2015). However, the sample includes more highly educated primary caregivers than the population in the region of Flanders (37.2%; Eurostat 2015) and most families have a higher monthly net income compared to the average in Flanders (2689,58 euros; Statistics X 2014). The current sample mostly consisted of typically developing children (n = 165), though parents of 22 children reported psychosocial problems of their child. Of these, six children were reported to have a disorder: three children with an Attention-Deficit/Hyperactivity Disorder (ADHD) and three children with an Autism Spectrum Disorder (ASD). None of the parents reported physical health problems or medication use that could influence data collection.

TABLE 1 | Distribution of background characteristics of the participants who completed the experiment (n = 170).

## Instruments

#### Demographics

Parents filled out a self-constructed questionnaire to report on a number of background characteristics of the participating child and their family. First, parents provided socioeconomic information by indicating the caregivers' educational level, occupational status and monthly net income. The educational level was recoded into low-educated (i.e., a degree of secondary education at most) and highly educated (i.e., at least a Bachelor's Degree). Occupational status was recoded into full-time working (i.e., working at least 75%), part-time working (i.e., working less than 75%), voluntarily not working (i.e., housewife or houseman, on pension, maternity leave and temporary career breaks for more than 3 months) and involuntarily not working (i.e., in search of employment or unfit for work). Family monthly net income was categorized as below 1000 euros, between 1000 and 2000 euros, between 2000 and 3000 euros, between 3000 and 4000 euros, between 4000 and 5000 euros and above 5000 euros. Second, parents gave information about the physical and psychosocial health and medication use of the participating children. Finally, the nationality and mother tongue of the participating child and the caregivers were reported.

#### Teacher–Child and Parent–Child Relationship

To assess children's perception of the quality of their relationship with the teacher, the Young Children's Appraisals of Teacher Support (Y-CATS; Mantzicopoulos and Neuherth-Pritchett, 2003; Spilt et al., 2010) was used. This scale consists of 27 statements about the relationship between the child and the teacher. The researcher reads each statement and the child places the card with the statement in a safe when it is true and in a trashcan when it is untrue. This approach is first practiced with two example items: one that is clearly true ('my teacher is bigger than me') and one that is clearly untrue ('my teacher has blue hair'). The Y-CATS has three subscales, namely warmth (11 items, e.g., 'My teacher says nice things about my work'), conflict (10 items, e.g., 'My teacher gets angry with me') and autonomy support (6 items, e.g., 'My teacher lets me do things I like'). Scores are calculated for each scale by summing the scores of the respective items. The Dutch version of the Y-CATS has an acceptable to satisfactory internal consistency in previous studies, with Cronbach's alphas of 0.65, 0.72, and 0.61 for warmth, conflict and autonomy support, respectively (Spilt et al., 2010). In the current study, items 23 and 27 (Warmth subscale), 22 (Conflict subscale) and 3 (Autonomy Support subscale) were deleted because of negative or extremely low item-rest correlations. The final Cronbach's alphas of the subscales in the current study were 0.90, 0.79, and 0.52, respectively. The internal consistency of Autonomy Support was unsatisfactory in the current sample and could not be further raised by deleting specific items. This subscale was therefore not used in further analyses. Additionally, a dichotomous score was calculated categorizing each participant as low or high on each subscale. Children were categorized as high with a score higher than four for both warmth and conflict. This means that for at least half of the items, presence was indicated by the child (i.e., the item was put in the safe).
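The scoring steps described above (summing true/false items, checking internal consistency with Cronbach's alpha, and dichotomizing at a cutoff of four) can be illustrated with a minimal sketch. The function names and the toy responses are hypothetical and are not the study's data or scripts; the alpha formula is the standard one based on item and total-score variances.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    k = len(rows[0])
    columns = list(zip(*rows))  # one tuple of scores per item
    item_variance_sum = sum(variance(col) for col in columns)
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


def dichotomize(total_score, cutoff=4):
    """Label a subscale sum score as 'high' when it exceeds the cutoff."""
    return "high" if total_score > cutoff else "low"


# Hypothetical toy responses (1 = card placed in the safe, 0 = trashcan).
toy_rows = [[1, 1, 1], [0, 0, 0], [1, 1, 0], [0, 0, 1]]
toy_alpha = cronbach_alpha(toy_rows)
print(f"alpha = {toy_alpha:.2f}")
print(dichotomize(5), dichotomize(4))
```

With binary items, this formula coincides with the Kuder-Richardson 20 coefficient up to the choice of variance estimator.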

Children's perception of their relationship with their primary caregiver was assessed with the Parent–Child Interaction Questionnaire-Revised child version (PACHIQ-R; Lange et al., 2002). The original scale consists of 25 statements which children have to evaluate on a 5-point scale. However, because of the young age of the children in the current sample, the same administration procedure was used as with the Y-CATS, reducing the response possibilities to a true or false choice. Children completed the questionnaire for the parent identified as the primary caregiver (83% mothers). The items of the PACHIQ-R child version were originally found to be best described in two subscales, an Acceptance scale (8 items, e.g., 'If I'm sad about something, my mother comforts me') and a Conflict resolution scale (17 items, e.g., 'Most of the times, I do what my mother asks'). However, given the changes in procedure and the younger age of the sample, the structure of the questionnaire was reexamined in the current sample. To this end, Exploratory Factor Analysis was conducted, using Parallel Analysis (Horn, 1965) to determine the number of factors to extract. This method compares the observed eigenvalues of the factors with the eigenvalues of a series of simulated data matrices with the same characteristics. This method is more conservative than the 'eigenvalue-greater-than-one' criterion and results less often in an overestimation of the number of factors to be extracted. The default number of 100 simulations and the 95th percentile of the eigenvalues were used. Results indicated that a three-factor structure was more appropriate for the current sample. The first subscale was Warmth in the parent–child relationship (9 items; e.g., 'When I do something for my mother, I can tell that she likes it'). The second subscale was Conflict (9 items, e.g., 'Whatever my mother tells me, I do what I want'). Sensitivity was the final subscale (6 items; e.g., 'When I am sad, my mother comforts me'). A score was calculated for each subscale by summing the items of the respective scale. Item 19, belonging to the Sensitivity subscale, was deleted due to a low correlation with the rest of the scale. Cronbach's alphas in the current study were acceptable to good (0.81, 0.69, and 0.60). Again, a low-high dichotomization was made. Children who scored higher than four on warmth, higher than four on conflict and higher than three on sensitivity were categorized as high on the respective subscale. As only eight children were categorized in the low sensitivity group, parent sensitivity was excluded from further analysis.
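To make the logic of Parallel Analysis concrete, the sketch below compares the eigenvalues of an observed correlation matrix with the 95th percentile of eigenvalues from 100 simulated random data sets of the same size. It is an illustration under assumptions, not the procedure run in the study: it uses principal-component eigenvalues of Pearson correlations, normally distributed random data, and synthetic continuous items with a built-in three-factor structure; the function name `parallel_analysis` is hypothetical.

```python
import numpy as np


def parallel_analysis(data, n_sims=100, percentile=95, seed=0):
    """Return the number of factors whose observed eigenvalue exceeds the
    chosen percentile of eigenvalues obtained from random data."""
    rng = np.random.default_rng(seed)
    n_rows, n_items = data.shape

    def sorted_eigenvalues(matrix):
        # Eigenvalues of the item correlation matrix, largest first.
        corr = np.corrcoef(matrix, rowvar=False)
        return np.sort(np.linalg.eigvalsh(corr))[::-1]

    observed = sorted_eigenvalues(data)
    simulated = np.empty((n_sims, n_items))
    for i in range(n_sims):
        simulated[i] = sorted_eigenvalues(rng.standard_normal((n_rows, n_items)))
    threshold = np.percentile(simulated, percentile, axis=0)

    # Count leading factors until an observed eigenvalue fails the comparison.
    n_factors = 0
    for obs, thr in zip(observed, threshold):
        if obs <= thr:
            break
        n_factors += 1
    return n_factors, observed, threshold


# Synthetic example: 9 items, 3 items loading 0.8 on each of 3 factors.
demo_rng = np.random.default_rng(1)
latent = demo_rng.standard_normal((1000, 3))
loadings = 0.8 * np.kron(np.eye(3), np.ones((1, 3)))  # shape (3, 9)
items = latent @ loadings + 0.6 * demo_rng.standard_normal((1000, 9))

n_factors, observed, threshold = parallel_analysis(items)
print("factors retained:", n_factors)
```

Because the comparison uses the upper tail of the random eigenvalues rather than a fixed value of one, weak factors that would pass the 'eigenvalue-greater-than-one' rule by chance are less likely to be retained.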

#### Working Memory

To assess working memory, a backward version of the Corsi blocks test (Milner, 1971) was used. Children were presented with a wooden board with nine irregularly spaced blocks. The experimenter tapped a series of blocks, at a rate of one block per second, and the child was asked to repeat the sequence in the reverse order. A standardized procedure was used. After verbal instructions given by the researcher and two practice items, children started the test with the reproduction of a sequence of two blocks. After four correct items, difficulty was increased by one block, until a maximum of nine blocks per sequence was reached. When a child was unable to reproduce three sequences of the same difficulty, the test ended and the researcher continued with the rest of the experiment. Two parallel sets of items were used, one with items from the WMTB-C (Gathercole and Pickering, 2000) and one with items from the Automated Working Memory Assessment (AWMA; Alloway, 2007). The difficulty of items (based on the number of crossings in the pathway of the sequence; Busch et al., 2005) was evaluated in advance and both sets of items were comparable in difficulty. Order of the two sets of items was counterbalanced: half of the children received the WMTB-C items as pre-test and half of the children received the AWMA items as pre-test. A span score was recorded as the highest number of blocks that could be reproduced by the child in reverse order. An item score was calculated as the number of sequences correctly reproduced by the child. Both scores were highly correlated (r = 0.92); therefore, in further analysis, the item score was used as a measure of working memory performance. This type of score is often used for tasks measuring working memory performance (Gathercole and Pickering, 2000; Alloway, 2007).
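The administration and scoring rules described above can be sketched as a small function. This is only an illustration: the name `score_corsi`, its parameters, and the reading of the span score as the longest sequence length with at least one correct reproduction are assumptions, not the authors' scoring procedure.

```python
def score_corsi(responses, start_length=2, max_length=9,
                advance_after=4, stop_after=3):
    """Score a backward Corsi session.

    responses: booleans in administration order (True = sequence reproduced
    correctly in reverse order). Returns (item_score, span_score).
    """
    length = start_length
    correct_at_length = 0
    errors_at_length = 0
    item_score = 0   # number of correctly reproduced sequences
    span_score = 0   # longest length with a correct reproduction (assumption)

    for is_correct in responses:
        if is_correct:
            item_score += 1
            correct_at_length += 1
            span_score = max(span_score, length)
        else:
            errors_at_length += 1

        if errors_at_length >= stop_after:
            break  # three failures at the same difficulty end the test
        if correct_at_length >= advance_after:
            if length == max_length:
                break  # maximum of nine blocks reached
            length += 1
            correct_at_length = 0
            errors_at_length = 0

    return item_score, span_score


# Hypothetical session: four correct at length 2, four of five correct at
# length 3, then three failures at length 4.
session = [True] * 4 + [True, False, True, True, True] + [False] * 3
print(score_corsi(session))
```

The item score is finer grained than the span score because it also credits partial success at the final difficulty level, which may be one reason it is often preferred as the outcome measure.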

#### Stress Induction

To induce stress, the Cyberball paradigm was used (Williams et al., 2000). This paradigm simulates online social exclusion and causes mild general distress and increased physiological arousal (Abrams et al., 2011; Kelly et al., 2012). Children are told they will play a ball throwing game online with two other children. In reality the two other players are not real. The game is programmed in such a way that the participant is included during the first 18 throws, when each player receives the ball one third of the throws. However, he or she is excluded by the two fictive players during the last 20 throws. All players are represented by avatars and fictive names are mentioned for the two opponents with whom the participant is playing the game (one boy's name and one girl's name). For ethical reasons, all children play an inclusion version at the end of the experiment, with 18 trials and each player receiving the ball one third of the time. Although Cyberball is known as a mild stressor, previous research showed that this manipulation of social exclusion induces sufficient distress to negatively impact working memory performance in children (Hawes et al., 2012). After the game, children indicated how often they received the ball from the other players (never, sometimes, often, or always) as a manipulation check.

#### Emotional Support

Emotional support offered by the parent or teacher was manipulated by means of an audio recording. An audio message has previously been used in attachment research and has an effect on children's oxytocin levels, which are related to the display of attachment-related behaviors (Seltzer et al., 2010). Children either heard a weather report, a supportive message from an unknown person, a supportive message from their parent or a supportive message from their teacher. The content of the three supportive messages was standardized (Appendix A). All messages lasted approximately 30 s. The message provided by the parent was always a message from the primary caregiver as indicated by the parent(s) (75% mothers). Teachers who provided the message were primary teachers (86% female) who taught all courses to the children, with the exception of physical education and religion. A blocked randomization was used for assigning children to the four conditions, to ensure that conditions were equally divided over schools, classrooms and gender (Suresh, 2011). At the end of the experiment the child indicated how much he or she liked receiving the audio message (not at all, not really, doesn't matter, somewhat or very much).
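One common way to implement a blocked randomization of this kind is to use permuted blocks of four within each stratum. The sketch below assumes strata defined by classroom and gender and uses hypothetical condition labels and child identifiers; the source does not report the exact block size or stratification scheme, so this is an illustration rather than the study's allocation procedure.

```python
import random
from collections import Counter, defaultdict

CONDITIONS = ["weather", "stranger", "parent", "teacher"]


def blocked_assignment(children, seed=0):
    """Assign children to conditions using permuted blocks within strata.

    children: iterable of (child_id, classroom, gender) tuples.
    Returns a dict mapping child_id to a condition label.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for child_id, classroom, gender in children:
        strata[(classroom, gender)].append(child_id)

    assignment = {}
    for key in sorted(strata):
        ids = strata[key]
        rng.shuffle(ids)
        block = []
        for child_id in ids:
            if not block:
                # Start a new block holding each condition exactly once.
                block = CONDITIONS[:]
                rng.shuffle(block)
            assignment[child_id] = block.pop()
    return assignment


# Hypothetical roster: 16 children, 2 classrooms x 2 genders, 4 per stratum.
roster = [(i, f"class{i % 2}", "girl" if (i // 2) % 2 else "boy")
          for i in range(16)]
allocation = blocked_assignment(roster)
condition_counts = Counter(allocation.values())
print(condition_counts)
```

Within every complete block each condition occurs once, so group sizes can differ by at most one per stratum, which keeps the conditions balanced across classrooms and gender.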

#### Procedure

This study was approved by the Social and Societal Ethics Committee of the University of Leuven. In the first part of the study children completed two questionnaires to assess their perception of the relationship with their parent and teacher. The assessment was completed during an individual session of approximately 20 min in a quiet room at school. The researcher read the statements of the questionnaires out loud and the child indicated whether they were true or false. On the same day demographic questionnaires were given to the parents. Parents returned the completed questionnaire 1 week later. In the second part of the study, the experiment was conducted during an individual session with the child. On average the experimental session was completed 26 days after the administration of the child questionnaires. The experimental session lasted approximately 30 min and was conducted in a quiet room at school. During this session, children first completed a working memory task (pre-test). This was followed by a stress induction through a computer game and a manipulation check. After the game, children heard one of four audio messages: a weather report, a supportive message from an unknown person, a supportive message from a parent or a supportive message from the teacher. This audio message was used to manipulate the emotional support offered by the parent or teacher. The stranger condition was added in order to distinguish whether the effect on working memory was due to the positive tone of the message or the positive interpersonal relationship with the person giving the support. A parallel version of the working memory task was then used to assess post-test working memory performance. For ethical reasons, the session finished with a non-stressful version of the computer game and children were debriefed about the true purpose of the game. None of the children refused to play the final game. Children received an age-appropriate reward for their participation in the study.

#### Analyses

Descriptive statistics were calculated for the working memory outcomes and the manipulation checks for both the stress induction and the audio message. Additionally, t-tests, ANOVAs and correlational analyses were conducted to examine whether gender, Corsi test version, socioeconomic background (parents' educational level, working status and family income) and age were significantly related to pre-test working memory scores. Finally, before conducting the main analyses, it was examined whether pre-test working memory significantly varied between classrooms, which would indicate that multilevel analysis would be needed to control for children being nested within classrooms. A two-level null random intercepts model was calculated in MLwiN 2.1 (Rasbash et al., 2009), showing that there was only significant between-subject variance (σ = 0.67, SE = 0.08, χ <sup>2</sup> = 74.95, p < 0.001) and no significant between-classroom variance (σ = 0.08, SE = 0.05, χ <sup>2</sup> = 2.382, p = 0.123). Traditional analyses were thus preferred over multilevel analysis. These preliminary analyses were followed by the main analyses. Repeated measures ANOVAs were conducted to examine the effect of parent and teacher support after stress induction on changes in working memory performance. Pre- and post-test scores of the Corsi task were used as within-subject variable and condition as between-subject factor. Analyses were controlled for relevant background characteristics of the participants. Finally, additional repeated measures ANOVAs were conducted adding the quality of the parent–child and teacher–student relationship as dichotomous between-subject factors. This allowed us to examine whether the effect of the conditions depended on this relationship quality. All analyses were conducted in SPSS (IBM Corp., 2013).
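The two variance estimates from the null model can be combined into an intraclass correlation, which expresses the share of variance located between classrooms. The short calculation below assumes that the reported σ values are variance components (the source does not state this explicitly); the function name is hypothetical.

```python
def intraclass_correlation(between_variance, within_variance):
    """Proportion of total variance attributable to the higher level."""
    return between_variance / (between_variance + within_variance)


# Estimates reported for the two-level null model (assumed to be variances).
icc = intraclass_correlation(between_variance=0.08, within_variance=0.67)
print(f"ICC = {icc:.3f}")  # 0.08 / 0.75
```

Under that assumption roughly a tenth of the variance in pre-test scores lies between classrooms, an amount that was not statistically distinguishable from zero in this sample.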

## RESULTS

## Descriptives

**Table 2** shows the means and standard deviations of the scales measuring the parent–child and teacher–student relationship and of the working memory outcomes, as well as the correlations between these variables. Parent–child and teacher–student warmth were highly correlated, while a medium correlation existed between parent–child and teacher–student conflict. **Table 3** shows the descriptive statistics for the working memory outcomes in the different conditions. There are no significant differences between the conditions in pre-test scores.

As a manipulation check, after the Cyberball game children were asked how often they had received the ball (never, sometimes, often, or always). Most children indicated they received the ball sometimes (86.5%), often (8.2%) or never (3.5%) and thus experienced exclusion to some extent. However, three children (1.8%) indicated they always received the ball. These three children were removed from further analyses.

Additionally, a manipulation check was conducted to examine to what extent the children liked the audio message they received. As expected, the supportive message of the parent, teacher or stranger was liked very much (52.5, 48.8, and 35.7%, respectively) or somewhat liked (40.0, 34.9, and 45.2%) by most children. The weather report was somewhat liked (31.8%) or did not really matter (43.2%) for most children. This indicates that the supportive message was successful and positively received by the children.


TABLE 2 | Means and standard deviations of and correlations between the parent–child and teacher–child relationship scales (non-dichotomized), and working memory outcomes (n = 170).

<sup>∗</sup>p < 0.05, ∗∗p < 0.01, ∗∗∗p < 0.001.

TABLE 3 | Descriptive statistics of the working memory outcomes within and across conditions.


## Preliminary Analyses

Children's working memory performance was not related to gender. Additionally, both versions of the Corsi task could be considered parallel versions, as indicated by the lack of a significant difference in working memory score at pre-test. Finally, age was significantly correlated with the pre-test working memory score (r = 0.38; p < 0.001). Child gender and order of the Corsi tests were therefore not taken into account, whereas all analyses controlled for age effects.

With regard to children's socioeconomic background, the educational level of the primary caregiver was related to working memory at pre-test [t(147) = −4.10; p < 0.001], with children of highly educated parents performing better. Similarly, a positive relationship was found between families' monthly net income and pre-test working memory performance (Spearman ρ = 0.23, p = 0.005). Finally, the work status of the primary caregiver was related to the working memory score [F(3,145) = 3.21; p = 0.025]. Children whose primary caregiver worked full-time (M = 16.95) or stayed at home voluntarily (M = 16.24) outperformed children of parents who were unemployed or unfit for work (M = 12.00). These characteristics were added as control variables in further analyses. Educational level and work status of the second caregiver were not related to working memory.

## The Effect of Emotional Support

Using repeated measures ANOVA, the changes in working memory performance from pre- to post-test in the different conditions were tested, while controlling for age, primary caregiver education level, work status and family income. No significant time × condition interaction was found [F(3,135) = 0.85, p = 0.471], indicating that the change in working memory from pre- to post-test did not differ between the conditions.
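With only two measurement occasions and no covariates, the time × condition interaction of such a design is equivalent to a one-way ANOVA on the pre-to-post change scores. The sketch below illustrates that simplified case only; it omits the covariates used in the study, the `(pre, post)` scores are synthetic values invented for illustration, and `one_way_f` is a hypothetical helper.

```python
from statistics import mean


def one_way_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [x for group in groups for x in group]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Synthetic (pre, post) item scores per condition; NOT the study's data.
synthetic_scores = {
    "weather":  [(15, 14), (17, 17), (12, 13), (16, 15)],
    "stranger": [(14, 13), (16, 16), (13, 12), (18, 17)],
    "parent":   [(15, 15), (13, 14), (17, 16), (16, 16)],
    "teacher":  [(14, 15), (16, 16), (12, 13), (17, 17)],
}
change_scores = [[post - pre for pre, post in pairs]
                 for pairs in synthetic_scores.values()]
f_value, df1, df2 = one_way_f(change_scores)
print(f"time x condition: F({df1},{df2}) = {f_value:.2f}")
```

A non-significant F on the change scores corresponds to the pattern reported here: the pre-to-post change does not differ reliably between the four audio conditions.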

## Moderating Effect of Parent–Child and Teacher–Student Relationship Quality

Additional repeated measures ANOVAs were performed in order to examine whether the effect of emotional support on working memory was moderated by the parent–child and teacher–student relationship quality. To this end, the dichotomized warmth and conflict scales were entered as between-subject variables.

Results show changes when adding the quality of the parent–child and teacher–student relationships. First of all, the change in working memory from pre- to post-test became significant [F(1,110) = 5.80, p = 0.018, η <sup>2</sup> = 0.050], showing a small drop in working memory performance across conditions, after stress was induced.

Additionally, several relationship variables interacted with working memory performance. First, a time × parent–child conflict interaction [F(1,110) = 6.99, p = 0.009, η <sup>2</sup> = 0.060] showed that children who experienced high parent–child conflict showed a decrease in working memory performance after stress induction, while children experiencing low levels of parent–child conflict did not. Second, a significant time × teacher warmth × teacher conflict interaction was found [F(1,110) = 5.21, p = 0.024, η <sup>2</sup> = 0.045], shown in **Figure 1**. For children experiencing low levels of teacher–student conflict (**Figure 1A**), a decrease in working memory could be seen when there were low levels of teacher–student warmth, while working memory was stable when there were high levels of teacher–student warmth. When children experienced high levels of conflict (**Figure 1B**), working memory performance was stable irrespective of the levels of teacher–student warmth.

FIGURE 1 | Changes in pre-posttest working memory score for children experiencing high and low levels of teacher–student warmth in combination with low levels of teacher–student conflict (A) or high levels of teacher–student conflict (B).

Finally, two interactions were found indicating that the effect of the conditions on working memory performance depended on the quality of the child–parent interaction. First, there was a medium-sized time × condition × parent conflict interaction [F(3,110) = 2.99, p = 0.034, η <sup>2</sup> = 0.075]. As can be seen in **Figure 2**, the audio message made almost no difference when children experienced low levels of conflict with the parent (**Figure 2A**). However, when children experienced high levels of conflict with the parent, their performance decreased after hearing a supportive message from a parent or from a stranger, whereas it increased when hearing a supportive message from the teacher (**Figure 2B**). Post hoc analyses indicated that for children experiencing high levels of parent–child conflict, there were no differences in working memory performance at pre-test, while at post-test the difference between children supported by teachers and children supported by parents narrowly missed significance [t = −8.50, 95% CI = [−17.08; 0.08], p = 0.052]. For children experiencing low levels of parent–child conflict, post hoc analyses revealed no differences at either pre- or post-test. Finally, a similar result was found for parent–child warmth, with a three-way time × condition × parent–child warmth interaction [F(3,110) = 3.78, p = 0.013, η <sup>2</sup> = 0.093]. Children experiencing high levels of warmth seemed not to be affected by the different audio messages (**Figure 3B**). Children experiencing low levels of warmth from the parent experienced a negative effect of parental support, while teacher support resulted in increased working memory performance (**Figure 3A**). Post hoc analyses indicated that for children experiencing high levels of parent–child warmth, there were no differences between conditions at pre- and post-test. For children experiencing low levels of parent–child warmth, children in the teacher support condition scored significantly lower at pre-test compared to the children in the parent support condition (t = 5.24, 95% CI = [0.75; 9.74], p = 0.024) and these differences were no longer visible at post-test (t = 1.15; 95% CI = [−3.09; 5.39], p = 0.585).
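The effect sizes reported in this section can be cross-checked from the F statistics and their degrees of freedom, assuming that the reported η <sup>2</sup> values are partial eta squared (the text does not say which variant was used). The helper name below is hypothetical; the formula is the standard conversion from F to partial eta squared.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (effect, F, df_effect, df_error, reported eta squared) as given in the text.
reported_effects = [
    ("time", 5.80, 1, 110, 0.050),
    ("time x parent conflict", 6.99, 1, 110, 0.060),
    ("time x teacher warmth x teacher conflict", 5.21, 1, 110, 0.045),
    ("time x condition x parent conflict", 2.99, 3, 110, 0.075),
    ("time x condition x parent warmth", 3.78, 3, 110, 0.093),
]

recomputed = {name: round(partial_eta_squared(f, df1, df2), 3)
              for name, f, df1, df2, _ in reported_effects}
for name, _, _, _, reported in reported_effects:
    print(f"{name}: recomputed {recomputed[name]:.3f}, reported {reported:.3f}")
```

Under this assumption, all five recomputed values agree with the reported effect sizes to three decimals.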

## DISCUSSION

Previous research has shown that a positive parent–child or teacher–student affective relationship can support EFs and working memory. Whereas these previous studies were all correlational in nature, the current study attempted to experimentally demonstrate the effect of parent and teacher support on working memory performance. Additionally, this study examined whether the effect of parent and teacher emotional support can be seen as a stress-buffering effect. This is, to our knowledge, the first study that tries to uncover the reason why parents and teachers can promote working memory performance through a positive relationship.

## The Effects of Stress on Working Memory Performance

It was expected that after a stress-inducing game, children's working memory performance would decrease if they heard a neutral message afterward. In contrast to what we had expected (based on Hawes et al., 2012), there was no general negative effect of stress on working memory performance, which would have appeared as a drop in working memory in the weather report condition. As a consequence, the observed effects of emotional support from parents and teachers cannot be linked to the underlying stress mechanism that this study was trying to test.

A decrease after stress induction was observed in specific subgroups of children, namely children who experienced high levels of parent–child conflict, low levels of parent–child warmth, and low levels of teacher–student warmth, especially in combination with low levels of teacher–student conflict. Children may be differentially susceptible to stressors and this can be influenced by different factors, such as genetics (Ising and Holsboer, 2006), gender (Hawes et al., 2012) or the quality of the parent–child and teacher–student relationship (Bernier et al., 2010; Ahnert et al., 2012).

It should be noted that the neutral message may have distracted children and reduced children's stress levels even though it was used as a control condition. Alternatively, if children had not experienced the exclusion from Cyberball, they may have had an increase in working memory due to a learning effect. This means that a stable working memory performance after the Cyberball game might indicate a negative effect of stress if it were compared to a no-stress condition. In both cases the true impact of stress on working memory might be underestimated in the current design. The addition of an objective stress measure (e.g., skin conductance or a salivary cortisol measure) or a no-stress condition may help to assess the true effect of stress on working memory performance.

## Effect of Parent and Teacher Support on Working Memory Performance

When children have a positive relationship with their parent, no clear effect of parent support was found. Results do suggest that when children have a more negative relationship with their parent (low warmth, high conflict), support offered by the parent has a negative effect on working memory performance. On the other hand, support offered by teachers has a positive effect on working memory performance when children have a negative relationship with their parent. As a result, children who had a negative relationship with their parent and who heard a supportive message from the teacher outperformed or caught up with children who heard a supportive message from the parent at post-test. This indicates that teacher support might compensate for the adverse effects of a negative parent–child relationship. Such a compensating effect has previously been shown for children's behavior, with high levels of teacher warmth related to decreases in children's aggressive behavior only for children who were insecurely attached to their mother (Buyse et al., 2011). In their review, McGrath and Van Bergen (2015) indicate different explanations for the fact that a positive teacher–student relationship may compensate for other risk factors such as a negative parent–child relationship. One possibility is that when children receive adequate support from the teacher, they will form a less negative internal working model and thus have less negative beliefs about the world and the self (Buyse et al., 2011; McGrath and Van Bergen, 2015).

The lack of effect of emotional support for children who do have positive parent–child relationships may indicate that when children are used to positive stimulation from the teacher, they need a stronger reinforcement than a short audio message to see an effect on working memory performance. Another possibility is that children with negative parent–child relationships rely more on the teacher for help in regulating their stress levels and emotions and that these children are more easily affected by positive support from their teacher (McGrath and Van Bergen, 2015). This result is in line with broader research indicating limited effects of teacher–child relationships on children's behavior when there is already a positive parent–child relationship (e.g., Buyse et al., 2011). In this respect our results support the academic-risk hypothesis (Hamre and Pianta, 2001), which states that the quality of teacher–child relationships is most important for those children at risk for negative school adjustment, because they have more to gain or to lose than other students (Roorda et al., 2011).

Finally, the negative effect of parent support for children with negative parent–child relationships was an unexpected finding that warrants some attention. This might be explained by the fact that children build internal working models of attachment, mental schemes containing information about social relationships, based on experiences with early attachment figures (Dykas and Cassidy, 2011). Children use these internal working models to store information about previous social experiences and to form expectations about what future social experiences will be like. When children do not have a positive relationship with their parent, they are likely to form an insecure attachment script or a negative internal working model. As a result, they are more likely to interpret social information, such as an audio message from the parent, in a negative way or to ignore it completely (Dykas and Cassidy, 2011). Also, children who have a negative bond with their parent in general respond to distressing situations with maladaptive coping strategies, which can further enhance negative feelings that are already present (Grossmann and Grossmann, 1991). Hearing a supportive message from the parent may thus have further increased children's stress levels.

An important note should be made with regard to the impact of the observed effects. Children who experience high parent–child conflict can process one additional item in working memory after hearing a supportive message from their teacher. In developmental research examining growth in working memory, such an increase corresponds to approximately 2 years of development (Alloway, 2011). Although effect sizes indicate small to medium effects, it should thus be taken into account that in practice the impact of the environment is substantial and might have considerable implications for children's learning.

## Strengths and Limitations

The current study contributes to the literature in several ways. First, whereas previous studies had established relationships between parent–child and teacher–student relationship quality and working memory performance, none of the previous studies has done so in an experimental design. The current study is therefore the first that can show a causal effect of parent and teacher emotional support on working memory performance. Second, research examining the parent and teacher influences on EF has evolved independently and it was therefore previously unclear what the relative contribution of both is. The current study showed that parent and teacher influences interact with each other.

Some limitations of the current study warrant attention when interpreting the findings. First, the main limitation is that, due to the lack of a no-stress condition or an objective stress measure, the effect of stress on working memory is hard to interpret. As a consequence, we cannot link the effect of emotional support from parents or teachers directly to children's stress levels. Based on previous research it is assumed that the Cyberball manipulation provides mild distress (Abrams et al., 2011; Kelly et al., 2012), though this did not clearly emerge in the current design. During the experiment, large differences were observed in children's responses to the stress induction. Objective stress measures (e.g., skin conductance or salivary cortisol) and a no-stress condition would be helpful in directly linking the parent and teacher support to the proposed stress mechanism. However, irrespective of the lack of a clear stress effect, there are clear effects of emotional support on working memory performance, which is on its own a new and important insight when examining the role of parents and teachers in children's EF performance and development. Second, it should be noted that although a limited number of statistical models were run in the current study, this did result in multiple individual tests. The results should thus be interpreted with caution and p-values should always be interpreted in combination with effect sizes. Third, the current study examined the acute effect of stress induction and parent and teacher support on working memory performance. Questions remain about whether parent and teacher support have effects in the long run through the buffering of the negative effects of stress on working memory. Finally, it should be noted that although the current study points out the importance of parents and teachers as safe havens, this does not exclude other potential mechanisms through which parents and teachers can influence working memory performance and development. Future research should therefore also consider the role of, for example, children's increased exploration of the environment (parent and teacher as secure base; O'Connor and McCartney, 2007) and modeling (Rimm-Kaufman et al., 2009), direct stimulation (McNamara and Scott, 2001; Morrison and Chein, 2011) or scaffolding (Bibok et al., 2009; Hughes, 2011) by both parents and teachers.

## CONCLUSION

The current study shows that parents and teachers can have a substantial influence on children's working memory performance by offering adequate emotional support. Although further research is needed to examine the underlying mechanisms of these effects, this finding confirms the idea that cognitive processes, such as working memory, do not merely depend on maturation, but can also be supported or hindered by environmental factors. Both clinicians (e.g., those providing working memory training) and teachers should thus not only pay attention to the cognitive stimulation of children, but should also recognize the importance of affective factors, such as the affective quality of relationships with significant others. Being attentive to the emotional environment in which children grow up might be an important element that can complement current efforts in the prevention of, and intervention for, working memory problems.

## AUTHOR CONTRIBUTIONS

LV was responsible for the preparation of the study, data collection and the writing of the current manuscript. JS designed the study, supported the data collection and gave feedback on the analysis, the interpretation of the data and the current manuscript on multiple occasions. KV supported the data collection and gave feedback on the analysis, the interpretation of the data and the current manuscript on multiple occasions. DB supported the data collection and gave feedback on the analysis, the interpretation of the data and the current manuscript on multiple occasions.

## REFERENCES


## ACKNOWLEDGMENTS

The authors thank the master's students who aided in the data collection for the study described in this manuscript.

### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpsyg.2017.00512/full#supplementary-material



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Vandenbroucke, Spilt, Verschueren and Baeyens. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

## Executive Function in Adolescence: Associations with Child and Family Risk Factors and Self-Regulation in Early Childhood

Donna Berthelsen<sup>1</sup>\*, Nicole Hayes<sup>1,2</sup>, Sonia L. J. White<sup>1</sup> and Kate E. Williams<sup>1</sup>

<sup>1</sup> School of Early Childhood and Inclusive Education, Queensland University of Technology, Brisbane, QLD, Australia, <sup>2</sup> Mater Research Institute, University of Queensland, Brisbane, QLD, Australia

Executive functions are important higher-order cognitive skills for goal-directed thought and action. These capacities contribute to successful school achievement and lifelong wellbeing. The importance of executive functions to children's education begins in early childhood and continues throughout development. This study explores contributions of child and family factors in early childhood to the development of executive function in adolescence. Analyses draw on data from the nationally representative study, Growing up in Australia: The Longitudinal Study of Australian Children. Participants are 4,819 children in the Kindergarten Cohort who were recruited at age 4–5 years. Path analyses were employed to examine contributions of early childhood factors, including family socio-economic position (SEP), parenting behaviors, maternal mental health, and a child behavioral risk index, to the development of executive function in adolescence. The influence of children's early self-regulatory behaviors (attentional regulation at 4–5 years and approaches to learning at 6–7 years) was also taken into account. A composite score for the outcome measure of executive function was constructed from scores on three Cogstate computerized tasks for assessing cognition, which measured visual attention, visual working memory, and spatial problem-solving. Covariates included child gender, age at assessment of executive function, Aboriginal and Torres Strait Islander status, speaking a language other than English at home, and child's receptive vocabulary skills. There were significant indirect effects involving child and family risk factors measured at 4–5 years on executive function at age 14–15 years, mediated by measures of self-regulatory behavior. Child behavioral risk, family SEP and parenting behaviors (anger, warmth, and consistency) were associated with attentional regulation at 4–5 years which, in turn, was significantly associated with approaches to learning at 6–7 years. Both attentional regulation and approaches to learning were directly associated with executive functioning at 14–15 years. These findings suggest that children's early self-regulatory capacities are the basis for later development of executive function in adolescence, when capabilities for planning and problem-solving are important to achieving educational goals.

Keywords: early childhood, parenting, self-regulation, executive function, attention regulation, approaches to learning, adolescence

#### Edited by:

Dieter Baeyens, KU Leuven, Belgium

#### Reviewed by:

Joana Cadima, University of Porto, Portugal

Annemie Desoete, Ghent University, Belgium

#### \*Correspondence:

Donna Berthelsen d.berthelsen@qut.edu.au

#### Specialty section:

This article was submitted to Developmental Psychology, a section of the journal Frontiers in Psychology

Received: 27 February 2017 Accepted: 16 May 2017 Published: 02 June 2017

#### Citation:

Berthelsen D, Hayes N, White SLJ and Williams KE (2017) Executive Function in Adolescence: Associations with Child and Family Risk Factors and Self-Regulation in Early Childhood. Front. Psychol. 8:903. doi: 10.3389/fpsyg.2017.00903

## INTRODUCTION

Young people who make a successful transition to secondary school, in terms of academic and social adjustment, are also likely to be on track for successful school completion. Currently, there is significant research interest in the contributions of self-regulation and executive function to school achievement for children and adolescents (Blair and Diamond, 2008; Best et al., 2011; Blair and Raver, 2015; Jacob and Parkinson, 2015). The contribution of these abilities to later developmental outcomes is increasingly understood through integration of knowledge across the neurosciences and developmental psychology (Zhou et al., 2012; Diamond, 2013). Executive function, the specific outcome of interest in these analyses, can be defined as higher-order cognitive abilities which are important in goal-directed behavior and which are associated with brain functioning in the prefrontal cortex (Miller and Cohen, 2001; Dumontheil, 2016). Research on the development of executive functions across childhood and adolescence has delivered broad understandings about brain-behavior relationships. This includes knowledge about how different components of executive function mature at different rates and how specialization of brain structure and function in adolescence enables more effective and efficient executive functioning (Davidson et al., 2006). The analyses presented in this paper explore relations between young children's early family experiences and the self-regulatory behaviors of attentional regulation and approaches to learning, and the development of executive function in mid-adolescence.

Adverse life experiences affect the development of self-regulation and executive function across childhood and adolescence (McEwen and Gianaros, 2010; Sheridan et al., 2012). For example, childhood disadvantage has been found to predict deficits in cognitive processes through the neurological effects of chronic stress (Blair et al., 2011; Evans and Fuller-Rowell, 2013). The experience of chronic stress shapes subsequent stress response physiology in children, leading to higher levels of reactivity and negatively impacting brain development affecting self-regulation and executive function (Evans, 2003). Across early childhood, brain structure and function develop rapidly as children begin to face higher demands for self-regulatory behavior, especially when they make the transition to school (Ursache et al., 2012; Zelazo and Carlson, 2012). Overall, there is increasing knowledge that early life conditions associated with disadvantage affect the development of children's cognitive processing through childhood and adolescence (Hackman and Farah, 2009; Hackman et al., 2010, 2015).

Early childhood is an optimal period in which early interventions may deliver greater social and individual benefits for long-term development (Heckman, 2006). The early identification of children for whom there are developmental concerns about regulation of behavior, including executive function, is an important research and policy concern across national contexts. For example, since 2009, the Australian Government has conducted a triennial national census of children's developmental competencies in the first year of school. The Australian Early Development Census (AEDC; Australian Government, 2016) provides national indicators across developmental domains in which self-regulatory behaviors are included. The census identifies the number of children in communities who are 'vulnerable,' 'developmentally at risk,' or 'on track' in language and cognitive skills, communication and general knowledge, physical health and wellbeing, social competence, and emotional maturity. In 2015, it was found that 1 in 5 Australian children were vulnerable in one or more developmental domains and differences in vulnerability were apparent for children with different demographic profiles. This national policy recognizes the importance of readiness to learn when children begin school. It is important that children acquire the necessary skills for cognitive and emotional control in order to become successful learners through the school years (Duncan et al., 2017).

## Self-Regulatory Development during Early Childhood

In these analyses, measures of attentional regulation and approaches to learning, both behaviors associated with self-regulation, are included as possible mediating variables in exploring the longitudinal relations between early childhood disadvantage and family risk factors and adolescent executive function. From a neurological perspective, abilities to control and direct attention that develop across infancy and childhood are the basis of self-regulation (Rothbart et al., 2011; Petersen and Posner, 2012). Increased rapprochement between theories of attentional development and theories of temperament has advanced conceptualizations about the development of self-regulation. Through infancy, there is a transition from attentional reactivity to more voluntary attentional control (Rueda et al., 2004). From 4 to 6 years, increased maturation of the prefrontal cortex provides increased connectivity between neural networks as the basis for attentional regulation. Reactivity and selective attention comprise a dynamic system between the individual's biological propensities to react and the exercise of attentional control (Ristic and Enns, 2015).

Attentional regulation includes capacities to selectively attend to specific stimuli, inhibit prepotent responses, and monitor actions (Petersen and Posner, 2012). Attentional regulation enables individuals to focus on relevant information to achieve important goals. When children begin school, there are higher demands on attentional regulation and impulse control. These qualities are linked to children's early academic competence (McClelland et al., 2007; Nesbitt et al., 2013; Blair and Raver, 2015). Williams et al. (2016b) reported that early attentional regulation, prior to school and at school entry, was linked to math achievement at 8–9 years. Longer-term effects of early attention regulation on educational outcomes have been reported by McClelland et al. (2013), who found that attention span-persistence at age 4–5 years was predictive of math and reading achievement at age 21 years and college completion at 25 years.

'Approaches to learning' has been used as a descriptive term for children's early self-regulatory skills in the classroom. The construct, approaches to learning (Kagan et al., 1995), has been used in research to describe and measure learning-related, regulatory behaviors that children exhibit when taking part in classroom activities. These behaviors include attention, initiative, persistence, and engagement (Li-Grining et al., 2010; Bulotsky-Shearer et al., 2011; Sasser et al., 2015). If children begin school with behaviors that support engagement, effort, and active participation, successful academic outcomes are much more likely (Fantuzzo et al., 2005; Ziv, 2013).

## Executive Function in Adolescence

The outcome measure in these analyses is executive function which is conceptualized as a single executive control mechanism accounting for high-order thinking. While other areas of the brain are now also implicated in executive functioning, Miller and Cohen (2001) assumed that areas of the prefrontal cortex, associated with executive function, served a particular function to support:

> the active maintenance of patterns of activity that represent goals and the means to achieve them. They provide bias signals throughout much of the rest of the brain, affecting not only visual processes but also other sensory modalities, as well as systems responsible for response execution, memory retrieval, and emotional evaluation, etc. The aggregate effect of these bias signals is to guide the flow of neural activity along pathways that establish the proper mappings between inputs, internal states, and outputs needed to perform a given task (p. 171).

Anderson (2003) noted that, while executive function may be conceptualized as a single central control mechanism, it is also understood as involving multiple processing systems that are inter-related and inter-dependent. Miyake et al. (2000) investigated the internal factorial structure of executive function across nine tasks and documented three distinct but overlapping components of executive function (response inhibition, updating working memory, and set shifting). This three-component model has been an influential framework in developmental studies, although in the neurosciences there are broader conceptualizations. In a systematic review of the research literature, Packwood et al. (2011) mapped 68 components of executive function described across 60 studies. Using latent semantic analysis and hierarchical cluster analysis, these researchers identified 18 components that, in turn, represented five sets of complex executive functions involving planning, working memory, set-shifting, inhibition, and fluency.

Adolescence is a period of development that begins at the onset of puberty and spans the second decade of life (Blakemore et al., 2010). While magnetic resonance imaging techniques have found that total brain volume reaches adult levels by puberty (Dumontheil, 2016), brain functions continue to develop and show age-related improvements and differentiation of functions through neural specialization (Luna et al., 2015). Through maturational processes in adolescence, brain processing is seen to become more efficient and effective, despite some recognized vulnerabilities specific to adolescence related to risky behaviors associated with emotional control (Steinberg, 2008). Attentional skills and working memory mature further across adolescence as more complex skills evolve that enable performance monitoring, feedback learning and relational reasoning (Crone and Dahl, 2012). Increased capabilities to integrate more contextual information from experience are also evident in adolescence which permit increased cognitive flexibility for decision-making in accomplishing novel tasks (Steinbeis and Crone, 2016).

## Ecological and Child Factors Influencing the Development of Executive Function

Socio-economic disparities in the measured qualities of executive functions emerge in infancy and across early childhood (Noble et al., 2007; Hackman and Farah, 2009; Blair et al., 2011; Rhoades et al., 2011; Raver et al., 2013) as well as in neurological studies of brain structure and function (Sheridan et al., 2012; Noble et al., 2015). It is less clear if socio-economic disparities in neurological function that have emerged in childhood are maintained over time or if effects are attenuated when children begin school or if family socio-economic circumstances change (Hackman et al., 2015; Duncan et al., 2017).

These analyses consider early family risk factors of maternal mental health, parenting behaviors, and child early behavioral risk as possible influential processes on the development of executive function. A substantial literature has documented links between economic disadvantage and heightened parental depression (Lorant et al., 2003) that, in turn, can impact on parenting and children's development (Olson et al., 2011). In a review of previous research, Fay-Stammbach et al. (2014) identified four dimensions of parenting that may impact on the development of executive function: parental home stimulation to support child learning; maternal support and autonomy; parental sensitivity (versus hostility); and control and discipline strategies. Parenting may also be affected by child characteristics, including gender and temperament. Belsky et al. (2007) and Belsky and Pluess (2009) proposed that children differ in their sensitivity to environmental contexts and that some children are more reactive to either positive or negative environments, which impacts on their behavioral responses. Emerging evidence on such differential susceptibility provides some support that heightened child reactivity can also add stress to the family environment (Raver et al., 2013; Obradovic et al., 2016).

Child behaviors associated with poorer self-regulation at 4–5 years include sleep problems, emotional dysregulation, and inattention/hyperactivity. Early childhood behavioral sleep problems have been linked with poorer attentional regulation (Williams and Sciberras, 2016; Williams et al., 2017), poorer executive function development over time (Bernier et al., 2013), and poorer academic functioning (Quach et al., 2009). A recent analysis found that at 4–5 years, children with unresolved behavioral sleep problems, combined with above average levels of emotional dysregulation and poor attention, were at higher risk for poor school adjustment (Williams et al., 2016a). Taken together, these findings suggest a link between these early problem behaviors and self-regulation and executive function development over time. Two potential mechanisms, or a combination of both, may underpin this link. First, these early problem behaviors may signal an underlying neurological vulnerability for poor self-regulatory functioning. Second, responses by caregivers that fail to resolve early behavioral sleep issues and support positive self-regulation may result in an exacerbation of these problems across childhood. Early sleep problems lead to emotional dysregulation which impacts on attentional regulation, disrupting the development of important brain structures that support executive function (Williams et al., 2017).

## The Current Study

The current study considers the influence of a range of early childhood and family risk factors on the development of executive function in adolescence. While much is known about the impact of family risk on the development of self-regulation and executive function through early childhood, there are fewer studies that have considered how early ecological risk factors and early self-regulatory skills, such as attentional regulation and approaches to learning, may influence the longer-term development of executive function in adolescence.

First, path models are developed to explore the direct effects of family socio-economic circumstances, child behavior problems, and maternal parenting behaviors of anger, warmth and consistency, when children are aged 4–5 years, on executive function at 14–15 years. Second, an indirect effects model is developed to examine associations between early ecological risk and executive function in adolescence, through children's level of attentional regulation at age 4–5 years and their approaches to learning at 6–7 years, when children begin school.
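The logic of an indirect effect through two sequential mediators can be illustrated with a minimal sketch. It assumes a product-of-coefficients approach estimated with ordinary least squares on simulated data; the variable names, the simulated path values, and the use of NumPy are illustrative assumptions and do not reproduce the models or estimation procedure of the current study.

```python
import numpy as np


def ols_slopes(y, *predictors):
    """Least-squares slopes of y on the predictors (intercept fitted, then dropped)."""
    design = np.column_stack([np.ones(len(y)), *predictors])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta[1:]


# Simulated data only: the path values below are invented for illustration.
rng = np.random.default_rng(0)
n = 5000
risk = rng.normal(size=n)                                    # early risk factor
attention = -0.4 * risk + rng.normal(size=n)                 # mediator 1 (age 4-5)
learning = 0.5 * attention + rng.normal(size=n)              # mediator 2 (age 6-7)
ef = 0.3 * learning + 0.2 * attention + rng.normal(size=n)   # outcome (age 14-15)

(a,) = ols_slopes(attention, risk)                  # risk -> attention
d, _ = ols_slopes(learning, attention, risk)        # attention -> learning, given risk
b, _, direct = ols_slopes(ef, learning, attention, risk)  # learning -> EF; last slope is the direct path

# Serial indirect effect: risk -> attention -> learning -> EF
serial_indirect = a * d * b
print(f"serial indirect effect: {serial_indirect:.3f}, direct effect: {direct:.3f}")
```

In this sketch the direct path from the risk factor to the outcome is simulated as zero, so any association between the two runs entirely through the mediators. A full path analysis would also include covariates and appropriate standard errors for the indirect effects.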

## MATERIALS AND METHODS

These analyses use data from Growing Up in Australia: The Longitudinal Study of Australian Children (LSAC) which commenced in 2004. This cohort study tracks a nationally representative sample of Australian children. It is funded by the Australian Government through a partnership between the Department of Social Services, Australian Institute of Family Studies, and Australian Bureau of Statistics. Ethics approval for the conduct and processes within the study is granted by the Australian Institute of Family Studies Ethics Committee. Detail on LSAC study design, sample information, and implementation is reported in a range of sources (Sanson et al., 2002; Soloff et al., 2005; Gray and Smart, 2009; Edwards, 2012).

The Longitudinal Study of Australian Children employs a cross-sequential longitudinal design to follow two cohorts of approximately 5,000 children, aged 0–1 years and 4–5 years. A two-stage clustered sampling design was used to recruit children into the study. Across Australia, 330 postcodes were randomly selected and children for both cohorts were randomly selected from these postcodes. Stratification was used to ensure the number of children in each state/territory and within and outside each capital city was proportionate to the population of children in these areas, except for remote and very remote communities. The sampling frame was derived from the Medicare Australia database held by the Health Insurance Commission which administers this universal health insurance scheme. In 2004 when LSAC commenced, more than 90% of all children born were likely to be registered on the Medicare database by 4 months and 98% by 12 months. Primary data collection occurs through biennial home visits and the study participants include the child, parents (resident and nonresident), and teachers. In these analyses, data are utilized from Wave 1 (2004) when children were 4–5 years old, Wave 2 (2006) when children were 6–7 years old, and Wave 6 (2014), when children were 14–15 years old.
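The two-stage selection described above (clusters first, then children within the selected clusters) can be sketched as follows. This is a simplified illustration on a made-up sampling frame: it omits the stratification by state/territory and capital city, and the function name, postcode labels, and sample sizes are hypothetical rather than taken from the LSAC design documents.

```python
import random


def two_stage_sample(children_by_postcode, n_postcodes, n_per_postcode, seed=0):
    """Stage 1: randomly select postcodes. Stage 2: randomly select children within each."""
    rng = random.Random(seed)
    chosen_postcodes = rng.sample(sorted(children_by_postcode), n_postcodes)
    selected = {}
    for postcode in chosen_postcodes:
        pool = children_by_postcode[postcode]
        # Take every child if the postcode has fewer children than requested.
        selected[postcode] = rng.sample(pool, min(n_per_postcode, len(pool)))
    return selected


# Hypothetical frame: 50 postcodes with 20 children each.
frame = {
    f"postcode_{i:03d}": [f"child_{i:03d}_{j:02d}" for j in range(20)]
    for i in range(50)
}
sample = two_stage_sample(frame, n_postcodes=5, n_per_postcode=4)
print({postcode: len(children) for postcode, children in sample.items()})
```

Because children within the same postcode tend to resemble one another, analyses of data collected this way typically need to account for the clustering.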

## Sample Selection for Current Study

The current analyses include participants from the 4,983 families initially recruited for the Kindergarten Cohort (4–5 years) in 2004. The current analytic sample was restricted to families for whom the primary parent interviewed at Wave 1 was female and was a biological or adoptive parent. The resultant sample size was 4,819 children and families.

#### Child Characteristics

Of the children, 49.1% (n = 2,365) were female; mean age at Wave 1 was 57 months (SD = 2.64); 3.6% (n = 175) had Aboriginal or Torres Strait Islander status; and 12.3% (n = 595) spoke a language other than English at home. Children in the selected sample were slightly younger at each wave of data collection than children in excluded families.

#### Maternal Characteristics

Of the mothers, 2.8% (n = 133) had Aboriginal or Torres Strait Islander status and 15.4% (n = 742) had a non-English speaking background. At Wave 1, when children were 4 years old, mothers ranged in age from 19 to 52 years with a mean age of 34.6 years. Overall, 41% of mothers had not completed high school and 44.4% had completed a tertiary degree of at least Bachelor level. Compared with the full Kindergarten cohort sample, mothers in the analysis sample were slightly less likely to be Aboriginal or Torres Strait Islander or to speak a non-English language at home, and on average had a slightly higher socioeconomic position (SEP) at Wave 2 data collection.

#### Measures

At Wave 1, when the child was 4–5 years, parental data were from in-home interviews and self-complete questionnaires. Ecological risk measures are: family SEP, child behavior risk index, maternal mental health, and self-report measures for parenting anger, warmth, and consistency. Covariates in the analyses included child sex, age at assessment of executive function, Aboriginal or Torres Strait Islander status, language other than English at home, and a score on a receptive vocabulary measure at age 4–5 years. Additionally, a parent-reported measure for child attentional regulation at age 4–5 years and a teacher-report measure on approaches to learning when children were 6–7 years old were included. From Wave 6, when children were 14–15 years old, data were included from a direct child assessment for executive function using a composite measure derived from three computerized tasks.

#### Socio-Economic Position

Socio-economic position is a derived variable within the LSAC dataset that combines parental report for socio-demographic items for the child's household: parental occupational prestige, parental education level, and household income (Blakemore et al., 2009). It is weighted according to household composition (e.g., single-parent household; two-parent household). It has an approximate mean of zero and a standard deviation of one. Higher scores indicate higher family SEP.

#### Child Behavior Risk Index

This index was the sum of dichotomized scores on three measures: sleep problems (0 = no; 1 = yes), emotional dysregulation (0 = no; 1 = yes), and inattention/hyperactivity symptoms (0 = no; 1 = yes).


#### Maternal Mental Health

The Kessler K6 measure, used to assess psychological symptoms, was developed for the United States National Health Interview Survey (Kessler et al., 2002). Mothers rated six items about their current psychological well-being across the previous 4 weeks: nervous; hopeless; restless or fidgety; everything was an effort; so sad that nothing could cheer you up; and worthless. Items were rated on a 5-point scale (1 = all of the time to 5 = none of the time). An overall score was calculated by summing and averaging the total score resulting in a score ranging from zero to five (α = 0.84). Higher scores indicate poorer mental health.

#### Parenting Anger

Anger was measured using four items adapted from the National Longitudinal Survey of Children and Youth (Statistics Canada, 2000). Mothers rated their feelings of anger or frustration toward the child (e.g., How often are you angry when you punish this child?) on a 5-point scale (never or almost never, rarely, sometimes, often, always or almost always).

#### Parenting Warmth

Warmth was measured using six items from the Child Rearing Questionnaire (Paterson and Sanson, 1999). Mothers rated their expression of physical affection and enjoyment of the child (e.g., How often do you have warm, close times together with this child?) on a 5-point scale (never or almost never, rarely, sometimes, often, always or almost always).

#### Parenting Consistency

Consistency was measured using four items adapted from the National Longitudinal Survey of Children and Youth 1998–1999 (Statistics Canada, 2000). Mothers rated the extent to which they followed through with behavioral consequences for the child (e.g., How often does this child get away with things that you feel should have been punished?; reverse coded). Items were rated on a 5-point scale (1 = never/almost never to 5 = all the time).

For each of the three parenting constructs, a weighted score was used in the analyses computed from the proportionally adjusted factor score regression weights reported in the LSAC Parenting Measures Technical Report (Zubrick et al., 2014). Higher scores indicate higher maternal anger, warmth, and consistency, respectively.

#### Attentional Regulation (4–5 years)

At Wave 1 data collection, parents completed four items from the persistence subscale of the Short Temperament Scale for Children (Fullard et al., 1984). Items (e.g., When this child starts a project such as a puzzle he/she works on it until it is completed even if it takes a long time) are rated on a 6-point scale (1 = almost never to 6 = almost always). The scores on this scale were summed to create a total score (α = 0.78) with higher scores indicating stronger attentional regulation skills.

#### Approaches to Learning (6–7 years old)

At Wave 2 data collection, teachers completed six items from a subscale of the Social Skills Rating System (SSRS) (Gresham and Elliott, 1990). The response scale ranges from 1 = never to 4 = very often. The items rate children's attentiveness, task persistence, eagerness to learn, learning independence, flexibility, and organization. The scale score was the mean of the six items (α = 0.92), with higher scores indicating more positive approaches to learning.

#### Executive Function (14–15 years)

Three computer-based tasks from the Cogstate Assessment Battery (Cogstate, n.d.) were completed by the LSAC study child during the in-home interview at Wave 6 data collection. LSAC interviewers were trained to deliver the tasks from Cogstate protocols. Participants are encouraged to work as quickly as they can and be as accurate as possible.

• The **Identification task** is a choice reaction time task that measures visual attention across multiple trials. The subject is required to decide as quickly as possible whether a playing card that is presented face up on the screen is red (YES button) or not (NO button). The cards displayed are either red or black joker playing cards and 30 trials are completed within approximately 2 min. The primary outcome measure is speed of performance, calculated by computing the mean of the log10 transformed reaction time for each correct trial response.


#### Covariates Included in the Analyses

Covariates in the analyses were child gender (0 = male, 1 = female); child age in months (at 14–15 years data collection; Wave 6); Aboriginal or Torres Strait Islander status (ATSI; 0 = no, 1 = yes); language other than English at home (LOTE; 0 = no, 1 = yes); and a continuous measure of receptive vocabulary assessed when the child was 4–5 years of age, using an adapted version of the Peabody Picture Vocabulary Test (PPVT-III; Dunn and Dunn, 1997) developed for LSAC (Rothman, 2005).

## Data Analysis

#### Executive Function Scoring

Data that did not meet completion or integrity checks on any task were treated as missing data. The Identification and One-Back tasks required participants to complete 75% of test trials to receive a score. On the Groton Maze Task, all five trials were required to be completed. Performance integrity was based on an accuracy score for the Identification and One-Back tasks. Accuracy of performance was computed by taking the arcsine square root of the proportion of correct responses for each task (Integrity failure: Identification task = > 80% of trials; One-Back task = > 70%). For the Groton Maze task, performance integrity failure was defined as >120 errors. An additional filter was also applied to the data for each task, in which scores more than three standard deviations below or above the mean were excluded. A composite score for executive function was constructed using the three measures, following procedures described in Maruff et al. (2013). For each task, the mean and standard deviation were computed and the scores standardized. A composite score was computed by averaging the standardized scores for the three tasks; re-standardized using the mean and SD for the composite score; transformed once more so that each had a mean of 100 and a standard deviation of 10; and multiplied by −1 so that higher scores indicated more competent performance. If data on any individual task were missing, the composite score was not computed.
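The composite scoring steps described above can be illustrated with a minimal Python sketch. It is not the authors' or Cogstate's code: the function names `standardize` and `composite_ef` and the example values are hypothetical, and the completion, integrity, and three-standard-deviation filters are assumed to have been applied already (failed scores appear as `None`). The sketch also assumes that the sign reversal is applied before rescaling, so that the final scores keep a mean of 100.

```python
from statistics import mean, stdev


def standardize(values):
    """Z-score the non-missing values; missing entries (None) stay None."""
    present = [v for v in values if v is not None]
    m, s = mean(present), stdev(present)
    return [None if v is None else (v - m) / s for v in values]


def composite_ef(identification, one_back, maze):
    """Hypothetical composite: one score per child, None if any task is missing.

    Each argument is a list of per-child task scores on which lower values
    mean better performance (reaction time or error count).
    """
    # Step 1: standardize each task separately.
    z_tasks = [standardize(task) for task in (identification, one_back, maze)]

    # Step 2: average the three z-scores, but only for complete cases.
    averaged = []
    for z_triplet in zip(*z_tasks):
        if any(z is None for z in z_triplet):
            averaged.append(None)
        else:
            averaged.append(mean(z_triplet))

    # Step 3: re-standardize the averaged score.
    restandardized = standardize(averaged)

    # Step 4 (assumed order): reverse the sign so higher = better,
    # then rescale to mean 100 and SD 10.
    return [None if z is None else 100 + 10 * (-z) for z in restandardized]


# Hypothetical data for five children; the fourth is missing one task.
identification = [2.70, 2.65, 2.80, None, 2.75]
one_back = [2.90, 2.85, 3.00, 2.95, 2.88]
maze_errors = [50, 40, 70, 60, 55]
print(composite_ef(identification, one_back, maze_errors))
```

In this sketch a child with a missing task score receives no composite, mirroring the rule that the composite was not computed when any individual task was missing.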

#### Missing Data

The degree of missing data varied by data collection wave as well as by the method used for data collection. Variables collected at Wave 1 using the parent self-complete questionnaire (i.e., measures of emotional dysregulation, inattention/hyperactivity symptoms, maternal mental health, and attentional regulation) had up to 16% of cases with missing data. At Wave 2, the measure on the teacher questionnaire, approaches to learning, had 27% of cases with missing data (38% of these because of participant dropout between Wave 1 and Wave 2; 62% due to teacher non-response). The composite measure for executive function had 45% of cases with missing data (64% of these because of participant dropout between Wave 1 and Wave 6; 36% due to incomplete data). Cases with complete data across all study variables represented a non-random sample of the complete sample for the Kindergarten Cohort: at Wave 1, families had a higher SEP, F(1,4801) = 126.31, p < 0.001; were less likely to be Aboriginal or Torres Strait Islander, χ²(1, N = 4917) = 27.43, p < 0.001; or have language other than English at home, χ²(1, N = 4819) = 68.68, p < 0.001; at Wave 6, children were slightly older than the children with incomplete data, F(1,3434) = 9.89, p < 0.01.

Although missingness was related to the identified sociodemographic variables, the data were assumed to be missing at random (MAR), that is, not systematically related to the variable value that could have been provided, at least for the substantive variables of interest (Enders, 2010). Multiple imputation in Mplus, Version 7 (Muthén and Muthén, 1998–2012) was employed to create 40 imputed datasets in line with the recommended number for the level of missing data in this study (Graham et al., 2007). The imputation model used all the variables included in the current analyses, as well as a range of auxiliary variables, including additional sociodemographic information (maternal cultural background; SEP at Wave 6 data collection; child age in months across all six waves of data collection); maternal-reported Attentional Regulation at age 6, 8, 10, 12, and 14–15 years; teacher-report data on the measure of Approaches to Learning at age 8, 10, and 12 years; SDQ hyperactivity/inattention symptoms at 6, 8, 10, 12, and 14–15 years; and teacher ratings of the child's literacy achievement at age 14–15 years (using scores on the Academic Rating Scale, National Center for Educational Statistics, 2002). All results presented here are pooled results across the 40 imputed datasets, achieved through the TYPE = IMPUTATION analysis available in Mplus Version 7. The analytic models were also run with the non-imputed dataset and there were no substantial differences in findings.
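Pooling across imputed datasets conventionally follows Rubin's rules: the pooled estimate is the mean of the per-dataset estimates, and its variance combines the average within-dataset variance with the between-dataset variance. The Python sketch below illustrates that general technique only. It is not Mplus's internal code, and the function name `pool_rubin` and the example numbers are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def pool_rubin(estimates, standard_errors):
    """Pool one parameter across m imputed datasets using Rubin's rules.

    Returns the pooled estimate and its pooled standard error.
    """
    m = len(estimates)
    pooled_estimate = mean(estimates)
    # Within-imputation variance: average of the squared standard errors.
    within = mean(se ** 2 for se in standard_errors)
    # Between-imputation variance: sample variance of the estimates.
    between = variance(estimates)
    # Total variance inflates the between component by (1 + 1/m).
    total = within + (1 + 1 / m) * between
    return pooled_estimate, sqrt(total)


# Hypothetical path coefficient estimated in three imputed datasets.
estimate, se = pool_rubin([0.10, 0.12, 0.08], [0.02, 0.02, 0.02])
print(estimate, se)
```

With 40 imputed datasets, as used in this study, the inflation factor (1 + 1/m) is close to one, so the pooled variance is approximately the sum of the within and between components.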

#### Analytic Approach

Path analyses were used to estimate the direct and indirect effects of hypothesized causal relationships among the variables of interest using Mplus Version 7. **Model 1** was an unadjusted direct effects model that examined the direct effects of ecological risk variables when children were 4–5 years (i.e., SEP; child behavioral risk index; maternal mental health; maternal parenting – anger, warmth, consistency) on executive function at age 14–15 years. **Model 2** was a fully adjusted direct effects model that included paths from each covariate (child gender; child age in months at 14–15 years; Aboriginal or Torres Strait Islander status; language other than English at home; and child PPVT at 4–5 years of age) to the outcome variable of executive function. For **Model 3**, all direct and indirect paths were modeled simultaneously. This was a fully adjusted, indirect effects model which included the mediating variables of child attentional regulation (at age 4–5 years) and approaches to learning (at 6–7 years) on relations between early ecological risk and adolescent executive function. In this model, covariates were also assessed in relation to the outcome measure of adolescent executive function (as per Model 2), and each of the mediating variables introduced in Model 3.

Model fit was assessed by three indices: the χ² test, the root mean square error of approximation (RMSEA), and the Comparative Fit Index (CFI). Multiple indices of fit were examined because the chi-square overall goodness-of-fit test statistic is adversely affected by a large sample size (Byrne, 2012). Therefore, a range of other fit indices are usually included to assess model fit (Bentler, 2007). For the CFI, cut-off values close to or higher than 0.95 have been suggested when using continuous data (Hu and Bentler, 1999). The RMSEA is an absolute fit index which is sensitive to the number of parameters estimated in the model (Steiger, 2009), and the recommended cut-off value for RMSEA is close to, or lower than, 0.06.
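For readers unfamiliar with these indices, the sketch below shows the commonly used textbook formulas for RMSEA and CFI in Python. It is an illustration under stated assumptions, not the computation Mplus performs: software packages differ on whether N or N − 1 appears in the RMSEA denominator, the function names are hypothetical, and the sample size and baseline-model values in the example are placeholders rather than figures from this study.

```python
from math import sqrt


def rmsea(chi2, df, n):
    """Root mean square error of approximation (N - 1 variant).

    A just-identified model (df = 0) is returned as 0 by convention.
    """
    if df == 0:
        return 0.0
    return sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def cfi(chi2_model, df_model, chi2_baseline, df_baseline):
    """Comparative Fit Index relative to a baseline (independence) model."""
    d_model = max(chi2_model - df_model, 0.0)
    d_baseline = max(chi2_baseline - df_baseline, d_model, 0.0)
    if d_baseline == 0.0:
        return 1.0
    return 1.0 - d_model / d_baseline


# Placeholder values: n and the baseline chi-square are NOT from the study.
print(rmsea(chi2=68.61, df=8, n=4800))
print(cfi(chi2_model=68.61, df_model=8, chi2_baseline=1200.0, df_baseline=20))
print(rmsea(chi2=0.0, df=0, n=4800))  # just-identified model
```

The `df == 0` branch reflects why fit indices carry no information for a just-identified model: with no degrees of freedom, the model reproduces the data exactly.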

## RESULTS

Descriptive statistics, including bivariate correlations between continuous variables used in these analyses, are presented in **Table 1**. Correlations were in the expected directions and almost all were significant due to the large sample size. All early childhood ecological risk variables measured at 4–5 years were significantly correlated with executive function measured at 14–15 years, but the correlations were small in magnitude. Approaches to learning at 6–7 years was more strongly correlated with executive function (r = 0.22; p < 0.01) in comparison to the ecological risk variables. Overall, the ecological risk variables had strong significant correlations with attentional regulation ranging in size from r = 0.14 (p = 0.01) for SEP and maternal warmth to r = −0.32 (p = 0.01) with child behavior risk.

## Path Models

#### Model 1

This model tested the direct relations between early ecological risk variables and executive function in adolescence. There was a significant small negative association between the child behavior risk index and executive function at 14–15 years (β = −0.10), indicating that a higher behavioral risk score at 4–5 years was associated with poorer executive function at 14–15 years; and a significant but small positive association between SEP and executive function scores at 14–15 years (β = 0.09). There were no significant associations between maternal mental health or the three parenting measures (anger, warmth and consistency) and executive function at 14–15 years. The model accounted for 3% of variance in adolescent executive function. This model was 'just identified' as the number of data points equaled the number of parameters to be estimated, meaning that fit indices cannot be interpreted [χ²(0) = 0, p = 1; CFI = 1; RMSEA = 0].

#### Model 2

The second model tested the direct relations between early ecological risk and executive function, adjusted for child characteristics as covariates in the model. Child gender (β = −0.15), home language other than English (β = 0.26), and early receptive vocabulary skills (β = 0.14) at 4–5 years were all significantly associated with executive functioning at 14–15 years. Aboriginal and Torres Strait Islander status and age at assessment on executive function were not significantly associated with executive function. The associations between the child behavior risk and executive function at 14–15 years (β = −0.09), and between SEP at 4–5 years and executive functioning at 14–15 years (β = 0.06) remained significant when controlling for child background factors, although effects were slightly attenuated. This model was also 'just identified', meaning interpretation of fit indices is not possible [χ²(0) = 0, p = 1; CFI = 1; RMSEA = 0].

#### Model 3

The third model tested the relations between early ecological risk and executive function in adolescence with mediating variables of attentional regulation at 4–5 years and approaches to learning at 6–7 years included, and controlling for child characteristics. The standardized regression coefficients are presented in **Figure 1**. There were statistically significant small associations between child behavioral risk (β = −0.24), SEP (β = 0.07), maternal anger (β = −0.08), maternal warmth (β = 0.10), and maternal consistency (β = 0.05) and attentional regulation measured contemporaneously at 4–5 years. The direct associations between child behavioral risk and executive function (β = −0.04), and between SEP and executive function (β = 0.03) were no longer significant. Maternal mental health was not significantly associated with attentional regulation at 4–5 years or executive functioning at 14–15 years. Attentional regulation at 4–5 years was significantly associated with approaches to learning at 6–7 years (β = 0.18). Attentional regulation at 4–5 years (β = 0.10) and approaches to learning at 6–7 years (β = 0.18) were both directly associated with executive function at 14–15 years.
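In path analysis, a specific indirect effect is conventionally obtained as the product of the standardized coefficients along a pathway. The Python sketch below applies that product rule to the coefficients reported in the paragraph above. It is illustrative only: the variable and function names are hypothetical, and because any direct paths from the risk variables to approaches to learning are not reported in the text, the products cover only the pathways that pass through attentional regulation and should not be read as the full indirect or total effects.

```python
# Standardized paths to attentional regulation (AR) at 4-5 years, as reported.
PATHS_TO_AR = {
    "child_behavior_risk": -0.24,
    "sep": 0.07,
    "maternal_anger": -0.08,
    "maternal_warmth": 0.10,
    "maternal_consistency": 0.05,
}
AR_TO_ATL = 0.18   # attentional regulation -> approaches to learning (ATL)
AR_TO_EF = 0.10    # attentional regulation -> executive function (EF)
ATL_TO_EF = 0.18   # approaches to learning -> executive function


def specific_indirect_effects(path_to_ar):
    """Return the two specific indirect effects that run through AR.

    First value:  risk -> AR -> EF
    Second value: risk -> AR -> ATL -> EF
    """
    via_ar = path_to_ar * AR_TO_EF
    via_ar_and_atl = path_to_ar * AR_TO_ATL * ATL_TO_EF
    return via_ar, via_ar_and_atl


for name, coefficient in PATHS_TO_AR.items():
    via_ar, via_ar_and_atl = specific_indirect_effects(coefficient)
    print(f"{name}: via AR = {via_ar:.4f}, via AR and ATL = {via_ar_and_atl:.4f}")
```

The products are small because each is a fraction of an already modest coefficient, which is consistent with the small total effects reported for the ecological risk variables.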

Overall, the model accounted for 10% of variance in executive function at 14–15 years; 15% of variance in attentional regulation at 4–5 years; and 14% of variance in approaches to learning at 6–7 years. The model was an adequate fit to the data [χ²(8) = 68.61, p < 0.001, RMSEA = 0.04, CFI = 0.95]. The standardized direct, indirect and total effects for each pathway were modeled simultaneously and these effects are presented in **Table 2**. While the total effects for the significant early ecological risk variables are relatively small, the strongest contributions indicated by the total effects on executive function are family SEP, child behavior risk, and attentional regulation.

TABLE 1 | Descriptive statistics and correlations for continuous variables in the analyses.

All correlations which are equal to or above 0.03 are statistically significant at p < 0.05. Correlations which are equal to or above 0.05 are statistically significant at p < 0.01.

TABLE 2 | Standardized direct, indirect and total effects for full SEM model with executive function as outcome.


∗p < 0.05; ∗∗p < 0.01.

## DISCUSSION

These analyses explored developmental pathways between ecological risk in early childhood and executive function in adolescence. Measures of attentional regulation and approaches to learning were also included in the path models as possible mediating variables between early risk and later executive function skills. In utilizing data from an Australian national study, this research provided opportunity to validate findings from studies conducted in other national contexts about the relations between early risk and the development of executive function across childhood.

In the initial analytic model that examined direct pathways from early childhood to adolescence, higher child behavior risk (i.e., sleep problems, emotional dysregulation, and hyperactivity-inattention problems) and lower SEP were associated with poorer executive functioning in adolescence. This finding aligns with previous studies indicating that early childhood disadvantage and behavior risk impact on later cognitive control abilities (Evans and Fuller-Rowell, 2013). When the model was adjusted with the covariates related to child characteristics, these direct associations between family socio-economic circumstances and child behavior risk and executive function remained significant. Being male, speaking a language other than English at home, and higher receptive vocabulary scores at age 4–5 years were associated with higher performance on executive function. These specific child characteristics also remained influential on executive function performance in the full, indirect effects model.

When the measures for attentional regulation at 4–5 years and approaches to learning at 6–7 years were also included in the model, attentional regulation had unique and direct effects on adolescent executive function, even when the more proximal variable of approaches to learning measured at 6–7 years was included. Attentional regulation and approaches to learning mediated the relation between early ecological risk and executive function. In relation to the covariates, being female and having higher receptive language competence were associated with higher attentional regulation, and being female and speaking a language other than English at home were related to higher scores on approaches to learning. Identifying as Aboriginal or Torres Strait Islander was associated with lower ratings on approaches to learning.

There were no significant direct pathways between maternal mental health and executive function or between the parenting variables and executive function, but indirect paths from these early parenting factors to executive function through attentional regulation and approaches to learning were found in the final model. This indicates that the proximal processes of maternal well-being and parenting practices measured in early childhood primarily influenced the development of the early self-regulatory skills of attentional regulation and approaches to learning at the beginning of school, which in turn were a basis for more competent executive function in adolescence.

## Supporting the Early Development of Self-regulatory Skills

The indirect pathways through which ecological factors operated on early self-regulatory skills, and then on executive function are of particular interest. An implication is that interventions aimed at improving adolescent executive function would be best targeted toward improving attentional regulation and approaches to learning in early childhood, rather than waiting until adolescence to intervene. Intervention efforts have focused on improving executive function in adolescence, especially for managing specific cognitive and academic tasks (Jacob and Parkinson, 2015). However, the focus on early self-regulatory skills may yield more and earlier benefits to disadvantaged children, because these skills promote earlier academic success and engagement at the beginning of the school years which is likely to have lasting positive benefits. Further studies that contribute to enhanced understanding about the development of self-regulatory skills in early childhood can provide information about the 'when' and 'how' of appropriate intervention.

Montroy et al. (2016) reported considerable heterogeneity in the development of self-regulation through ages 3–7 years, using data collated for 1,386 children who participated in three United States studies. For the majority of children, the overall pattern in the development of behavioral self-regulation was a period of rapid development across the preschool year (4–5 years), although the trajectories varied as to when a period of rapid development began and in the rate of growth across the preschool year. This rapid spurt in development during the preschool year was also dependent on the level of behavioral self-regulation that children demonstrated when they entered preschool. Additionally, 20% of the children did not achieve the necessary gains in behavioral self-regulation across the preschool year. Some of these children, at age 6–7 years, were only exhibiting self-regulation skills at the mean level which their peers had achieved at age 4–5 years. In particular, this latter group of children may be children exposed to stressful and adverse family environments and for whom the necessary parenting supports were not available from an early age.

## Child Characteristics: Executive Function, Attentional Regulation and Approaches to Learning

The child characteristics, as covariates included in the modeling, yielded some important associations with executive function and with the mediating variables of attentional regulation and approaches to learning. Child characteristics included in the analyses were gender, Aboriginal and Torres Strait Islander Status, speaking a language other than English at home, and receptive vocabulary scores at age 4–5 years.

With respect to the influence of gender, there is an interesting crossover in the findings. While boys performed more competently than girls on the executive function tasks in adolescence, girls had significantly higher attentional regulation at age 4–5 years, as well as higher teacher ratings for approaches to learning at 6–7 years. These early gender differences with respect to the advantage held by girls during childhood are evident across other studies on the development of self-regulatory skills. Boys appear to lag behind girls in the development of early self-regulation (Kochanska et al., 2001; Matthews et al., 2009, 2014). This suggests that additional supports for boys may be necessary in the early childhood years in order to address gender differences in self-regulatory competence. Suggested explanations for the gender difference have included that boys are more susceptible to adverse environmental conditions than girls and that parents and teachers hold higher expectations for girls for self-regulation than for boys (Montroy et al., 2016). These hypotheses have not been explored extensively in research, including whether gender differences in self-regulation are maintained or diminish beyond the early childhood years.

However, boys significantly outperformed girls on executive function in adolescence. One possible explanation for this finding may be related to the mode of delivery of the executive function tasks as a computer-based assessment and how that mode of assessment might differentiate performance by gender, given boys may have different levels of experience with computer game-playing, as a contextual experience (Desai et al., 2010). Jerrim (2016) conducted cross-national analyses of 2012 data from the Programme for International Student Assessment (PISA) for 15-year-olds. The analyses involved more than 200,000 adolescents from 32 countries who completed their mathematics assessment through paper-based and computer-based modes of delivery, as a basis for decision-making on changing the mode of delivery. Jerrim (2016) reported that the gender gap varied significantly across the majority of countries, in favor of boys. The average mathematics score for boys was considerably higher than for girls under both assessment modes, but the gender gap favoring boys was considerably larger for the computer-based assessment across 20 countries, including Australia. This suggests that the computer-based mode of assessment for adolescent executive function in the current study may account for at least a portion of the gender variance in favor of boys.

Other analytic work with PISA data by Jerrim (2014) also informs interpretation of the current finding with respect to children who spoke a language other than English at age 4–5 years (i.e., indicating a different cultural background to the majority English-speaking Australian population). These children had better performance on executive function as well as higher teacher ratings on approaches to learning. Jerrim investigated why children of East Asian descent in Australia, who were born and raised in Australia and who were second-generation immigrants, outperformed their Australian peers who were not from immigrant families. The East Asian population constitutes the highest proportion of non-English speaking immigrants in Australia. The 2012 PISA data for 15-year-old adolescents for mathematics assessments were examined, as well as a range of other child-report data gathered in PISA assessment including measures of academic motivation, academic effort, time spent studying out of school, work ethic, and a self-control scale. The second-generation East Asian immigrants outperformed their Australian peers in mathematics by more than 100 PISA test points (i.e., equivalent of two and a half years of schooling). Jerrim proposed that a combination of family investments made by parents for their children contributed to this outcome. These factors included family selection of high quality schools, family values placed upon education, family investment in out-of-school tuition, and the adolescents' high work ethic and high aspirations for their future education, reflecting self-regulatory behaviors.

The LSAC measure for receptive language at 4–5 years was also influential on executive function performance and on parent-reported attentional regulation. Language development is an important child characteristic known to affect the development of self-regulation, although expressive language is most often assessed rather than receptive language, as in the LSAC study. Language competence gives children abilities to organize and categorize information that enable more efficiency in retaining and processing incoming information. However, more research is needed to better understand the relations between the development of language, self-regulation and executive function over time (Bohlmann et al., 2015). Language also is a tool to deal with abstract ideas and propositions in abstract thinking and relational reasoning that is important to executive function in adolescence (Crone and Dahl, 2012; Steinbeis and Crone, 2016).

## Implications for Prevention of Poor Self-Regulation in Early Childhood

In these analyses, the indirect pathways operating from ecological risk through early self-regulation skills to executive function indicated that children who already exhibited behavioral risk (sleep problems, emotional dysregulation, hyperactivity-impulsivity), whose families had lower socio-economic status, and for whom there may have been maternal mental health issues and poorer parenting, had poorer self-regulation skills (attentional regulation and approaches to learning) in the early childhood years. As Montroy et al. (2016) noted, the developmental trajectories for behavioral self-regulation from 3 to 7 years are important. At age 4–5 years and, even before age 4, sufficient family supports can be provided to ensure that children begin school with requisite skills to attend and engage productively in classroom activities. Teachers in early childhood classrooms are also in a position to first recognize children's inabilities to focus attention, follow instructions, and persist in completing tasks when they begin school. These self-regulatory behaviors are malleable and can be addressed with the right supports for children, their families, and teachers.

The Australian Government initiative to identify the incidence and prevalence of vulnerable children in the first year of school using data from the AEDC (Australian Government, 2016) is an important first step. However, more understanding is needed on how to use these data to target intervention and family support programs toward the most vulnerable children: those who have problems with language, cognitive, and communication skills, and who lack social competence and emotional maturity. For example, Goldfeld et al. (2017), in an analysis of AEDC data, identified that mental health competence is unequally distributed across the Australian child population at school entry and is strongly predicted by measures and correlates of disadvantage. It is important to intervene early with children who demonstrate early behavior risk at 4 years, including sleep problems, emotional dysregulation (high reactivity) and hyperactive-impulsive behaviors as measured in the current study as part of child behavioral risk. Other research (Williams and Sciberras, 2016; Williams et al., 2017) indicates the reciprocal relations among these behaviors from an early age. Sleep problems across the early childhood period, in particular, may drive and exacerbate emotional and attentional dysregulation. Interventions that address early sleep problems could be explored in order to reduce children's behavior risk when beginning school and may have downstream benefits for executive function development.

## Strengths and Limitations

A strength of this research lies in the use of longitudinal data from a large national study. The analyses also used different sources of data that included parent report, teacher report, and direct child assessment. However, the nationally representative sample does not represent a low income or disadvantaged population in line with more specific US studies that have used highly selected samples from disadvantaged populations or samples with wide income diversity (Bradley et al., 2001). The relatively advantaged population in the current study may explain the smaller estimates and effect sizes in the associations between socio-economic status and adolescent outcomes for executive function. The causal relationships between socioeconomic status and executive function have not yet been fully explored and this may only be possible with well-designed intervention studies.

Furthermore, it is acknowledged that the parent-report measure of attentional regulation when the child was 4–5 years had a degree of conceptual and measurement commonality with an item used in computation of the Child Behavior Risk Index. This item was based on parent-report on the SDQ subscale scores for inattention/hyperactivity symptoms, for which a clinically significant cut-point (≥90%) was used to create a binary item indicating high risk. This was summed with other binary risk items similarly constructed for sleep problems and emotional dysregulation. In comparison, the attentional regulation measure comprised a summary score for four items rated on a 6-point scale that focused on persistence and employed positively framed items about attentional behaviors (e.g., When this child starts a project such as a puzzle he/she works on it until it is completed even if it takes a long time).

Additional limitations of the study include a lack of fine-grained measurement of self-regulatory behaviors in childhood which would usually include measurement of inhibitory control (Rhoades et al., 2009) and working memory (Simmering, 2012). Furthermore, the components of the model of executive function used in this study were somewhat different from components assessed in many other child development studies that have a strong focus on inhibitory control, including using effortful control as a primary theoretical model (Zhou et al., 2012). The measures of executive function available in this secondary dataset had less focus on emotional control involved in solving complex and novel tasks.

While the benefits of secondary data analysis with large longitudinal datasets include access to large samples with multiple time points of data collection, these advantages are often offset by limits on the breadth and depth of measurement that is possible. Future studies could include more breadth of measurement of self-regulation and executive function, at more frequent time points, across childhood and adolescence. Such studies will be able to explicate the nature of developmental pathways involving ecological risk, self-regulatory behaviors and executive function in adolescence.

## CONCLUSION

Executive function is a set of neurocognitive processes that allow individuals to achieve short- and long-term goals, particularly when they are required to adjust their thinking and their actions as environmental demands change (Crone and Dahl, 2012). The development of executive function and associated self-regulatory skills across childhood and adolescence is important to later successful adjustment and achievement (Moffitt et al., 2011; Diamond, 2013). In these analyses, while the effects of the early ecological risk on the development of executive function were relatively small, they operated through children's early self-regulatory behaviors of attentional regulation and approaches to learning, at the beginning of the school years. The research findings have identified possible directions for early intervention to enhance self-regulatory competence in early childhood in order to ensure later capabilities for executive control in adolescence.

### AUTHOR CONTRIBUTIONS

DB, NH, SW, and KW contributed to the initial development of the theoretical models. NH, SW, and KW contributed to different parts of the data preparation and data analysis. DB drafted the manuscript, with input from NH, SW, and KW. All authors approved the final version of the manuscript.

#### ACKNOWLEDGMENTS

This paper uses unit record data from Growing Up in Australia: The Longitudinal Study of Australian Children (LSAC). The LSAC study is conducted in partnership between the Department of Social Services (DSS), the Australian Institute of Family Studies (AIFS) and the Australian Bureau of Statistics (ABS). The findings and views reported in this paper are those of the authors and should not be attributed to the DSS, AIFS, or the ABS.






**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Berthelsen, Hayes, White and Williams. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

## History of Childhood Maltreatment and College Academic Outcomes: Indirect Effects of Hot Execution Function

#### Marilyn C. Welsh\*, Eric Peterson and Molly M. Jameson

School of Psychological Sciences, University of Northern Colorado, Greeley, CO, United States

College students who report a history of childhood maltreatment may be at risk for poor outcomes. In the current study, we conducted an exploratory analysis to examine potential models that statistically mediate associations between aspects of maltreatment and aspects of academic outcome, with a particular focus on executive functions (EF). Consistent with contemporary EF research, we distinguished between relatively "cool" EF tasks (i.e., performed in a context relatively free of emotional or motivational valence) and "hot" EF tasks that emphasize performance under more emotionally arousing conditions. Sixty-one male and female college undergraduates self-reported childhood maltreatment history (emotional abuse and neglect, physical abuse and neglect, and sexual abuse) on the Childhood Trauma Questionnaire (CTQ), and were given two EF measures: (1) Go-No-Go (GNG) test that included a Color Condition (cool); Neutral Face Condition (warm); and Emotion Face condition (hot), and (2) Iowa Gambling Task (IGT), a measure of risky decision making that reflects hot EF. Academic outcomes were: (1) grade point average (GPA: first-semester, cumulative, and semester concurrent with testing), and (2) Student Adaptation to College Questionnaire (SACQ). Correlational patterns suggested two EF scores as potential mediators: GNG reaction time (RT) in the Neutral Face condition, and IGT Block 2 adaptive responding. Indirect effects analyses indicated that IGT Block 2 adaptive responding has an indirect effect on the relationship between CTQ Total score and 1st semester GPA, and between CTQ Emotional Abuse and concurrent GPA. Regarding college adaptation, we identified a consistent indirect effect of GNG Neutral Face RT on the relationship between CTQ Emotional Neglect and SACQ total, academic, social, and personal–emotional adaptation scores.
Our results demonstrate that higher scores on a child maltreatment history self-report negatively predict college academic outcomes as assessed by GPA and by self-reported adaptation. Further, relatively "hot" EF task performance on the IGT and GNG tasks serves as a link between child maltreatment experiences and college achievement and adaptation, suggesting that hot EF skills may be a fruitful direction for future intervention efforts to improve academic outcomes for this population.

Keywords: child maltreatment, executive functions, academic achievement, academic adaptation, college students

#### Edited by:

Dieter Baeyens, KU Leuven, Belgium

#### Reviewed by:

Brian Kavanaugh, Alpert Medical School, Brown University, United States

Michiel Robert De Boer, VU University Amsterdam, Netherlands

> \*Correspondence: Marilyn C. Welsh marilyn.welsh@unco.edu

#### Specialty section:

This article was submitted to Developmental Psychology, a section of the journal Frontiers in Psychology

Received: 31 January 2017 Accepted: 13 June 2017 Published: 05 July 2017

#### Citation:

Welsh MC, Peterson E and Jameson MM (2017) History of Childhood Maltreatment and College Academic Outcomes: Indirect Effects of Hot Execution Function. Front. Psychol. 8:1091. doi: 10.3389/fpsyg.2017.01091

## HISTORY OF CHILD MALTREATMENT AND COLLEGE ACADEMIC OUTCOMES: MEDIATION BY EXECUTION FUNCTION

College students with a developmental history of child maltreatment represent a significant subset of the university population. We believe these students may comprise an important but overlooked group for targeted interventions aimed at promoting educational success. Although we are not aware of any definitive epidemiological studies, current research suggests that the base rate of maltreatment history among university students ranges from the mid-20% to more than 40% (e.g., Duncan, 2000; Freyd et al., 2001; Gibb et al., 2009). Across several semesters of research in our laboratory, recruiting from a mid-sized university in the western United States, we have consistently observed a base rate of approximately 30% of students with a maltreatment history. Presumably, differences in base rate across the extant studies reflect characteristics related to each individual college setting (the average socioeconomic status of students, etc.) or factors specific to the method of recruitment (specific trauma measures employed, sampling methods, etc.). Moreover, all studies of childhood history of maltreatment rely on self-report or clinical interview measures (Roy and Perry, 2004), which results in an unavoidably heterogeneous population of students with regard to the timing, nature, and severity of childhood trauma. Nevertheless, college students reporting childhood maltreatment represent an important, understudied subgroup of the general student population that is at increased risk for a range of cognitive, emotional, behavioral, and psychiatric sequelae (e.g., Mersky and Topitzes, 2010).

Whereas the negative impact of child maltreatment on academic performance in children has been well documented (e.g., Perzow et al., 2013; Kiesel et al., 2016), an examination of college achievement and adaptation in students with a history of maltreatment has not been a priority in research. In the only longitudinal study to date, Duncan (2000) followed 210 college freshmen, 36% of whom were identified as having experienced child abuse (emotional, physical, and/or sexual). Four years later, those students reporting multiple forms of abuse or sexual abuse only were significantly less likely to be still enrolled in college than non-victims, and PTSD symptoms during the second week of freshman year interacted with abuse to predict attrition. A small empirical literature suggests that college students reporting a history of childhood trauma also report lower levels of college adaptation and adjustment (Banyard and Cantor, 2004; Elliot et al., 2009; Maples et al., 2014). In a rare study of grade point average (GPA) as a college outcome, Jordan et al. (2014) reported that women who had been sexually assaulted as adolescents not only entered college with lower high school GPAs, but also earned lower GPAs by the end of their first year in college. This emerging literature suggests adverse college adaptation and achievement in students reporting a history of maltreatment; however, very little is known about which specific maltreatment sequelae mediate poor adaptation to the college environment. Identification of such variables would provide potential directions for effective intervention to enhance the chances of academic success and its resultant health and economic benefits (e.g., Leonhardt, 2014; Pew Social Research Center, 2014).

While it is clear that maltreatment history confers risk for poor outcome, the problem of heterogeneity among the maltreatment group presents an inherent challenge for the identification of individuals who may carry the greatest risk. This reflects the potential range of different negative experiences that may have influenced retrospective self-report, but it also reflects the varying degrees of individual resilience. Given the multifactorial relationship between developmental history and outcome, we examine current phenotype in an effort to determine which individuals who report maltreatment may be at the greatest risk for poor outcome. This study represents the first in the published literature to examine a novel potential mediator in the pathway between a self-reported history of maltreatment in childhood and academic achievement and adaptation in college: the indirect effects of executive function (EF) processes, and more specifically "hot" EFs.

Executive function (including planning, working memory, inhibition, and flexibility) is a particularly relevant cognitive domain to examine for three reasons. First, evidence shows that stress early in life has deleterious effects on the development and function of the prefrontal cortex (e.g., McEwen, 2008), the brain system that mediates EF (e.g., Kane and Engle, 2002). Performance in traditional cool EF has been particularly associated with the dorsolateral prefrontal cortex. However, since the seminal work of Bechara et al. (2000) and Bechara (2004), researchers have examined the role of orbitofrontal and ventromedial regions of prefrontal cortex in contexts that involve managing heightened emotional arousal (Goel and Dolan, 2003). Given our interest in hot executive processes, the evidence of functional (Etkin and Wager, 2007) and structural (Gold et al., 2016) imaging of ventromedial impairment associated with trauma is particularly relevant. Consistent with these early neurocognitive impacts, deficits in EF processes mediated by the prefrontal cortex have been demonstrated in adolescents, college students, and adults with self-reported maltreatment histories (Spann et al., 2012; Nikulina and Widom, 2013; Kirk-Smith et al., 2014; Mothes et al., 2015; Vasilevski and Tucker, 2016). For example, college women reporting a history of repeated childhood sexual abuse exhibited performance deficits on a modified Go-No-Go task (Navalta et al., 2006), one of the measures used in the current study.

Second, the types of self-regulatory behaviors that are subsumed within the construct of EF have been linked empirically to success in school for children (e.g., St Clair-Thompson and Gathercole, 2006; Masten et al., 2012; Willoughby et al., 2016) and have recently been the target of interventions to improve academic performance (Diamond and Lee, 2011; Diamond, 2012). Although there are established links between individual differences in EF and academic achievement and adjustment for children, parallel research with the college population is relatively sparse. Individual differences in EF, such as attentional control, planning, self-monitoring and self-regulation have been found to predict college achievement in terms of credits achieved (Baars et al., 2015) and academic task completion (Rabin et al., 2011; Gustavson et al., 2015). Thus, deficits in EF related to childhood maltreatment, and the link between EF and college success, clearly point to examining EF skills as a potential mediator in this novel investigation of college students.

Finally, we have suggested (Peterson and Welsh, 2014; Welsh and Peterson, 2014) that the examination of EF in real-world settings such as the college environment would benefit from the use of tasks that are specifically designed to measure these skills in more arousing contexts, referred to as hot EF (Zelazo and Carlson, 2012; Peterson and Welsh, 2014; Welsh and Peterson, 2014). Evidence from the trauma literature makes clear that individuals with a maltreatment history are more likely to have particular difficulty with emotion regulation (e.g., Cloitre et al., 2005; Etkin and Wager, 2007), which could easily disrupt executive processes in real-world contexts. One promising approach to examining hot executive processes involves adapting a traditional executive instrument (e.g., Stroop Interference, Go-No-Go) to enable an examination of the potential role of "heating" (i.e., adding an emotional or motivational component), as we did in the current study. Such a task manipulation can allow for a comparison of task performance between relatively unheated (i.e., cool) and heated conditions. For the study of maltreatment history, task manipulations that involve replacing neutral stimuli with trauma-relevant, emotionally valenced stimuli may be particularly fruitful. Two research groups manipulated the Stroop task of inhibition and flexibility by introducing threat word stimuli (Fontenot et al., 2015) and emotion face stimuli (Caldwell et al., 2014), finding poorer performance by young adults reporting a history of maltreatment particularly in the heated conditions. Cromheeke et al. (2014) administered the Spatial Emotional Match to Sample task, a working memory test heated with the use of neutral and emotion faces, to adult women with differing trauma histories. Women with child or adult histories of sexual or physical abuse exhibited specific impairments on the more difficult task conditions and only with the happy emotion faces. In the current study, we heated the traditionally cool EF task, Go-No-Go, by including arousing face stimuli to investigate whether history of childhood maltreatment would specifically co-vary with performance in these conditions.

As reviewed above, the literature suggests that individuals with a child maltreatment history are at risk for both negative college outcomes and deficits in EF, and a specific focus on hot executive processes may be particularly relevant to the study of cognitive sequelae of maltreatment.

The college academic setting places demands on emotion regulation (in testing settings, interpersonal contexts, etc.) and it seems likely that hot EF tasks may be sensitive to deficits associated with emotion regulation in college students, as they are in children (e.g., Woltering et al., 2015). In summary, a small extant literature suggests that a history of childhood maltreatment confers risk for poor college outcomes; however, it remains a relatively understudied area of inquiry. Research demonstrates that the prediction of college academic achievement and adaptation is a multifactorial enterprise, such that any single variable will predict only a small amount of variance in these outcomes (e.g., more proximal variables such as motivation and metacognition predicting GPA yielded correlations ranging from 0.13–0.39; Komarraju and Nadler, 2013). Even a study that examined the degree to which intelligence predicted college GPA yielded correlations ranging from 0.15–0.21 (Murray and Wren, 2003). Not only is the examination of academic outcomes in this potentially vulnerable subgroup of college students relatively rare, but we also know of no published study of EF processes as a potential mediator of this association.

## Purpose and Research Questions

Our study addresses the question: which EFs have an indirect effect on the relationship between a self-reported history of exposure to different types of child maltreatment (emotional, physical, and sexual abuse, as well as neglect) and college GPA and adaptation? Recent studies have targeted a range of interesting adulthood outcomes of maltreatment, such as mental health, risk taking, social relationships, cognition, and academic performance (e.g., Duncan, 2000; Higgins and McCabe, 2000; Banyard and Cantor, 2004; Cromheeke et al., 2014). However, very few published studies have tested mediational models linking maltreatment history to a given outcome (e.g., Allwood and Bell, 2008; Bachrach and Read, 2012). We are not aware of any published studies that have addressed the central aim of the current exploratory, descriptive study: EF processes as mediators between experiences of child maltreatment and academic outcomes in college students.

## MATERIALS AND METHODS

## Participants

The sample included 64 undergraduate students (17 males, 47 females; age: M = 19.33 years, SD = 2.21) who volunteered to participate through the Psychological Sciences participant pool. These participants volunteered for an earlier screening session with the study name "Stressful Life Experiences, Cognition, Emotion, and Academic Adaptation." In this earlier session, we administered a maltreatment screen using the Childhood Trauma Questionnaire (see descriptions below), the Vocabulary subtest of the WAIS-IV, and a demographics survey that included years of maternal education (as a proxy for SES; see Zhang and Wang, 2004; Classen and Hokayem, 2005; Baum and Ruhmb, 2009). Screening participants also were given the opportunity to provide informed consent for us to contact them again for participation in a future study (i.e., the current study) and to allow us to monitor their academic progress longitudinally (i.e., recording academic outcomes such as grade point average, GPA, at the conclusion of each semester). The only exclusionary criteria applied in this invitation were: (1) not born in the United States (given potential differences in self-reporting child maltreatment history), (2) no consent given for follow-up, and (3) validity problems on the Childhood Trauma Questionnaire (see description, below). The descriptive statistics for age, gender, Vocabulary subtest raw score, and years of maternal education are displayed in **Table 1**.

#### TABLE 1 | Descriptive statistics for variables used in subsequent analyses.


SACQ scales: Academic Adjustment; Social Adjustment; Personal/Emotional Adjustment; Attachment (to college); Total Adjustment. GNG Accuracy scores for the No Go conditions. IGT scores reflect number of adaptive choices – number of maladaptive choices.

## Materials and Procedures

#### Childhood Trauma Questionnaire (CTQ)

The CTQ (Bernstein et al., 2003) is a retrospective self-report of childhood and adolescent abuse and neglect. The measure demonstrates adequate reliability for each scale (Bernstein et al., 1994), and has been validated against clinicians' reports of history of childhood maltreatment that included evidence from Child Protective Services and court records (Bernstein et al., 1997). Across 28 items, respondents rate the frequency of occurrence for each item, ranging from 1 (never true) to 5 (very often true). The CTQ yields five clinical scales, three of which assess abuse (Emotional, Physical, and Sexual) and two of which assess neglect (Emotional and Physical). The total CTQ score was used as an overall child maltreatment score, while summed scores from each subscale were used to indicate degree of severity for each type of maltreatment. The CTQ includes three validity-check questions regarding the overall quality of family life during the participant's childhood; a total score of 3 or 4 indicates a potential bias toward social desirability in the form of over-reporting high-quality childhood experiences. Therefore, participants who scored a 3 or 4 in the original screening sample were not invited to participate in the current study.

#### Go-No-Go (GNG)

This classic EF task assesses conflict monitoring and inhibitory control, and can be easily manipulated to include both "cool" stimuli (e.g., colors, shapes) and "hot" stimuli (faces) to examine the involvement of both dorsolateral and ventromedial prefrontal cortical regions (e.g., Hare et al., 2008). In each trial, participants see a target that varies (i.e., a blue shape versus a yellow shape). For one stimulus (the "Go" condition) participants are instructed to make a reaction time (RT) button press response; for the other stimulus (the "No-Go" condition), participants must withhold a response. We adapted the traditional Go-No-Go paradigm to include three blocks: one cool (stimuli were colors, red versus blue); one somewhat heated (stimuli were neutral faces, male versus female); and a third, hot (emotion faces, male and female faces displaying anger or fear). All participants completed the blocks in the following order: Color, Neutral Face, Emotion Face. For each block, there were 15 practice trials and 120 test trials with 66.6% Go trials in the Color block and 75% Go trials in the Neutral Face and Emotion Face blocks (Casey et al., 2011). The stimulus was presented for 1 s (unless terminated earlier by a participant response) and the inter-stimulus interval was 1 s. Go versus No-Go stimuli were counterbalanced across participants such that in the Color block, half of the participants made a reaction time response to the blue circle (i.e., go trials), and the other half responded to the yellow circle as the Go stimulus. In both the Neutral and Emotion Face blocks, half of the participants responded to the male face and half to the female face, with the constraint that the gender of the Go stimulus be equal across both male and female participants. For the Emotion Face block, half of the Go stimulus faces displayed an angry emotion and half displayed a fearful emotion. The source of the face stimuli was the NimStim Face Stimulus Set<sup>1</sup>. Participants were instructed to respond as quickly as possible to the designated Go stimulus while maintaining reasonable accuracy and to withhold their response to the No-Go stimulus. The dependent measures were RT to the Go stimuli and accuracy of responding (percentage correct) to both the Go stimuli and the No-Go stimuli in each of the three blocks of trials. Within the Emotion Faces block, RT and accuracy were examined separately for the angry and fear faces.
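The block structure and dependent measures described above can be sketched in code. The following Python is an illustrative sketch only, not the software used in the study: the names `BLOCKS`, `make_trials`, and `score_block`, the use of `None` to mark a withheld response, and the simple random shuffling of trial order are all assumptions made for illustration.

```python
import random

# Test-trial counts and Go proportions as described in the text.
# The dictionary layout itself is a hypothetical convenience.
BLOCKS = {
    "Color": {"n_test": 120, "p_go": 2 / 3},
    "Neutral Face": {"n_test": 120, "p_go": 0.75},
    "Emotion Face": {"n_test": 120, "p_go": 0.75},
}


def make_trials(n_test, p_go, rng):
    """Return a shuffled list of 'go' / 'nogo' labels with the requested Go proportion."""
    n_go = round(n_test * p_go)
    trials = ["go"] * n_go + ["nogo"] * (n_test - n_go)
    rng.shuffle(trials)
    return trials


def score_block(trials, responses):
    """Compute the three dependent measures for one block.

    responses[i] is the RT in seconds if a button press occurred on trial i,
    or None if the participant made no response within the stimulus window.
    """
    pairs = list(zip(trials, responses))
    go_rts = [rt for kind, rt in pairs if kind == "go" and rt is not None]
    nogo_withheld = sum(1 for kind, rt in pairs if kind == "nogo" and rt is None)
    return {
        "go_mean_rt": sum(go_rts) / len(go_rts) if go_rts else None,
        "go_accuracy": 100 * len(go_rts) / trials.count("go"),
        "nogo_accuracy": 100 * nogo_withheld / trials.count("nogo"),
    }
```

Under these assumptions, the Color block contains 80 Go and 40 No-Go test trials, and each face block contains 90 Go and 30 No-Go test trials.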

#### Iowa Gambling Task (IGT)

This published, standardized, computerized task (Chan et al., 2008) represents a simulated card game to examine risky decision making and learning from feedback. It has been cited in the literature as a hot EF task (e.g., Zelazo and Carlson, 2012) and performance has been associated with the orbitofrontal/ventromedial region of the prefrontal cortex (Bechara et al., 1994; Damasio, 1999). In their review of the IGT, Buelow and Suhr (2009) conclude that the task is a valid instrument for identifying decision making deficits (i.e., risky decision making) in a range of at-risk and clinical populations, such as substance abusers, pathological gamblers, Obsessive-Compulsive Disorder, Schizophrenia, and Attention Deficit Hyperactivity Disorder. The task presents the participant with 100 trials in which he or she selects a card from one of four decks and each selection is followed by a hypothetical monetary gain or loss, or both. Two of the decks (A and B) yield an initial rapid gain followed by loss across the decks while the other two decks (C and D) confer slow gains resulting in an overall positive outcome. Therefore, the IGT includes 100 selections from four decks, determined by the participant. The scoring typically divides the 100 selections into five blocks of 20 selections each, which differ across participants depending on their deck selections. The score indicates the number of adaptive (less risky) choices minus the number of maladaptive (more risky) choices for each of five blocks of 20 trials. A positive score reflects relatively more adaptive choices on that block. Participants began the task with a hypothetical \$2000 such that they could finish above or below this level. To further heat the task to elicit hot EF, participants were told that if they finished the task with more than \$2000 (i.e., a positive outcome) they would receive a state lottery \$1 scratch ticket.
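The block scoring rule described above (adaptive minus maladaptive choices within each block of 20 selections) can be written as a short function. This is a minimal sketch, not the scoring code of the published task: the function name `igt_block_scores` and the representation of a participant's selections as a list of deck letters are assumptions for illustration. The deck labels A to D follow the description in the text.

```python
ADAPTIVE_DECKS = {"C", "D"}     # slow gains, overall positive outcome
MALADAPTIVE_DECKS = {"A", "B"}  # rapid early gains followed by losses


def igt_block_scores(selections, block_size=20):
    """Return one net score per block: adaptive choices minus maladaptive choices."""
    if len(selections) % block_size != 0:
        raise ValueError("number of selections must be a multiple of block_size")
    scores = []
    for start in range(0, len(selections), block_size):
        block = selections[start:start + block_size]
        adaptive = sum(1 for deck in block if deck in ADAPTIVE_DECKS)
        maladaptive = sum(1 for deck in block if deck in MALADAPTIVE_DECKS)
        scores.append(adaptive - maladaptive)
    return scores
```

With 100 selections the function returns five scores, each ranging from -20 (all risky choices) to +20 (all adaptive choices); the second element corresponds to the Block 2 score.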

#### Student Adaptation to College Questionnaire (SACQ)

This normed and standardized, 67-item, Likert-scale self-report instrument (Baker and Siryk, 1999) assesses overall adjustment to college (total score), as well as adjustment in four specific areas: Academic Adjustment, Personal–Emotional Adjustment, Social Adjustment, and Attachment (to the institution). The survey is administered in 15–20 min and has been successfully used to identify college students who are at risk for attrition (Credé and Niehorster, 2012). Norms are based on a sample of more than 1,300 male and female college freshmen and stratified by semester of attendance (first and second semesters in college).

#### Grade Point Average (GPA)

Three GPA indices were taken directly from the participants' official academic transcripts: (1) GPA earned in their first semester at the university; (2) GPA earned during the semester in which the testing took place; and (3) Cumulative GPA across all semesters at the university.

#### Overall Procedure

All participants were recruited from a larger Screening Session (N = 120). Participants who scored in the moderate to severe range in any one of the five CTQ subscales (approximately 33% of the screening sample) and participants scoring at lower levels on CTQ subscales were invited back to participate in a lab visit, which included the IGT, GNG, and other tasks and questionnaires, as part of a larger study. Of the 80 participants targeted for the lab visit, 16 did not participate because they could not be contacted, could not be scheduled, declined to participate, or did not show up for the scheduled session. The CTQ was individually administered during a single test session of 1 h. The GNG, IGT, and SACQ were administered in individual lab visits approximately 2 h in length. The GPA information was retrieved from the university academic records system approximately 2 weeks after the end of the semester.

#### Data Analysis

To answer the main questions about the indirect effects of potential mediating EF variables on the relationship between self-reported history of child maltreatment and college outcomes, we used indirect effects analysis with bias-corrected bootstrapping via the Hayes (2013) PROCESS macro for SPSS; this procedure uses bootstrapped confidence intervals to determine whether the indirect effects of mediating variables differ from zero. As Preacher and Selig (2012) state, "bias-corrected bootstrapped confidence intervals have fairly accurate Type I error rates and higher power when compared to competing methods" (p. 81), and other researchers have demonstrated that bias-corrected bootstrapping procedures require the smallest sample size of several methods of analyzing indirect effects to achieve comparable levels of power (Fritz and MacKinnon, 2007). While others have found increases in Type I error rates with bias-corrected bootstrapping in small sample sizes (e.g., Fritz and MacKinnon, 2007; Fritz et al., 2012), this statistical approach is still considered to be the strongest and most robust method of analyzing indirect effects. Hayes (2009) referred to it as one of the more valid and powerful methods for testing intervening variable effects. In fact, Fritz and MacKinnon (2007) provide the recommendation that researchers use the bias-corrected bootstrap test for mediation analyses because of its increased power. Further, to determine which EF variables might have an indirect effect on these relationships, correlation analyses were conducted between predictor, potential mediating, and outcome variables, and the magnitude of the relationships was examined for potential mediators. While we also considered significance of the relationships, the magnitude of the correlations was of more import because of the exploratory and novel nature of this research.

<sup>1</sup>http://www.macbrain.org/resources.htm
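For readers unfamiliar with the technique, the logic of a bias-corrected bootstrap confidence interval for an indirect effect can be illustrated in a few lines of Python. This is a hedged sketch and not the PROCESS macro: it assumes a simple single-mediator model (X to M to Y) with no covariates, implements the bias-corrected interval without an acceleration term, and uses hypothetical function names (`indirect_effect`, `bc_bootstrap_ci`). The default of 5000 resamples is also an assumption, as the number of resamples is not stated in the text.

```python
import random
from statistics import NormalDist, fmean


def _cross(u, v):
    """Sum of cross-products of deviations from the means."""
    mu, mv = fmean(u), fmean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))


def indirect_effect(x, m, y):
    """Return a*b for the simple mediation model X -> M -> Y.

    a is the OLS slope of M on X; b is the OLS slope of Y on M with X controlled.
    """
    sxx, smm, sxm = _cross(x, x), _cross(m, m), _cross(x, m)
    sxy, smy = _cross(x, y), _cross(m, y)
    a = sxm / sxx
    b = (smy * sxx - sxy * sxm) / (sxx * smm - sxm ** 2)
    return a * b


def bc_bootstrap_ci(x, m, y, n_boot=5000, alpha=0.05, seed=0):
    """Bias-corrected bootstrap confidence interval for the indirect effect."""
    rng = random.Random(seed)
    n = len(x)
    estimate = indirect_effect(x, m, y)
    boots = []
    while len(boots) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample cases with replacement
        try:
            boots.append(indirect_effect([x[i] for i in idx],
                                         [m[i] for i in idx],
                                         [y[i] for i in idx]))
        except ZeroDivisionError:
            continue  # degenerate resample with no variance; draw again
    boots.sort()
    # Bias correction z0 reflects how far the sample estimate sits
    # from the median of the bootstrap distribution.
    below = sum(1 for b in boots if b < estimate)
    prop = min(max(below / n_boot, 1 / (n_boot + 1)), n_boot / (n_boot + 1))
    z = NormalDist()
    z0 = z.inv_cdf(prop)
    lo_p = z.cdf(2 * z0 + z.inv_cdf(alpha / 2))
    hi_p = z.cdf(2 * z0 + z.inv_cdf(1 - alpha / 2))
    lo = boots[min(n_boot - 1, int(lo_p * n_boot))]
    hi = boots[min(n_boot - 1, int(hi_p * n_boot))]
    return estimate, lo, hi
```

An indirect effect is typically interpreted as different from zero when the returned interval does not contain zero.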

### RESULTS

## Descriptive Statistics

The means and standard deviations of all variables included in subsequent analyses are presented in **Table 1**. While demographic variables likely influence the relationships between history of child maltreatment and college outcomes, the relatively small sample size and exploratory nature of this study resulted in a decision to not statistically control for these variables.

## Associations between Childhood Trauma, Academic Outcomes, and Executive Functioning Performance

Because the nature of this research is exploratory, correlational analyses were conducted to guide our decision making on variables with potential indirect effects. These analyses correspond to steps one and two (i.e., show that the predictor and outcome are correlated; show that the predictor and mediators are correlated) of conducting traditional mediational analyses (see, for example, Baron and Kenny, 1986). Based on the results of these analyses, five potential mediators were found: GNG Neutral Face RT, GNG Fear Face RT, GNG Anger Face RT, IGT Block 2 adaptive responding, and IGT Block 3 adaptive responding. Mediators explored in the indirect analyses satisfied the requirement of associations with both the predictor (CTQ) and outcome (academic) variables.
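The screening logic described above, in which a candidate is retained only if it is associated with both the predictor and the outcome, can be sketched as follows. This is an illustration rather than the procedure actually run: the function names and the cutoff `min_abs_r=0.2` are hypothetical, since the text states that correlation magnitude was weighed more heavily than significance but reports no numeric threshold.

```python
from math import sqrt
from statistics import fmean


def pearson_r(u, v):
    """Pearson product-moment correlation between two equal-length sequences."""
    mu, mv = fmean(u), fmean(v)
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / sqrt(suu * svv)


def screen_mediators(predictor, outcome, candidates, min_abs_r=0.2):
    """Keep candidates whose |r| with BOTH predictor and outcome reaches min_abs_r.

    candidates maps a variable name to its list of scores.
    min_abs_r is a hypothetical cutoff chosen for illustration only.
    """
    kept = []
    for name, scores in candidates.items():
        r_pred = pearson_r(predictor, scores)
        r_out = pearson_r(scores, outcome)
        if abs(r_pred) >= min_abs_r and abs(r_out) >= min_abs_r:
            kept.append((name, round(r_pred, 2), round(r_out, 2)))
    return kept
```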

One potential mediator, IGT Block 2 adaptive responding, correlated significantly with CTQ Emotional Abuse, Emotional Neglect, and CTQ Total Score (r = −0.28, r = −0.26, r = −0.28, respectively). Moreover, IGT Block 2 adaptive responding correlated significantly with cumulative GPA, concurrent GPA, and first semester GPA (r = 0.30; r = 0.36, r = 0.27, respectively). IGT Block 3 adaptive responding correlated significantly with CTQ Emotional Abuse (r = −0.26), as well as with cumulative GPA and first semester GPA (r = 0.29; r = 0.30).

Other potential mediators were identified in the GNG task. GNG Neutral Face RT correlated with CTQ Emotional Abuse (r = −0.23), CTQ Emotional Neglect (r = −0.23), and CTQ Total (r = −0.22), and GNG Fear Face RT correlated with CTQ Emotional Abuse (r = −0.29). Additionally, GNG Neutral Face RT correlated significantly with SACQ Academic Adjustment, Personal Emotional Adjustment, Attachment, and Total Score (r = 0.34, r = 0.35, r = 0.26, and r = 0.39, respectively), as well as with all three GPA measures (r = 0.32 with cumulative, r = 0.29 with concurrent, and r = 0.32 with first semester). GNG Fear Face RT correlated significantly with SACQ Personal Emotional Adjustment (r = 0.25) as well as concurrent GPA (r = 0.25). GNG Anger Face RT correlated significantly with CTQ Emotional Abuse (r = −0.29), CTQ Total (r = −0.25), SACQ Personal Emotional Adjustment (r = 0.29), and SACQ Total (r = 0.26).

### Indirect Effects Analysis

Based on the correlational patterns, GNG Neutral Face, Fear Face, and Anger Face RTs, as well as IGT Blocks 2 and 3 adaptive responding were tested as mediators of the relationship between the CTQ scores and academic measures (i.e., first semester GPA, cumulative GPA, GPA concurrent with testing semester, and total and subscale scores of the SACQ) using indirect effects analysis. See **Figure 1** for the general mediation model being tested.
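In conventional single-mediator notation (an assumption here, since the article does not print the model equations), with X a CTQ score, M an EF measure, and Y an academic outcome, the model being tested can be written as:

$$
\begin{aligned}
M &= i_1 + aX + e_1,\\
Y &= i_2 + c'X + bM + e_2,\\
\text{indirect effect} &= ab,
\end{aligned}
$$

where a is the effect of the predictor on the mediator, b is the effect of the mediator on the outcome with the predictor held constant, c' is the direct effect of the predictor, and i and e denote intercepts and residuals. The indirect effect is judged by whether a bootstrap confidence interval for the product ab excludes zero.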

First, bivariate correlational analyses demonstrated that the CTQ Emotional Abuse (EA) and Total scores were marginally associated with concurrent and first-semester GPA, respectively (**Table 2**); thus, we examined the indirect effects of the potential mediators with academic achievement as the outcome. There was a significant indirect effect of CTQ Total Score on first semester GPA through IGT Block 2 adaptive responding, b = −0.005, BCa CI [−0.015, −0.0007]. This represents a small effect, η<sup>2</sup> = 0.093, BCa CI [0.013, 0.231]. There was also an indirect effect of the CTQ Emotional Abuse subscale score on Spring 2016 (concurrent with testing) GPA through IGT Block 2 adaptive responding, b = −0.02, BCa CI [−0.043, −0.002], which also represents a small effect, η<sup>2</sup> = 0.094, BCa CI [0.01, 0.232] (see **Figure 2**).

Though correlations indicated that GNG Neutral Face RT, GNG Fear Face RT, and GNG Anger Face RT may serve as mediators through which the relationships between CTQ Total and CTQ Emotional Abuse and first semester GPA are effected, the indirect effects analyses did not support these potential mediators.

Next, we examined the indirect effects of these mediators with the academic measure of total and subscale scores of the SACQ as the outcome variable. The bivariate correlational analyses demonstrated that the CTQ Emotional Neglect (EN) score was related to aspects of college adaptation (**Table 2**), and that GNG Neutral Face RT and IGT Blocks 2 and 3 were potential mediators of this association. Regarding college adaptation, a consistent indirect effect of GNG Neutral Face RT on the relationship between CTQ Emotional Neglect and SACQ total (b = −1.314, BCa CI [−2.977, −0.102]), academic (b = −0.473, BCa CI [−1.138, −0.009]), social (b = −0.268, BCa CI [−0.809, −0.017]), and personal-emotional adaptation (b = −0.341, BCa CI [−0.829, −0.011]) scores was found. Measures of effect size for all indirect effects were small (η<sup>2</sup> = 0.083, 0.073, 0.05, and 0.07, respectively) (see **Figure 3**).

Though correlations indicated that IGT Block 2 might mediate the relationship between CTQ EN and SACQ Total scores, the indirect effects analysis failed to support this by showing no effect of IGT Block 2 on this relationship. Indirect effects analysis also failed to support IGT Block 3 as a mediating variable in any analyses.

TABLE 2 | Direct effects of childhood maltreatment measures on college adaptation measures.

p-values are included parenthetically; statistically significant correlations are denoted with <sup>∗</sup>.

## DISCUSSION

The purpose of this study was to add to a small, but emerging, literature regarding the extent to which a self-reported history of child maltreatment (including emotional and physical abuse and neglect, and sexual abuse) predicts college outcomes in terms of GPA and self-reported adjustment (academic, social, etc.). While college students with trauma histories have been studied extensively, the focus has been mainly on mental health and other life adaptation outcomes, with surprisingly little attention paid to a critical milestone of emerging adulthood that has enormous public health value: success in college. We studied a volunteer sample of college undergraduates, as opposed to a sample of clinically diagnosed young adults (Radomski et al., 2016) or college students with documented PTSD symptoms (e.g., Kaysen et al., 2014). We believe this to be the first study to examine cognitive mediators between the self-reported maltreatment history and academic outcomes, in the form of hot and cool EF processes.

In our relatively heterogeneous volunteer sample we found that higher total scores on the CTQ, and on the Emotional Abuse and Neglect scales in particular, predicted poorer outcomes in terms of both GPA and self-reported adaptation. These findings are consistent with recent studies involving college students with a maltreatment history that have identified difficulties with adjustment to college (Banyard and Cantor, 2004; Elliot et al., 2009; Maples et al., 2014) and achievement as measured by GPA (Jordan et al., 2014), as well as increased risk for attrition (Duncan, 2000).

Our primary goal was to examine a mediational pathway from child maltreatment to college academic outcomes through EF, assessed in relatively cooler and warmer testing contexts. In light of the literature on EF deficits in children experiencing maltreatment, evidence that such deficits continue into young adulthood (Navalta et al., 2006), and documented deficits in adults with a trauma history (Navalta et al., 2006; Spann et al., 2012; Mothes et al., 2015; Vasilevski and Tucker, 2016), we wished to examine the indirect role of EF in predicting academic outcomes in this sample. We believe the evidence for impaired emotion regulation (Cloitre et al., 2005; Etkin and Wager, 2007) suggests that hot EF may be particularly vulnerable in this population. Aligned with these expectations, we identified small, but significant, indirect effects of EF processing in relatively hot conditions on the pathway between history of childhood maltreatment and academic outcomes in this sample of college students. Further, different hot EF tasks predicted different measures of academic success. This is the first published study that we are aware of to report indirect effects of EF processes between childhood maltreatment and adult college performance.

With regard to academic achievement, adaptive responding on the IGT mediated both the pathway between CTQ Total score and first-semester GPA and the pathway between CTQ Emotional Abuse score and GPA concurrent with testing. In the original IGT, a classic hot EF task, participants respond to feedback regarding gains and losses of hypothetical money in an arousing testing context. We further heated the task by providing a state lottery scratch ticket to participants who managed to complete the task with winnings, rather than an overall loss. During the second block of 20 IGT trials, we observed that individuals with higher scores on the CTQ (more reported overall maltreatment or specific emotional abuse) were less likely to shift their risky decision making to adaptive responses than individuals reporting lower levels of maltreatment. Our finding that deficits in hot EF in the form of IGT performance correlate with both reports of childhood emotional abuse and college GPA is a novel contribution to the literature. It will be of interest to examine further how adaptive responding, reflecting learning from positive and negative feedback, is impacted by experiences of maltreatment during childhood, and may therefore impact success in academic settings.

With regard to self-reported adaptation to college, our findings suggest a different EF mediator, reaction time to Go stimuli in the Neutral Face Condition of the Go-No-Go Task of inhibition. This heated condition, rather than the cool Color Condition, carried the significant indirect effect. We predicted that potential EF mediators would be found among our heated EF measures (IGT, Neutral and Emotion Face GNG reaction time and accuracy), and we did find evidence of this in the form of IGT adaptive responding and Go-No-Go Neutral Face reaction time. Individuals with higher scores on CTQ Emotional Neglect exhibited faster reaction times to the Neutral Face Condition of the Go-No-Go Task, and faster reaction times to this condition also predicted less positive adaptation to college in the academic, social, and personal–emotional domains. For a few reasons, we believe the use of faces as a heated stimulus makes good sense for our central research question. First, the central importance of faces for social interaction is well appreciated. From early infancy, humans are "wired" to attend to faces and yet the development of face expertise continues into adulthood (Gliga and Csibra, 2007; Germine et al., 2011). In the maltreatment literature, many studies point to atypical face processing development (Pollak and Sinha, 2002). There is certainly evidence that development in a maltreatment milieu can be associated with relatively more exposure to negative facial expressions (Pollak and Sinha, 2002). Given the evidence for amygdala participation in processing all face emotions (though negative faces, in particular, Adolphs, 2002) and also just attention to faces in general (e.g., looking at the eye region of another, Adolphs, 2008), it also makes sense that individuals with maltreatment history may experience relatively greater arousal to faces.
An important question raised by our results concerns the effect for neutral faces despite the lack of an effect for emotion face stimuli. First, we consider the neutral face stimuli that yielded an effect.

The notion that even neutral faces can serve as a potential threat stimulus such that face attention might yield individual differences in amygdala arousal was highlighted by a study by Schwartz et al. (2003). Following up on a longitudinal temperament investigation, this research team implemented a paradigm that involved viewing novel and familiar faces in an fMRI context. Performance was compared across two groups of young adults: one group had been classified as behaviorally inhibited at 2 years old (more likely to show fear to novelty) whereas the other group had been classified as behaviorally uninhibited (less likely to show fear to novelty). As young adults, the inhibited group showed greater amygdala activation to both familiar and novel faces with relatively stronger activation to novel faces. Thus, this study of normal individual differences in temperament makes clear that even neutral faces serve as an arousing stimulus for individuals with a more reactive amygdala.

We had originally hypothesized that the relatively hotter Emotion Face condition would yield relatively greater mediation (i.e., emotion greater than neutral). However, as demonstrated by the temperament study (reviewed above), it is a mistake to think of non-emotion faces as neutral stimuli. As powerful social stimuli, neutral faces are capable of generating arousal and they have been used effectively for eliciting individual differences in amygdala arousal. In our study design, all participants performed the neutral face block before the emotion face block. With this point in mind it is interesting to consider some additional alternatives. First, it is possible that for all participants the effect of the emotion content of the second block of faces was dampened by habituation. A second intriguing possibility is that, for the purposes of our goal, non-emotion face stimuli had more power to discriminate between the groups. Following the literature, it is reasonable to hypothesize that all participants regardless of maltreatment history would show some arousal to threatening faces (e.g., Adolphs, 2002, 2008). Thus, it may be that the most sensitive stimuli for discriminating groups were the non-emotion faces. A third possibility that may be difficult to test concerns the lack of clarity among the findings using emotion faces with trauma groups. While there is evidence of atypical emotion face processing, there are conflicting findings as to which emotional displays (e.g., angry, fearful, happy) are more arousing and disruptive to performance (e.g., Caldwell et al., 2014; Cromheeke et al., 2014). Therefore, had we conducted separate conditions with a range of emotion faces (e.g., happy, fearful, angry, disgust), we may have observed specific indirect effects of a particular emotion face stimulus in this sample.
It should be noted such a comprehensive emotion comparison would not be feasible in a single study such as ours, both because of the likely problem of habituation, discussed above, and because such a Go-No-Go paradigm with so many blocks would be excessive for a single participant.

Four particular limitations in our work deserve consideration. One limitation reflects difficulties inherent in the study of child maltreatment in an adult non-clinical sample. Following many other studies with college students we used a validated self-report measure, the CTQ, to examine individual differences in maltreatment history. We should emphasize that limitations associated with adult retrospective self-report of maltreatment status are limitations that challenge the broader field of researchers interested in exploring the impact of deleterious childhood events on adult adaptation. Adult self-report measures are likely to identify traumatic experiences among a larger percentage of the population of interest; however, it is also true that alternative measurement instruments (i.e., informant studies) are likely to miss cases (Stoltenborgh et al., 2014; Waxman et al., 2014). This issue may be especially clear when one considers emotional abuse in childhood, which is now recognized as a significant risk for adult outcome (e.g., Spertus et al., 2003). Unlike the kinds of child abuse that are likely to yield legal and medical evidence, emotional abuse may be unobserved and, therefore, require subjective participant report. We should note that when we administered a second instrument, the Trauma Symptom Checklist, both in our previous studies and in this present sample we found evidence that responses on the CTQ co-varied in predictable ways with trauma symptoms. Although we did not seek out participants who were clinically diagnosed or in treatment (currently or in the past), we cannot determine the proportion of our volunteer sample that may fit this description. We assume that our sample is heterogeneous with respect to the specific profiles of trauma history (e.g., age and frequency of exposure), early risk and protective factors both at the levels of the individual (e.g., genetic vulnerability) and the context (e.g., neighborhood effects, etc.).
Further, we assume risk and protective factors also vary in each individual's current life (e.g., possible adult trauma, other life stressors). It is important to stress that we cannot determine the degree to which our sample is biased based on our recruitment from an introductory psychology course participant pool. It is conceivable that students recruited from this course differ from other students at the university in ways we cannot account for. For example, introductory psychology courses may appeal more strongly to students who carry psychiatric or environmental risk. We are not aware of any college studies of maltreatment that satisfy methodological requirements to draw epidemiological conclusions. Our continued work in this area is designed to add more measures, such as clinical interviews, to better characterize our sample beyond the CTQ. The CTQ is a survey that is validated and well established in the literature, but still suffers from some degree of measurement error that could either depress or inflate correlations.

A second limitation is that our sample was majority female. To a large extent, this mirrors the demographics of the university population from which it was drawn; however, it may be the case that the title of our study selectively attracted female participants over males. It should be noted that a substantial proportion of studies examining history of childhood maltreatment involve exclusively female samples (e.g., Aspelmeier et al., 2007; Elliot et al., 2009; Kendra et al., 2012); nevertheless, our findings may not generalize to a more diverse population of individuals with maltreatment histories. To address this limitation, our future research will broaden our recruitment of participants. For example, we plan to include a sample of student veterans, a population that may be characterized by trauma at two developmental stages (Orcutt et al., 2003) and that is at high risk for academic difficulties and attrition (Rumann and Hamrick, 2007; DiRamio et al., 2008). The above limitations regarding the homogeneity of the sample highlight the potential problem of random versus non-random measurement error, likely inherent in most research in this domain. Non-random error causes underestimation of relationship, and presently we cannot disambiguate the degree to which our error reflects random or non-random factors. Recruitment of larger and more heterogeneous samples will allow for a more robust indirect effects mediational model to be identified with attention to potential model fitting errors.

A third limitation concerns our use of only two hot EF tasks in this study. It is important to note that we selected the two tasks carefully to be consistent with the relatively new and emerging hot EF literature. The IGT is considered to be a classic hot EF task as it is sensitive to the function of the ventromedial prefrontal cortex (Bechara et al., 1994; Damasio, 1999) and requires the execution of EF skills such as inhibition and flexibility under conditions of heightened arousal (i.e., incentives). The GNG is a well-established measure of inhibition that has been similarly "heated" with arousing stimuli (e.g., human faces) in past research (e.g., Hare et al., 2008), as we did in this study. Thus, we examined two accepted hot EF tasks, each with several indices of performance. We found that adaptive responding on a particular block of the IGT, and reaction time to the neutral face condition of the GNG were significant mediators between a history of maltreatment and GPA and self-reported college adaptation, respectively. Our future research will explore potential mechanisms (e.g., cognitive strategies, emotion regulation) underlying each of these findings. Further, we will continue to examine the degree to which other hot EF tasks, such as a heated working memory paradigm, demonstrate the same associations with both history of childhood maltreatment and college adaptation and achievement.

A fourth and final limitation involves our small effects. In this first study to examine the indirect effects of EF on the association between childhood maltreatment history and college outcomes, the indirect effects we did identify were small, albeit significant. However, it should be noted that research designed to explain individual differences in the multifactorial domains of college adaptation and achievement (e.g., GPA) typically identifies very small effects of single variables, even for factors much more proximal to the college outcomes (e.g., intelligence, self-efficacy, metacognition) than those we examined in the current study (e.g., Murray and Wren, 2003; Kitsantas et al., 2008). Therefore, our small indirect effects utilizing only two hot EF tasks are very much in line with the effect sizes reported in the college adaptation and attrition literature, while also contributing new information regarding the potential pathways from history of maltreatment to college outcomes.

While we certainly acknowledge these various limitations, we must stress that this work represents the early stages of a novel approach to understanding an understudied subgroup of vulnerable college students. The overarching goal of our research program is to identify individuals within the large heterogeneous group of college students who might benefit from targeted interventions that are informed by our findings regarding the mediational pathways between childhood trauma and academic outcomes. From the start, our approach has rested on the assumption that any effort to identify patterns of relative risk within the large heterogeneous group with a maltreatment history must involve an exploration of current phenotype. The importance of understanding current cognitive and emotional factors reflects the multifactorial relationship between child developmental history and academic outcome. In this first, exploratory investigation we identified hot executive processes as a potential mediational connection. While our results should be considered preliminary and in need of further study, we believe that the examination of EFs within relatively hotter contexts may be a very fruitful direction for researchers interested in understanding the mechanisms that connect child maltreatment history to academic outcome. Promoting successful adaptation in the college setting may be one of our most effective means of contributing to positive life outcomes for individuals who carry developmental risk.

#### REFERENCES

Adolphs, R. (2002). Recognizing emotion from facial expressions: psychological and neurological mechanisms. Behav. Cogn. Neurosci. Rev. 1, 21–61. doi: 10.1177/1534582302001001003

#### ETHICS STATEMENT

This study was carried out in accordance with the recommendations of the Institutional Review Board for the Protection of Human Subjects in Research of the University of Northern Colorado, with written informed consent from all subjects. All subjects gave written informed consent in accordance with the American Psychological Association ethical guidelines. The protocol was approved by the Institutional Review Board of the University of Northern Colorado.

#### AUTHOR CONTRIBUTIONS

MW and EP collaborated in the conceptualization of the study, as well as in its design, execution, and the preparation of this manuscript. MJ contributed her expertise in data analysis and interpretation, as well as collaborated in the preparation of this manuscript.

#### FUNDING

This study was supported by The Avielle Foundation to EP and MW; Research Dissemination and Faculty Development Award to MW and EP. Publication of this article was funded in part by the University of Northern Colorado Fund for Faculty Publication.


behaviors in youth. J. Community Psychol. 36, 989–1007. doi: 10.1002/jcop.20277


structure and relationships with correlates and consequences. Educ. Psychol. Rev. 24, 133–165. doi: 10.1007/s10648-011-9184-5



Zhang, Q., and Wang, Y. (2004). Trends in the association between obesity and socioeconomic status in U.S. adults: 1971 to 2000. Obes. Res. 12, 1622–1632.

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Welsh, Peterson and Jameson. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

## Relationships between Motor and Executive Functions and the Effect of an Acute Coordinative Intervention on Executive Functions in Kindergartners

#### Marion Stein<sup>1</sup> \*, Max Auerswald<sup>2</sup> and Mirjam Ebersbach<sup>1</sup>

<sup>1</sup> Department of Psychology, University of Kassel, Kassel, Germany, <sup>2</sup> Department of Quantitative Methods in Psychology, Ulm University, Ulm, Germany

There is growing evidence indicating positive, causal effects of acute physical activity on cognitive performance of school children, adolescents, and adults. However, only a few studies examined these effects in kindergartners, even though correlational studies suggest moderate relationships between motor and cognitive functions in this age group. One aim of the present study was to examine the correlational relationships between motor and executive functions among 5- to 6-year-olds. Another aim was to test whether an acute coordinative intervention, which was adapted to the individual motor functions of the children, causally affected different executive functions (i.e., motor inhibition, cognitive inhibition, and shifting). Kindergartners (N = 102) were randomly assigned either to a coordinative intervention (20 min) or to a control condition (20 min). The coordination group performed five bimanual exercises (e.g., throwing/kicking balls onto targets with the right and left hand/foot), whereas the control group took part in five simple activities that hardly involved coordination skills (e.g., stamping). Children's motor functions were assessed with the Movement Assessment Battery for Children 2 (Petermann, 2009) in a pre-test (T1), 1 week before the intervention took place. Motor inhibition was assessed with the Simon says task (Carlson and Wang, 2007), inhibition and shifting were assessed with the Hearts and Flowers task (Davidson et al., 2006) in the pre-test and again in a post-test (T2) immediately after the interventions. Results revealed significant correlations between motor functions and executive functions (especially shifting) at T1. There was no overall effect of the intervention. However, explorative analyses indicated a three-way interaction, with the intervention leading to accuracy gains only in the motor inhibition task and only if it was tested directly after the intervention. 
As an unexpected effect, this result needs to be treated with caution but may indicate that the effect of acute coordinative exercise is temporally limited and emerges only for motor inhibition, but not for cognitive inhibition or shifting. More generally, in contrast to other studies including older participants and endurance exercises, no general effect of an acute coordinative intervention on executive functions was revealed for kindergartners.

Keywords: executive functions, physical activity, acute exercise, coordinative intervention, kindergartners, cognition

#### Edited by:

Dieter Baeyens, KU Leuven, Belgium

#### Reviewed by:

Hilde Van Waelvelde, Ghent University, Belgium

Julius Verrel, Max Planck Institute for Human Development (MPG), Germany

> \*Correspondence: Marion Stein marion.stein@uni-kassel.de

#### Specialty section:

This article was submitted to Developmental Psychology, a section of the journal Frontiers in Psychology

Received: 31 January 2017 Accepted: 10 May 2017 Published: 30 May 2017

#### Citation:

Stein M, Auerswald M and Ebersbach M (2017) Relationships between Motor and Executive Functions and the Effect of an Acute Coordinative Intervention on Executive Functions in Kindergartners. Front. Psychol. 8:859. doi: 10.3389/fpsyg.2017.00859

## INTRODUCTION


Children's increasing use of technological devices, such as smartphones or computers, promotes a sedentary lifestyle at least in industrial societies. The minimum of 60 min of daily physical activity, as recommended by the World Health Organization [WHO] (2010), is met by only one third of the children in Germany (Manz et al., 2014) and in the United States (Centers for Disease Control and Prevention, 2008). This not only has an effect on their physical development and health (Janssen and LeBlanc, 2010) but may also affect their cognitive development. Correlative studies with children revealed positive relationships between cognitive functions and physical activity (Campbell et al., 2002; Becker et al., 2014) as well as between cognitive and motor functions (Livesey et al., 2006; Davis et al., 2011). Moreover, intervention studies suggested that both acute (i.e., one-time) and chronic (i.e., repeated) physical exercise may cause beneficial effects on subsequent cognitive functions of children aged older than 6 years, adolescents, and adults (for meta-analyses, see Sibley and Etnier, 2003; Verburgh et al., 2014).

These positive effects of physical activity on cognitive functions can be explained by physiological and developmental mechanisms: First, physical activity might elicit physiological changes, such as enhancing the cerebral blood flow (e.g., Herholz et al., 1987) and increasing the release of neurotransmitters – factors that are assumed to positively affect cognitive functions (Chmura et al., 1998; Winter et al., 2007). Second, motor development and cognitive development are closely interrelated (e.g., Sibley and Etnier, 2003). According to Piaget (1972), the first concepts that are acquired in infancy are based on sensorimotor experiences. The skills and relations learned through these experiences can be transferred to cognitive problems and therewith form the basis of further cognitive development. In addition, acquiring and executing new and more complex motor movements requires and stimulates cognitive functions (Ackerman, 1987; Best, 2010). This stimulation occurs, for instance, also during team sports (e.g., soccer), where players have to cooperate with team mates, anticipate their movements, develop strategies, and switch between changing task conditions (Best, 2010). The interrelation between motor and cognitive development is also reflected by the existence of neuronal connections between the cerebellum (responsible, for instance, for the control and temporal coordination of movements) and the prefrontal cortex (responsible, for instance, for executive functions; Raichle et al., 1994; Diamond, 2000). The simultaneous activation of the cerebellum and prefrontal cortex primarily occurs in cognitive or motor tasks, which are complex, unknown, require fast reactions, or underlie changing conditions (Diamond, 2000). Thus, it would be suggestive to use the relationship between motor and cognitive functions to enhance one or the other by means of interventions.

Besides benefitting from physical activity in the long run, it could also be reasonable that an acute bout of physical activity induces a short-term increase of cognitive performance, for example, if it is executed before a school test or a difficult learning situation. Most of the intervention studies reporting positive effects of acute physical activity on cognitive functions of children used aerobic exercise interventions and were conducted in individual settings, in which the physical intensity of the intervention was strictly controlled (e.g., Hillman et al., 2009; Ellemberg and St-Louis-Deschênes, 2010; Pontifex et al., 2013). Only a few studies examined the efficacy of acute coordinative interventions on cognitive functions so far. These acute coordinative interventions were often conducted in group settings and were therewith less controlled. Nevertheless, they also yielded positive effects on cognitive functions (e.g., Budde et al., 2008; Jäger et al., 2014). Despite the small number of studies including acute coordinative interventions, several researchers theoretically assumed a superior effect of coordinative in comparison to aerobic exercise interventions (e.g., Budde et al., 2008; Best, 2010). This assumption is grounded in higher demands of motor control and cognitive functions (e.g., spatial orientation) for coordinative exercises compared to aerobic exercises (Budde et al., 2008; Voelcker-Rehage et al., 2011). Coordinative interventions might thus not only evoke physiological changes, as already mentioned (e.g., general enhanced release of neurotransmitters in the brain), but additionally stimulate the neuronal network between cerebellum and prefrontal cortex due to their higher motor and cognitive complexity. This stimulation could function as a pre-activation for the subsequent cognitive performance (Diamond, 2000; Budde et al., 2008), for instance, by an increased release of neurotransmitters in these specific areas.

However, how complex or demanding a coordinative exercise actually is for a child depends mainly on the level of his or her motor and cognitive functions (McMorris, 2009). Because of this interaction between interventional and individual factors, the abilities of each child should be taken into account when designing a coordinative intervention, which has rarely been done so far. In the current study, we aimed to enhance the executive functions of kindergartners (i.e., 5- to 6-year-olds) by means of an individually executed, acute coordinative intervention that was adapted to each kindergartner's motor performance.

In particular, executive functions can be positively affected by both kinds of physical activity (e.g., Tomporowski et al., 2008b; McMorris and Hale, 2012). Executive functions are fundamental cognitive processes that are responsible for goal-directed behavior, especially in new and non-automated situations (Banich, 2009). They include updating, inhibition, and shifting. Updating means monitoring and modifying mental representations in working memory. This is required, for instance, in order to remember plans and to evaluate available behavioral alternatives. Inhibition involves the suppression of predominant and automated reactions as well as resistance to distraction. It includes controlling one's behavior and attention instead of being affected by external stimuli and emotions. Shifting allows switching attention between different tasks or rules, enabling fast and flexible adjustments to changing conditions (Miyake et al., 2000; Diamond, 2006). Executive functions play a central role in current and future academic achievement (St Clair-Thompson and Gathercole, 2006; Bull et al., 2008; Best et al., 2011) as well as in social competence and the occurrence of externalizing behavior (Nigg et al., 1999; Ciairano et al., 2007; Best et al., 2009). Thus, early interventions that enhance children's executive functions before they enter school might help to promote their social and academic development.

Until now, only a few studies have examined the effect of an acute bout of physical activity on cognitive functions of kindergartners (Palmer et al., 2013; Mierau et al., 2014), which is astonishing given that this age range can be conceived as a sensitive period in which cognitive and brain development progress rapidly (Brown and Jernigan, 2012). In particular, inhibition as one of the executive functions develops markedly at kindergarten age (Best et al., 2009; Best and Miller, 2010; Röthlisberger et al., 2010). In general, a positive effect of acute physical activity on inhibition has already been demonstrated for different age groups including older children (e.g., Hillman et al., 2009; Ellemberg and St-Louis-Deschênes, 2010; for a review see Barenberg et al., 2011). However, the only study with kindergartners in this regard, examining the effect of an acute coordinative group intervention (Palmer et al., 2013), showed only a marginal effect on inhibition. Furthermore, the study by Mierau et al. (2014), which included an acute group intervention based on aerobic exercise games (e.g., soccer), failed to find an effect on kindergartners' shifting. Therefore, further studies are needed to clarify whether the effects of acute physical activity interventions found for older children, adolescents, and adults also emerge in kindergartners.

The present study had two aims. First, the correlational relationships between motor and executive functions in kindergartners were examined. Several studies reported a positive, moderate relationship between these functions in this age group (Livesey et al., 2006; Cameron et al., 2012; Roebers et al., 2014). However, only a few studies included shifting (e.g., Mierau et al., 2016) and motor inhibition (Sereno et al., 2006) as aspects of executive functions. Motor inhibition requires suppressing a dominant motor action, while cognitive inhibition requires focusing attention on a relevant cue and ignoring an irrelevant cue (Sereno et al., 2006). To consider a broader variety of executive functions, shifting and motor inhibition were included in addition to cognitive inhibition in the current study.

Second, we investigated whether an acute, adaptive coordinative intervention yielded causal effects on specific executive functions of kindergartners. Besides an expected general effect of the intervention, we assumed that the three assessed executive functions (i.e., motor inhibition, cognitive inhibition, shifting) would be affected differently. The efficacy of an acute physical activity intervention on cognitive inhibition of older children and adolescents has been shown in several studies (e.g., Jäger et al., 2014; for a review see Verburgh et al., 2014). However, it is still unclear whether this finding can be replicated in kindergartners. Furthermore, some researchers assumed that the efficacy of physical activity on executive functions depends on the developmental status of the child and of the executive function, in that more highly developed executive functions should benefit more (Tomporowski et al., 2008b; Best, 2010). Besides the activation of common brain regions in the prefrontal cortex, different executive functions are also associated with distinct brain regions, which follow different developmental courses (Olson and Luciana, 2008; Best and Miller, 2010). In particular, brain regions associated with shifting fully mature only between late adolescence and early adulthood (Olson and Luciana, 2008). Accordingly, the neurophysiological basis for shifting could be too immature among kindergartners to show great changes due to physical activity in this age group. This led us to the assumption that cognitive inhibition, which is better developed than shifting in kindergartners, should benefit more from physical activity than shifting. Furthermore, the efficacy of an acute bout of physical activity on motor inhibition as one aspect of executive functions has rarely been examined. We expected that the coordinative intervention would be more effective for motor inhibition than for cognitive inhibition or shifting due to the greater congruency between the coordinative intervention and the motor inhibition tasks: Both require that whole-body movements are inhibited.

## MATERIALS AND METHODS

### Design

The experiment followed a 2 × 2 × 3 mixed design with experimental condition (i.e., acute coordinative intervention condition vs. control condition) and order of the tasks assessing executive functions (i.e., the "Hearts-and-Flowers" task to assess cognitive inhibition and shifting first or the "Simon-says" task to assess motor inhibition first) as between-subjects factors and type of executive function (motor inhibition vs. cognitive inhibition vs. shifting) as within-subjects factor. Accuracy and reaction times in the executive function tasks, measured 1 week before the experimental conditions (T1), were included as predictors in the respective linear mixed model. The dependent variables were accuracy and mean reaction times in the executive function tasks, conducted immediately after the experimental conditions (T2). During the coordinative intervention, motor performance (e.g., how often a ball was thrown at a target and how often the target was hit) was recorded, and the physical intensity was assessed in both conditions by recording children's heart rates. In addition, children's motor functions were assessed at T1 to test whether they correlated with executive functions at T1.

### Sample

Ethical consent for the experiment was obtained from the faculty's ethics committee<sup>1</sup>. Initially, 135 kindergartners were recruited from nine local kindergartens in a medium-sized town in Germany after their parents signed a consent form. The children had intermediate socio-economic backgrounds and spoke and comprehended German fluently. Several children had to be excluded due to being absent on the day of the experimental intervention (n = 13), failure to measure or to reach the targeted physical intensity level in the coordinative intervention (n = 11), or lacking motivation or comprehension during the executive function tasks (n = 11). The remaining sample consisted of 101 kindergartners aged 60 to 85 months. These children were randomly assigned to one of two experimental conditions: an acute coordinative intervention condition (n = 48, mean age: M = 72.2 months, SD = 5.2, 24 males) or a control condition (n = 53, mean age: M = 72.3 months, SD = 6.9, 25 males). Preliminary analyses revealed that there were no significant differences between the drop-outs and the remaining sample with regard to the motor or executive functions (ps > 0.091).

<sup>1</sup>This study was carried out in accordance with the recommendation of the ethics committee of the Faculty of Human Sciences of the University of Kassel with written informed consent from all legal guardians of the subjects in accordance with the Declaration of Helsinki.

### Assessment of Executive Functions and Motor Functions and Order of Tasks

Three aspects of executive functions (motor inhibition, cognitive inhibition, and shifting; Miyake et al., 2000; Sereno et al., 2006) were assessed individually by means of two tasks at two times: at T1, 1 week before, and at T2, immediately after the coordinative intervention or control condition. Each task took approximately 10 min.

To assess motor inhibition, the "Simon-says" task (Strommen, 1973) was adapted from Carlson and Wang (2007). In this task, the children were asked to imitate ten simple movements, each of which was first named and performed by the investigator, who was facing the child (e.g., "touch your nose"). However, movements were only to be imitated if the investigator said "Simon says" before naming and performing the movement (i.e., imitation trial). Otherwise, the child had to stay still and suppress the imitation (i.e., inhibition trial). At the beginning of the "Simon-says" task, the investigator demonstrated all movements that the child had to imitate to ensure that he or she was able to perform them. Afterwards, practice trials were conducted until the child reacted correctly in an inhibition trial and a subsequent imitation trial. The main task that followed consisted of five imitation and five inhibition trials, which were presented intermixed in one of two fixed orders. One second after the child imitated the movement, or after 3 s if the child did not react, a new trial was demonstrated. After the fifth trial, the investigator reminded the child of the imitation rules. Only inhibition trials were considered in the statistical analyses. They were evaluated with a score between 0 and 3 (i.e., 0: full movement, 1: partial movement, 2: flinch, 3: no movement; Carlson and Meltzoff, 2008). Thus, across all five inhibition trials, a total score between 0 and 15 points could be achieved. The dependent variable was the percentage of the total motor inhibition score (i.e., accuracy in %).
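The scoring rule described above can be illustrated with a minimal sketch. The function name and the example scores are hypothetical; only the 0 to 3 rating scale, the five inhibition trials, and the conversion to a percentage come from the text.

```python
from typing import Sequence

# Rating per inhibition trial, as described in the text:
# 0 = full movement, 1 = partial movement, 2 = flinch, 3 = no movement.
MAX_SCORE_PER_TRIAL = 3


def motor_inhibition_accuracy(trial_scores: Sequence[int]) -> float:
    """Return the achieved share of the maximum inhibition score, in percent."""
    if not trial_scores:
        raise ValueError("at least one inhibition trial is required")
    if any(score not in (0, 1, 2, 3) for score in trial_scores):
        raise ValueError("each trial score must be 0, 1, 2, or 3")
    max_total = MAX_SCORE_PER_TRIAL * len(trial_scores)
    return 100.0 * sum(trial_scores) / max_total


# Hypothetical child: ratings on the five inhibition trials.
example_scores = [3, 2, 3, 0, 1]
example_accuracy = motor_inhibition_accuracy(example_scores)  # 9 of 15 points
```

With five trials the maximum is 15 points, so the hypothetical child above reaches 9 of 15 points, that is, 60%.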

Cognitive inhibition and shifting were assessed with the computer-based "Hearts-and-Flowers task" (Davidson et al., 2006) using E-Prime Software (Psychology Software Tools, Pittsburgh, PA, United States). The task was presented on a laptop (Dell, Vostro 3700, 17.3 inches, distance to monitor: 50 cm) and the child had to react to the trials on a separate keyboard that was placed in front of the child and which only consisted of a left and a right button. The child was presented with one of two stimuli: a heart or a flower, which had the same size (i.e., 3.8 cm × 3.9 cm) and color (i.e., red). The stimuli emerged on the right or the left side of a rectangle (7.2 cm × 28.6 cm), located in the center of the screen. There were three blocks (i.e., congruent, incongruent, and mixed), presented in a fixed order, each with 20 trials. In the first, congruent block, a heart appeared on the right or left side in the rectangle. The child had to press the button that was located on the same side as the heart. This block assessed the speed of information processing. In the second, incongruent block, a flower appeared on the right or left side in the rectangle. Now, the child had to push the button that was on the opposite side of the flower. Because of the dominant tendency to push the button on the same side on which the stimulus appears as the attention was focused to this side (i.e., Simon effect; Simon et al., 1976), the incongruent block required cognitive inhibition. In the third, mixed block, ten hearts and ten flowers appeared one after another in a fixed, pseudo-random order. The fixed order was chosen to realize the same difficulty (i.e., the same number of switches between congruent and incongruent trials) for all children and both times of measurement. Due to the permanent change of stimulus type (i.e., a total of 16 switches; the same stimulus type appeared maximally two times in succession), this block assessed shifting.

Children were instructed to react as quickly and accurately as possible in all blocks. Before the congruent and incongruent blocks started, children practiced the rules in at least four trials. Each practice trial was presented on the display until any button was pushed. The congruent and incongruent blocks started as soon as two of four successively shown practice trials were completed correctly. If the child reacted correctly to fewer than two trials, the four practice trials were repeated until two correct answers were given. Before the mixed block started, the two rules were repeated but no practice trials were executed. Each trial began with a fixation cross (500 ms), followed by a white slide (500 ms), the target stimulus (heart or flower, max. 1500 ms), and ended with another white slide (500 ms). The dependent variables were the mean reaction time (in ms) in the correct trials and the accuracy (in %) in the incongruent block, both assessing cognitive inhibition, and the mean reaction time (in ms) in the correct trials and the accuracy (in %) in the mixed block, both measuring shifting. We decided not to include the congruent block in the statistical analysis due to high accuracy rates and small variance between the children. All reaction times shorter than 200 ms were interpreted as random reactions and were excluded (Davidson et al., 2006). Furthermore, reaction times deviating more than three SD from the individual mean were also excluded (cf. Roebers and Kauer, 2009). Concerning accuracy, all trials were analyzed, independent of the exclusion of the associated reaction times, to treat random responses equally. Missing reactions (no response within 2000 ms) were interpreted as wrong responses. Children with fewer than 20% correct trials were excluded from the data analysis (n = 4).
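The exclusion rules for reaction times and the accuracy rule can be sketched as follows. This is an illustration under stated assumptions, not the authors' processing script: the text does not say whether the individual mean and SD were computed before or after dropping responses faster than 200 ms (the sketch drops them first), and all function names and data are hypothetical.

```python
from statistics import mean, stdev

MIN_RT_MS = 200  # faster responses are treated as random reactions


def clean_reaction_times(rts_ms, sd_cutoff=3.0):
    """Filter one child's correct-trial reaction times (in ms).

    Step 1: drop reaction times shorter than 200 ms.
    Step 2: drop values more than `sd_cutoff` SDs from the child's own mean
            (assumption: mean and SD are taken after step 1).
    """
    plausible = [rt for rt in rts_ms if rt >= MIN_RT_MS]
    if len(plausible) < 2:
        return plausible
    m, sd = mean(plausible), stdev(plausible)
    return [rt for rt in plausible if abs(rt - m) <= sd_cutoff * sd]


def block_accuracy(responses):
    """Percent correct in a block; None (no response in time) counts as wrong."""
    return 100.0 * sum(1 for r in responses if r is True) / len(responses)


# Hypothetical data: one anticipatory response (150 ms) and one very slow one.
raw_rts = [150] + [680, 700, 720] * 6 + [1450]
cleaned = clean_reaction_times(raw_rts)
mean_rt = mean(cleaned)

# Hypothetical mixed block: 11 correct, 7 wrong, 2 missing responses.
responses = [True] * 11 + [False] * 7 + [None] * 2
accuracy = block_accuracy(responses)
```

Accuracy is computed on all trials, whereas the mean reaction time uses only the cleaned values, which mirrors the separation described in the paragraph above.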

The order of the tasks was counterbalanced between participants and was identical for both times of measurement within participants. There were two possible task orders: Simon-says task first or Hearts-and-Flowers task first. The three blocks of the Hearts-and-Flowers task needed to be presented in sequential order to maintain the same difficulty level for all children. Since it was assumed that the first task after the intervention could benefit the most, whereas the cognitive resources for the second task could be limited due to performing the first task, order of tasks was considered as an independent variable.

Motor functions were assessed with the German version of the "Movement Assessment Battery for Children – Second Edition" (M-ABC 2; Petermann, 2009). It consists of eight tasks that can be assigned to three scales: manual dexterity, ball skills, and balance. All children were examined with the task set for the age band 3–6 years. One child was already 7 years old, but as we did not use norm values and this child did not score at the maximum, this posed no problem. The test took about 20–30 min. For the correlational analyses, the raw scores of each task were z-standardized and summed up for each scale to obtain a norm-independent score. The sum of these three scores formed the total score for motor functions.
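The construction of the norm-independent motor scores can be sketched as below. The task names and raw values are invented for illustration, the sample standard deviation is an assumption (the text does not specify it), and the sketch assumes that higher raw scores always mean better performance; timed tasks, where lower is better, would have to be reversed first.

```python
from statistics import mean, stdev


def z_scores(values):
    """z-standardize raw scores across children (sample SD assumed)."""
    m, sd = mean(values), stdev(values)
    return [(v - m) / sd for v in values]


def scale_score(task_scores):
    """Sum the z-standardized task scores of one scale for each child.

    task_scores maps a task name to a list of raw scores, one per child,
    with children in the same order in every list.
    """
    z_by_task = [z_scores(raw) for raw in task_scores.values()]
    return [sum(child) for child in zip(*z_by_task)]


def total_score(scale_scores):
    """Sum the scale scores (e.g., dexterity, ball skills, balance) per child."""
    return [sum(child) for child in zip(*scale_scores)]


# Hypothetical raw scores of four children on two ball-skill tasks.
ball_skills = scale_score({
    "catching": [4, 6, 8, 10],
    "throwing": [2, 3, 5, 6],
})
```

Because every task is standardized within the sample, each scale score has a mean of zero, and a child's value expresses his or her standing relative to the other children rather than to published norms.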

### Experimental Conditions: Coordinative Intervention and Control Condition

Each experimental condition took about 25 min (20 min exercise and 5 min instructions) and was executed with each child individually in the kindergarten. The order of the exercises in both conditions was counterbalanced between children. The acute coordinative intervention started with a 2 min running warm-up. Afterward, the children participated in four coordinative exercises (4.5 min each), which likewise required both sides of the body (bimanual and bipedal). The exercises included jumping in diverse combinations, balancing on a rope, bouncing a ball (Exercise 1), throwing balls on targets and running in diverse combinations (Exercise 2), kicking balls on targets and catching balls (Exercise 3), as well as boxing and kicking against a gymnastic ball (Exercise 4).

The acute coordinative intervention was adapted to the motor performance of each child during the intervention. Each exercise consisted of three to five difficulty levels with increasing motor and inhibition demands<sup>2</sup>. Whether a child achieved a higher level depended on the errors (e.g., missing a target) the child made on the previous level. For example, the second subexercise, "throwing balls" with both hands, consisted of three levels: On the first level, children threw balls into a box from a distance of 1.5 m. If at least four of five balls were on target, the next level was reached, in which the box was placed at a distance of 2.0 m, and on the third level at a distance of 2.5 m from the child. Thus, a child could only reach a higher task level by making fewer errors. A research assistant recorded the performance of the children during the intervention and signaled when the current exercise level was completed and whether the child was allowed to proceed to a higher level.
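The adaptive rule for the "throwing balls" example can be written as a small state update. The distances and the four-of-five criterion come from the text; the function names, the round structure, and the assumption that a child simply stays on the current level when the criterion is missed are illustrative only.

```python
# Box distance per level for the "throwing balls" example (from the text).
THROW_DISTANCES_M = [1.5, 2.0, 2.5]
BALLS_PER_ROUND = 5
HITS_NEEDED = 4  # at least four of five balls on target


def next_level(level: int, hits: int) -> int:
    """Return the 0-based level for the next round.

    Assumption: a child who misses the criterion repeats the current level,
    and a child on the highest level remains there.
    """
    if not 0 <= hits <= BALLS_PER_ROUND:
        raise ValueError("hits must be between 0 and 5")
    if hits >= HITS_NEEDED and level < len(THROW_DISTANCES_M) - 1:
        return level + 1
    return level


def final_level(hits_per_round) -> int:
    """Apply the rule to a sequence of rounds, starting on the first level."""
    level = 0
    for hits in hits_per_round:
        level = next_level(level, hits)
    return level


# Hypothetical child: 3, 4, 5, and 5 hits in four consecutive rounds.
reached = final_level([3, 4, 5, 5])
reached_distance = THROW_DISTANCES_M[reached]
```

In this hypothetical sequence the child repeats the first level once, then advances twice and finishes at the 2.5 m distance.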

The control condition also started with a warm-up (2 min.), in which the children stamped different pictures on freely chosen locations on a blank sheet of paper. Subsequently, four different tasks (4.5 min each) were executed: playing three different board games and watching a short movie. The board games included simple actions like pushing a button, putting objects in a container, or moving a meeple on the board in maximally three steps, depending on the number of points on a previously drawn card. The games were played interactively with short waiting times for the child to take the next action. Therefore, the execution of the tasks required little to no motor or cognitive resources.

### Manipulation Check

To measure the physical intensity in the coordinative intervention and control condition, children's heart rate was assessed with a Polar RS800sd watch and an H1 sensor belt. According to the inverted U-shaped relation between physical arousal and cognitive performance (Yerkes and Dodson, 1908), a moderate physical intensity was expected to lead to optimal levels of cerebral blood flow and neurotransmitter release, and thereby to maximize cognitive performance (Timinkul et al., 2008; McMorris, 2009). Therefore, the aim was for the children in the coordinative intervention condition to exercise at a moderate intensity level (i.e., 65–70% of maximal heart rate). The maximal heart rate (HRmax) was estimated by the formula HRmax = 208 − 0.7 × age (cf. Mahon et al., 2010). The children of the examined sample had a theoretical HRmax of approximately 204 beats per minute (bpm). Thus, a moderate intensity was reached by a target heart rate between 122 and 153 bpm. During the intervention, the investigator checked the heart rate every 20 s and adapted the frequency of movements to keep it within the target heart rate range. A successful manipulation should yield a higher heart rate in children in the coordinative intervention condition than in children in the control condition. In fact, analyses revealed that children of the coordinative intervention exercised at a moderate intensity level (M = 136 bpm, SD = 9) that was significantly higher than in the control condition (M = 103 bpm, SD = 10), t(99) = 17.42, p < 0.001, d = 3.50.
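The heart rate arithmetic can be retraced with a short sketch. The formula and the bpm values are taken from the text; the age of 6 years is an assumed round value for this sample, and the function names are hypothetical.

```python
def estimated_hr_max(age_years: float) -> float:
    """HRmax = 208 - 0.7 * age, the estimate named in the text."""
    return 208 - 0.7 * age_years


def heart_rate_range(hr_max: float, lower: float, upper: float):
    """Return the bpm range for a lower and upper fraction of HRmax."""
    return hr_max * lower, hr_max * upper


hr_max = estimated_hr_max(6)   # about 203.8 bpm for an assumed age of 6 years
hr_max_rounded = round(hr_max)  # 204 bpm, as reported

# Fractions of HRmax implied by the reported 122-153 bpm target range.
implied_fractions = (122 / hr_max_rounded, 153 / hr_max_rounded)

# Range that 65-70% of 204 bpm would give.
range_65_70 = heart_rate_range(hr_max_rounded, 0.65, 0.70)
```

For a 6-year-old the formula gives about 204 bpm. The reported target range of 122 to 153 bpm corresponds to roughly 60 to 75% of that value, whereas 65 to 70% would span about 133 to 143 bpm, so the stated percentages and the bpm range do not map onto each other exactly. The observed mean of 136 bpm lies within both ranges.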

Furthermore, we adapted the coordinative exercises to the motor functions of each individual child by means of applying different levels of task difficulty during the coordinative intervention (see Experimental Conditions: Coordinative Intervention and Control Condition). The descriptive statistics in **Table 1** confirm that children accomplished different levels in each coordinative exercise. In addition, the sum of the accomplished levels across coordinative exercises was positively correlated with total score of children's motor functions, r(47) = 0.61, as well as with the sub-scores of manual dexterity, r(47) = 0.58, ball skills, r(49) = 0.37, and balance, r(49) = 0.51, ps < 0.001. This suggests that children with better motor skills completed the coordinative exercises on more advanced levels.

### Statistical Analyses

Firstly, partial correlations between all executive and motor functions were calculated with age as control variable. Secondly, to check whether the dependent measures changed significantly from T1 to T2, we calculated paired t-tests with accuracy and mean reaction times of the three executive function tasks as dependent variables. We also analyzed whether there were any differences between the conditions concerning children's motor and executive functions at T1. In addition, two mixed models<sup>3</sup> were computed to examine whether children in the coordinative intervention condition showed higher accuracies and lower reaction times at T2 in the executive function tasks than children in the control condition. In the mixed model with accuracy as dependent variable, planned comparisons were used to test whether this improvement was greater for motor inhibition compared to cognitive inhibition and shifting (task 1) as well as greater for cognitive inhibition than for shifting (task 2). Since reaction times were only recorded in the cognitive inhibition and shifting tasks, the mixed model with reaction times as dependent variable only allowed for a planned comparison between these two tasks. In addition, the independent variables condition (coordinative intervention vs. control condition) and order of executive function tasks (Hearts-and-Flowers task first or Simon-says task first) were included in both models. The independent variables could only be included as fixed effects because random slopes require a minimum of two observations.

<sup>2</sup>For an overview of each task, difficulty levels, and criteria to reach a higher level, see the Supplementary Material.

TABLE 1 | Percentage of children who maximally reached a certain level in the intervention condition (n = 48), separately for each exercise.

"–": Task only contained three levels.

Each mixed model, analyzed with the package lmerTest (Kuznetsova et al., 2016) in the statistical computing software R (R Core Team, 2016), accounted for the nesting of response data within children and kindergartens and controlled for score differences at T1. Statistical assumptions (normal distribution and variance homogeneity) for the linear mixed models were checked visually by inspecting the residual plots and were judged to be sufficiently met.

<sup>3</sup>Mixed model for accuracy:

$$
\begin{aligned}
\text{ACC}_{\text{T2},ijk} = {} & \beta_0 + \beta_1\,\text{condition}_{ijk} + \beta_2\,\text{ACC}_{\text{T1},ijk} + \beta_3\,\text{order}_{ijk} + \beta_4\,\text{task1}_{ijk} + \beta_5\,\text{task2}_{ijk} \\
& + \beta_6\,\text{condition}_{ijk}\,\text{order}_{ijk} + \beta_7\,\text{condition}_{ijk}\,\text{task1}_{ijk} + \beta_8\,\text{condition}_{ijk}\,\text{task2}_{ijk} \\
& + \beta_9\,\text{order}_{ijk}\,\text{task1}_{ijk} + \beta_{10}\,\text{order}_{ijk}\,\text{task2}_{ijk} \\
& + \beta_{11}\,\text{condition}_{ijk}\,\text{order}_{ijk}\,\text{task1}_{ijk} + \beta_{12}\,\text{condition}_{ijk}\,\text{order}_{ijk}\,\text{task2}_{ijk} \\
& + u_{0j} + u_{0k} + \varepsilon_{ijk}
\end{aligned}
$$

For each measurement i, child j, and kindergarten k. ACC = accuracy (at T1 or T2), RT = reaction time (at T1 or T2), task1 = planned comparison for motor inhibition vs. cognitive inhibition and shifting, task2 = planned comparison for cognitive inhibition vs. shifting, $u_{0j}$ = random intercept for each child, $u_{0k}$ = random intercept for each kindergarten.

## RESULTS

Mean accuracy and reaction time of all three executive function tasks for both times of measurement as well as z-standardized scores of the motor functions for T1 are presented in **Table 2**, separately for each experimental condition. Two preliminary MANOVAs were calculated to check whether children of the coordinative intervention and the control condition differed at T1 concerning their performance in the motor function tasks and the executive function tasks. There was no difference between the two experimental conditions at T1, either concerning children's motor performance, F(3,95) = 0.30, p = 0.827, or concerning their executive functions (accuracy: F(3,97) = 0.40, p = 0.396; reaction time: F(2,98) = 0.05, p = 0.954).

TABLE 2 | Means and standard deviations of the executive functions and the z-standardized motor functions, separately for each experimental condition (N = 101).

Standard deviation in parentheses; T1/T2, first/second time of measurement; ACC, accuracy in %; RT, reaction times in ms; <sup>1</sup>Sum of z-standardized scores; <sup>2</sup>n = 99, two children did not solve all tasks of manual dexterity.

### Correlations between Motor and Executive Functions at T1

Partial correlations between motor and executive functions at T1, controlled for age, are presented in **Table 3**. In line with the first hypothesis, correlations between executive functions and motor functions at T1 were positive, ranging from small to moderate effect sizes (Cohen, 1988). In particular, the accuracy in the shifting task correlated moderately with all motor functions. In addition, the reaction times in the cognitive inhibition task correlated positively and moderately with all motor functions with the exception of balance, whereas the accuracy in the cognitive inhibition task yielded no significant correlations with any motor function. Moreover, the reaction times in the shifting task were also not associated with any motor function. Motor inhibition only showed a significant positive moderate correlation with balance.

TABLE 3 | Partial correlations between executive and motor functions across both experimental conditions at the pre-test T1, controlled for age (N = 101).

<sup>1</sup>Sum of z-standardized scores; <sup>2</sup>n = 99, two children did not solve all tasks of manual dexterity; <sup>∗</sup>p < 0.0025, <sup>∗∗</sup>p < 0.0005; ACC, accuracy; RT, mean reaction time.
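A first-order partial correlation, as used for the relationships between motor and executive functions with age as control variable, can be computed from three pairwise Pearson correlations. The sketch below uses the standard formula; the function names and the toy data (ages in months plus invented scores) are hypothetical and serve only to show how controlling for age can remove an association that is driven by age alone.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation of two equally long sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_r(x, y, z):
    """Correlation of x and y with the linear influence of z removed."""
    r_xy, r_xz, r_yz = pearson_r(x, y), pearson_r(x, z), pearson_r(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Toy data: both scores rise with age; their remaining parts are unrelated.
age_months = [60, 66, 72, 78, 84]
motor = [a + e for a, e in zip(age_months, [1, -2, 0, 2, -1])]
executive = [a + e for a, e in zip(age_months, [1, 0, -2, 0, 1])]

raw_correlation = pearson_r(motor, executive)
age_adjusted = partial_r(motor, executive, age_months)
```

In the toy data the raw correlation is very high, yet the partial correlation is zero, because the two scores share nothing beyond their common dependence on age. A positive partial correlation therefore indicates an association that goes beyond shared age-related growth.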

### Effect of the Acute Coordinative Intervention on Performance in the Executive Function Tasks

#### Accuracy

Paired t-tests showed a significant gain in accuracy (in %) from T1 to T2 across the coordinative intervention and the control condition in motor inhibition, t(100) = −3.53, p < 0.001, d = −0.35, cognitive inhibition, t(100) = −6.19, p < 0.001, d = −0.62, and shifting, t(100) = −8.40, p < 0.001, d = −0.84. The mixed model with accuracy in the three executive function tasks at T2 as dependent variable yielded no significant main effect of the experimental condition, β<sub>1</sub> = 1.83, t(91.89) = 1.50, p = 0.137<sup>4</sup>. It should be noted that the power to detect differences between both conditions in the present sample, assuming a medium effect size of f<sup>2</sup> = 0.15 (Cohen, 1988), was large enough: 1 − β = 0.97 (Faul et al., 2007). Therefore, the second hypothesis had to be rejected: Given that there was no difference between children in the coordinative intervention condition and the control condition at T1 (see the preliminary analyses at the beginning of the Results), the acute coordinative intervention did not lead to a higher overall gain in accuracy in the executive function tasks from T1 to T2 than the control condition. However, there was a significant three-way interaction between condition, type of executive function task (task1: motor inhibition vs. cognitive inhibition and shifting), and order of tasks, β<sub>11</sub> = 3.91, t(191.59) = 2.93, p = 0.004. Post hoc tests revealed that the accuracy in the motor inhibition task at T2 was higher in the coordinative intervention condition (M = 73.3%, SD = 44.5%) than in the control condition (M = 53.9%, SD = 34.6%), β = 9.77, t(39.33) = 2.87, p = 0.007, but only if the Simon-says task was presented first (**Figure 1**)<sup>5</sup>. Thus, the coordinative intervention led to a larger improvement in motor inhibition than the control condition under a specific condition, which at the same time confirms our hypothesis that motor inhibition profits more (or at all) from a coordinative intervention compared to other executive functions. Furthermore, there was a significant effect of accuracy at T1 on the accuracy at T2 across all executive function tasks and conditions, β<sub>2</sub> = 0.66, t(256.53) = 16.16, p < 0.001: Children showed a higher accuracy in the executive function tasks at T2 if their accuracy was higher at T1. Moreover, the accuracy of the three executive function tasks (across both conditions) differed at T2: It was lower for motor inhibition compared to cognitive inhibition and shifting, β<sub>4</sub> = −3.95, t(195.27) = −2.92, p = 0.004, and lower for shifting compared to cognitive inhibition, β<sub>5</sub> = −2.78, t(213.34) = −2.23, p = 0.027. No other effects or interactions were significant.

<sup>4</sup>All regression coefficients are unstandardized and all factors were effect coded with the exception of task (see Statistical Analyses).

<sup>5</sup>The results were confirmed by a direct model comparison analysis, χ<sup>2</sup>(2) = 8.83, p = 0.012.
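To make the predictor structure of the accuracy model more concrete, the following sketch builds the fixed-effect predictors for one observation. The text states only that condition and order were effect coded and that task entered as two planned comparisons (motor inhibition vs. cognitive inhibition and shifting; cognitive inhibition vs. shifting). The concrete code values, their signs, and all names below are assumptions chosen for illustration, and the T1 covariate is omitted.

```python
from fractions import Fraction

# Assumed effect codes (+1 / -1); the sign convention is not given in the text.
CONDITION_CODES = {"intervention": 1, "control": -1}
ORDER_CODES = {"simon_says_first": 1, "hearts_and_flowers_first": -1}

# Assumed orthogonal contrast codes for the two planned comparisons:
# task1 = motor inhibition vs. the two other tasks,
# task2 = cognitive inhibition vs. shifting.
TASK_CONTRASTS = {
    "motor_inhibition": (Fraction(2, 3), Fraction(0)),
    "cognitive_inhibition": (Fraction(-1, 3), Fraction(1, 2)),
    "shifting": (Fraction(-1, 3), Fraction(-1, 2)),
}


def design_row(condition: str, order: str, task: str) -> dict:
    """Return main-effect and interaction predictors for one observation."""
    c = CONDITION_CODES[condition]
    o = ORDER_CODES[order]
    t1, t2 = TASK_CONTRASTS[task]
    return {
        "condition": c, "order": o, "task1": t1, "task2": t2,
        "condition:order": c * o,
        "condition:task1": c * t1, "condition:task2": c * t2,
        "order:task1": o * t1, "order:task2": o * t2,
        "condition:order:task1": c * o * t1,
        "condition:order:task2": c * o * t2,
    }


row = design_row("intervention", "simon_says_first", "motor_inhibition")
```

The three-way term `condition:order:task1` is the predictor that carries the reported interaction: it separates motor inhibition from the other two tasks, and does so differently depending on condition and task order. Under the assumed codes, each contrast sums to zero across the three tasks and the two contrasts are orthogonal to each other.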

#### Reaction Times

Paired t-tests showed a significant reduction of reaction times from T1 to T2 across both experimental conditions in cognitive inhibition, t(100) = 7.88, p < 0.001, d = 0.78, and shifting, t(100) = 5.48, p < 0.001, d = 0.55. Note that no reaction times were assessed in the motor inhibition task. The mixed model with mean reaction times (in ms) at T2 as dependent variable revealed no significant effect of the experimental condition, β<sub>1</sub> = 5.77, t(91.74) = 0.52, p = 0.608 (power corresponds to that of accuracy), and there were no significant interactions with this variable. Consequently, our hypothesis has to be rejected: Given that there was no difference between children in the coordinative intervention and the control condition at T1 (see the preliminary analyses at the beginning of the Results), the coordinative intervention did not lead to a greater reduction of reaction times for the two executive functions than the control condition. However, there was a significant effect of reaction times at T1 on reaction times at T2 across both executive function tasks and conditions, β<sub>2</sub> = 0.55, t(193) = 11.32, p < 0.001: Children showed shorter reaction times at T2 if they had shorter reaction times at T1. In addition, across both conditions, children showed shorter reaction times at T2 in cognitive inhibition (M = 743 ms, SD = 175) than in shifting (M = 945 ms, SD = 200), β<sub>4</sub> = −42.37, t(132.3) = −5.12, p < 0.001<sup>6</sup>.
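The effect sizes reported with the paired t-tests for accuracy and reaction times can be related to the t values. The text does not state which formula was used; the sketch below assumes the common paired-samples conversion d = t / √n with n = 101 (df = 100) and checks it against the reported values. Function and variable names are hypothetical.

```python
from math import sqrt

N_CHILDREN = 101  # paired t-tests were reported with df = 100


def cohens_d_from_paired_t(t_value: float, n: int) -> float:
    """Effect size for paired samples under the assumption d = t / sqrt(n)."""
    return t_value / sqrt(n)


# (t value, reported d) for each paired comparison in this section.
reported = {
    "motor inhibition, accuracy": (-3.53, -0.35),
    "cognitive inhibition, accuracy": (-6.19, -0.62),
    "shifting, accuracy": (-8.40, -0.84),
    "cognitive inhibition, reaction time": (7.88, 0.78),
    "shifting, reaction time": (5.48, 0.55),
}

recomputed = {
    name: cohens_d_from_paired_t(t, N_CHILDREN)
    for name, (t, _) in reported.items()
}
```

Under this assumption, all five recomputed values agree with the reported effect sizes after rounding to two decimals. The negative signs for accuracy and the positive signs for reaction times both reflect improvement from T1 to T2, because the tests subtract T2 from T1.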

## DISCUSSION

One aim of the present study was to examine the relationship between motor functions and executive functions in kindergartners. In particular, shifting (accuracy) and cognitive inhibition (mean reaction time) correlated significantly with almost every motor function whereas motor inhibition only showed a significant correlation with balance. A second aim was to investigate whether there is a causal effect of an acute coordinative intervention on different aspects of executive functions in kindergartners. In general, the acute coordinative intervention had no greater effect on executive functions of kindergartners than the control activity. However, if motor inhibition was tested as first executive function immediately after the intervention, the children of the coordinative intervention condition appeared to perform more accurately in the motor inhibition task than children of the control condition. These results are discussed in more detail in the following.

### Correlations between Motor and Executive Functions

Our findings concerning positive correlations between motor and executive functions are largely in line with previous research. For instance, motor functions were related to cognitive and motor inhibition (Livesey et al., 2006; Cameron et al., 2012), shifting (Mierau et al., 2016), and global measures of executive functions (Röthlisberger et al., 2010; Roebers et al., 2014) in kindergartners and older children. More specifically, manual dexterity, balance, and ball skills, used as indicators of children's motor functions in the present study (cf. Petermann, 2009), were positively correlated with kindergartners' executive functions. All of these motor functions require the precise execution and constant adaptation of movements, that is, an elaborate coordination between visual perception and motor processes. It can therefore be assumed that executive functions at least partly guide these processes. Even though in the current study the effect sizes of several correlations between motor functions and motor inhibition on the one hand and accuracy in the cognitive inhibition task on the other hand were comparable to previous studies (e.g., Roebers and Kauer, 2009; Röthlisberger et al., 2010), they failed to reach significance when the significance level was adjusted for multiple testing. Furthermore, reaction times in the shifting task did not correlate with any motor function. Given children's poor performance in this task (i.e., accuracy about 54%) and given the fact that only reaction times of correctly completed trials were considered, it can be presumed that mean reaction time in the shifting task is not a reliable measure and therefore yielded no significant correlation with the motor functions.

<sup>6</sup> In addition, a bootstrap analysis confirmed the results of the linear mixed model.
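To illustrate why correlations of moderate size can lose significance once the significance level is adjusted for multiple testing, the following sketch applies a Bonferroni correction. The correction method is an assumption made for illustration only, since this section does not restate which adjustment was used, and the p-values and labels are hypothetical rather than taken from the study.

```python
from typing import Dict


def bonferroni_significant(p_values: Dict[str, float], alpha: float = 0.05) -> Dict[str, bool]:
    """Flag each test as significant under a Bonferroni-adjusted threshold (alpha / number of tests)."""
    adjusted_alpha = alpha / len(p_values)
    return {name: p < adjusted_alpha for name, p in p_values.items()}


# Hypothetical p-values for illustration; NOT results from the study
example_p = {
    "balance ~ motor inhibition": 0.002,
    "ball skills ~ motor inhibition": 0.030,
    "manual dexterity ~ cognitive inhibition (accuracy)": 0.040,
    "balance ~ shifting (accuracy)": 0.008,
}

result = bonferroni_significant(example_p)
for name, significant in result.items():
    print(f"{name}: {'significant' if significant else 'not significant'}")
```

With four tests, the adjusted threshold is 0.05 / 4 = 0.0125, so the hypothetical correlations with p = 0.030 and p = 0.040 would be significant at the unadjusted level but not after the adjustment.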

To conclude, our correlational results suggest that motor functions and executive functions are interrelated in kindergartners. It can be assumed that these relationships are based on shared developmental and learning mechanisms (e.g., Piaget, 1972; Ackerman, 1987; Best, 2010) as well as on the joint activation of an underlying neuronal network that connects brain regions associated with motor and cognitive functions (Diamond, 2000). Furthermore, general biological maturation processes might lead to a parallel increase of executive and motor functions (Luo et al., 2007). The underlying causal mechanisms of these relationships are still not fully understood, but some studies indicated bi-directional effects between motor and executive functions (e.g., Weinert and Schneider, 1999; Roebers et al., 2014). One attempt to uncover a causal relationship between motor and executive functions was the implementation of an acute coordinative intervention.

### Effect of the Acute Coordinative Intervention on Executive Functions

In the current study, there was no general effect of the acute coordinative intervention on the examined executive functions in kindergartners. Children in both the intervention and control condition reacted faster and more accurately in the executive function tasks at T2 compared to T1. These results contradict studies reporting positive effects of coordinative interventions and aerobic exercise interventions on inhibition (Barenberg et al., 2011; Jäger et al., 2014) and shifting (Ellemberg and St-Louis-Deschênes, 2010; Chen et al., 2014), revealed for children older than 6 years, adolescents, and adults. However, other studies failed to find effects of acute aerobic exercise or coordinative interventions on shifting and inhibition of kindergartners (Mierau et al., 2014) as well as on shifting of adolescents (Kubesch et al., 2009) and of overweight children (Tomporowski et al., 2008a). One might assume that at least some of these contradictory findings can be attributed to differences in the general design of the mentioned studies (i.e., the setting in which the intervention was conducted; type of acute physical activity; measures of executive functions; and examined age groups). If this is true, it also suggests that the causal effects of acute physical activity interventions on executive functions are not as robust as one might expect and emerge only under certain conditions that still need to be uncovered.

Besides expecting a general effect of the acute coordinative intervention, we assumed that the size of the effect would depend on the kind of executive function tested: The effect on motor inhibition should be greater compared to the effect on cognitive inhibition and shifting. Indeed, if motor inhibition was tested first, children of the coordinative intervention condition showed a higher accuracy in this task than children of the control condition. Even though this finding should be interpreted cautiously, it might be based on the closer correspondence between the coordinative intervention and the requirements of the motor inhibition task in contrast to the other executive function tasks: At higher difficulty levels of the coordinative intervention, the children had to inhibit whole-body movements based on specific rules and commands, an ability that is also required in the motor inhibition task. However, this effect no longer emerged if motor inhibition was tested as the second task after the intervention. One explanation could be that due to the cognitive demands of the first task, the cognitive resources to solve the second executive function task were reduced. In the few studies with children older than 6 years of age and adolescents that examined more than one cognitive function, the positive effects of acute physical activity on some of these cognitive functions were reported independently of the order in which the tasks were presented (e.g., Cooper et al., 2012; Jäger et al., 2014). However, kindergartners' attentional and cognitive resources are more limited than those of older children (Bjorklund and Harnishfeger, 1990; Best et al., 2009). Therefore, executing one cognitive task could reduce the cognitive resources for the following task. This assumption should be tested in future studies because so far, studies with kindergartners only examined the effect on one cognitive function at a time (e.g., Palmer et al., 2013; Mierau et al., 2014).

One reason for the lack of a general effect of the coordinative intervention in the current study might be the setting in which the intervention took place. The interventions of the studies mentioned earlier, reporting positive effects, were often conducted either as coordinative interventions in group settings (Chen et al., 2014; Jäger et al., 2014) or as individually executed, aerobic exercises (Hillman et al., 2009; Ellemberg and St-Louis-Deschênes, 2010; Pontifex et al., 2013). Group settings pose higher social demands as participants have to anticipate the intention and behavior of other participants, adapt their own behavior based on that and switch their behavior between changing conditions. This anticipation and adaptation directly requires cognitive functions, such as attention and executive functions (Diamond, 2000), so that group settings could stimulate the preactivation of these functions for a subsequent executive function task. The higher efficacy of acute interventions with a social component on neuronal brain structures was already shown for rats: Free wheel running in addition to living in groups led to greater neurogenesis in the hippocampus (associated with memory) than individual wheel running of rats living in isolation (Stranahan et al., 2006). Taken together, interventions in group settings might enhance cognitive functioning by their social demands and can have a high ecological validity in contrast to interventions in individual settings, but they also bear difficulties concerning the control of the physical intensity of the interventions and of the correct execution of movements.

Studies of aerobic exercise interventions conducted in individual settings also found positive effects on diverse cognitive functions (cf. Hillman et al., 2009; Ellemberg and St-Louis-Deschênes, 2010; Pontifex et al., 2013). One advantage of those settings is that the physical intensity for each child can be individually adapted. Based on this adaptation, a precise, moderate intensity level can be achieved, which provides an optimal arousal level for the subsequent cognitive performance (Yerkes and Dodson, 1908). Therefore, studies using an individual setting to implement physical activity have a high internal validity.

The intervention in the current study is a mixture of the designs mentioned above, including a coordinative intervention conducted in an individual setting. Even though it was a coordinative intervention, it was easier to control the physical intensity for each child, compared to a group setting. The only study with a partly similar design using an individual setting was conducted by Best (2012), testing the efficacy of one coordinative, cognitively engaging intervention and one intervention that only included repetitive movements in 6- to 10-year-olds. There was a positive effect for both interventions on inhibition. However, in contrast to the current study, the interventions were computer-based and thus strictly controlled, as each child received exactly the same procedure and executed the same amount of movements.

Taken together, it might be assumed that stable positive effects of acute interventions on executive functions can only be achieved if the interventions include social components (e.g., the interaction with others), or if they allow for controlling the physical intensity and the correct execution of movements (or both). More generally, it can be concluded that acute physical activity interventions can have positive effects on executive functions of kindergartners only under certain conditions. Thus, an acute intervention might not have a general, enhancing effect on executive functions.

An additional reason that might explain the temporally limited effect and the lack of a more general effect of the acute intervention on executive functions in the current study is the arousal level during the intervention, which could have been too low or otherwise inadequate. The physical intensity of the intervention was controlled to induce a moderate arousal, which should allow for an optimal cognitive performance (Yerkes and Dodson, 1908; McMorris, 2009). However, determining a moderate arousal for an individual child also depends on his or her aerobic capacity. The moderate intensity of the intervention in this study was only approximately estimated for all children, depending on their age, and not individually identified. Therefore, for some children the physical intensity could have been higher or lower than their individual moderate level, which could have resulted in a suboptimal arousal and, thus, an inadequate cognitive stimulation. Besides, the time children were actually physically active was interrupted four times for about 45 s to give the instructions for the next exercise. These short pauses led to a decline in children's heart rates, which could also have reduced the effectiveness of the coordinative intervention.

The current study was the first to our knowledge that aimed to adapt the coordinative difficulty level of the exercises to the individual motor performance of kindergartners. The reason for this adaptation was to stimulate the neuronal network between the cerebellum and the prefrontal cortex, and to achieve a pre-activation for the subsequent executive function tasks (cf. Diamond, 2000; Budde et al., 2008). One way to achieve this activation is by means of complex tasks (Diamond, 2000). The complexity of the coordinative tasks depends on the individual motor functions of the children. However, in the current study all children started at the same difficulty level, which led to a longer exercise on a low difficulty level for children with high motor functions and therefore to a lower mean cognitive and motor demand in contrast to children with low motor functions. The adaptation thus was not optimal. In future studies, the individual performance during the intervention should be analyzed beforehand and participants should then start at different difficulty levels depending on their motor performance.

Besides these points of concern regarding the acute physical activity intervention, a potential interaction between the complexity and the intensity of the intervention has not been taken into account. Pesce (2009) describes that the efficacy of an acute physical activity intervention on cognitive functions depends on the interaction between task-related characteristics (e.g., duration, intensity, complexity of the intervention and of the cognitive task) and individual characteristics (e.g., the individual level of aerobic capacity, coordinative, and cognitive abilities). These interactions have an influence on the cognitive resources that can be provided due to the physical activity (Pesce, 2009). For the interaction between intervention intensity and the complexity of the cognitive task, an inverted U-shaped curve was assumed (e.g., McMorris, 2009) and confirmed in several studies (e.g., Kamijo et al., 2007; Chang et al., 2011), whereby a moderate intensity level led to a moderate arousal and to an optimal performance in a complex cognitive task. However, it is still unknown how the interaction between the complexity of an acute physical activity intervention and the cognitive task influences the effect of the intervention as well as in which way the intensity level and the complexity interact. Some studies varied the complexity of the acute intervention, while keeping the intensity constant (e.g., Pesce et al., 2009; Best, 2012), with inconsistent results: Some studies found a greater effect for complex interventions (Budde et al., 2008; Pesce et al., 2009). Other studies showed comparable positive effects of interventions with high and low complexity on cognitive functions (Best, 2012; Jäger et al., 2015). In contrast, one study showed a detrimental effect for an intervention with high complexity, which was explained by an excessively high arousal level and therefore a suboptimal cognitive precondition (Gallotta et al., 2012).
Similarly, in the current study the interaction between a moderate intensity and a moderate to high complexity level could have led to a mental or physical overload, in particular as a younger sample was involved here compared to the above-mentioned studies. Future studies should implement acute physical activity interventions with different demands concerning the complexity and intensity to allow for a more precise examination of the interaction between these interventional characteristics.

Another limitation of the current study derives from the measures used for assessing executive functions. To measure cognitive inhibition, the Hearts-and-Flowers task (Davidson et al., 2006) was applied although many studies assessed cognitive inhibition with the Flanker task (e.g., Eriksen and Eriksen, 1974; Chen et al., 2014; Jäger et al., 2015). It was chosen to avoid the potential ceiling effect concerning the accuracy of the Flanker task that was reported for kindergartners (e.g., Diamond et al., 2007). The Hearts-and-Flowers task therefore makes it possible to measure change due to physical activity for a greater percentage of children. However, the tasks apparently assess different aspects of inhibition: While the Flanker task assesses the resistance against distraction, the Hearts-and-Flowers task measures the resistance against a predominant response. The Flanker task thus might be more sensitive to the effect of acute physical activity. This assumption is supported by Kubesch et al. (2009), who found a positive effect of 30 min aerobic exercise on the performance of adolescents in the Flanker task, but not in the Hearts-and-Flowers task. An additional limitation of the applied measures is that only inhibition and shifting were assessed to represent the construct of executive functions without taking updating into account. In general, the evidence for the beneficial effects of acute physical activity on updating is inconsistent, including studies that found no effect (e.g., Cooper et al., 2012) and studies that found a positive effect (e.g., Jäger et al., 2015) for children and adolescents. Therefore, further studies should include measures for all three executive functions to allow for generalized predictions on the effect of acute physical activity on executive functions.
Moreover, it has been suggested that executive functions do not only involve cognitive "cold" functions, as assessed in the present study, but also "hot" executive functions that refer to affective cognitive abilities (Zelazo and Müller, 2002). Such "hot" executive functions play a central role in many situations in which decisions have to be made that might have marked emotional consequences and that require the control of emotional arousal. Social and behavioral aspects of hot executive functions can be differentiated. Social aspects involve, for instance, negotiations with other persons or solving interpersonal conflicts. Behavioral aspects include abilities like waiting for a delayed gratification (Mischel et al., 1989) or choosing the less risky but less promising alternative in a gambling game (Kerr and Zelazo, 2004). "Hot" aspects of executive functions might also benefit from physical activity, especially if it is executed in group settings that involve emotionally relevant aspects, such as social comparisons or waiting until it is one's turn. Thus, future research might widen the focus to uncover whether there are effects of physical activity on a broader range of executive functions.

Until now, little evidence exists for the positive effect of an acute intervention on cognitive functions of kindergartners. Palmer et al. (2013) showed that the attention of kindergartners benefitted from an acute coordinative bout, but there was only a marginal effect on inhibition. Similarly, the study of Mierau et al. (2014) and the current study did not find a general effect of an acute coordinative intervention on inhibition or shifting for this age group. Thus, the results from older children, adolescents, and adults could not be replicated for kindergartners so far. A possible explanation is the still immature state of the prefrontal cortex in kindergartners, a brain region that is associated with executive functions (Gogtay et al., 2004). Accordingly, the neuronal association between the prefrontal cortex and the cerebellum might not yet be sufficiently developed to enable a co-activation in complex executive function tasks or motor tasks. Therefore, basal cognitive functions like attention, which develops earlier (Garon et al., 2008), might be more amenable to improvement by an acute bout of physical activity in kindergartners than executive functions. Further studies with kindergartners are needed to draw a reliable conclusion as to whether acute physical activity can benefit cognitive functions in this age group.

## CONCLUSION AND IMPLICATIONS

Taken together, the current study revealed small to moderate relationships between executive and motor functions in kindergartners. Although the concrete underlying processes of this association are not fully understood, it can be assumed that motor and executive functions affect each other bi-directionally. Nevertheless, in the current study the coordinative intervention did not lead to a larger gain of kindergartners' executive functions in general, compared to a control condition. The intervention enhanced only motor inhibition, and only if it was tested first after the intervention. Thus, there is no simple way to enhance executive functions of kindergartners in general by acute coordinative interventions. Instead, such interventions might yield specific effects, depending on the design, and further research might uncover the conditions under which these effects occur. It might also be promising to investigate the effect of an acute bout of physical activity on measures of executive functions that are more closely related to classroom learning and involve an emotional component (e.g., waiting for a turn or suppressing impulsive reactions) as well as effects on more basal cognitive functions in addition to executive functions in kindergartners. Moreover, it could be useful to measure neurophysiological (e.g., brain activity by means of event-related potentials) and physiological parameters (e.g., release of neurotransmitters) to analyze the underlying processes. Even if no overt effect of an acute bout of physical activity in behavioral measures is evident, compensatory mechanisms optimizing task performance could be uncovered by this means.

## AUTHOR CONTRIBUTIONS

MS designed and conducted the study and wrote large parts of the manuscript. ME supported designing the study, wrote parts of the manuscript and critically revised it for important intellectual content. MA wrote large parts of the results section and supported the statistical analyses and interpretation of the data. All authors gave their final approval of the manuscript version to be published and agreed to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.

## SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpsyg.2017.00859/full#supplementary-material

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Stein, Auerswald and Ebersbach. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

## Mindfulness Plus Reflection Training: Effects on Executive Function in Early Childhood

Philip David Zelazo<sup>1</sup>\*, Jessica L. Forston<sup>2</sup>, Ann S. Masten<sup>1</sup> and Stephanie M. Carlson<sup>1</sup>

<sup>1</sup> Institute of Child Development, University of Minnesota, Minneapolis, MN, United States, <sup>2</sup> Learning Tree Yoga, Minneapolis, MN, United States

Executive function (EF) skills are essential for academic achievement, and poverty-related stress interferes with their development. This pre-test, post-test, follow-up randomized-control trial assessed the impact of an intervention targeting reflection and stress reduction on children's EF skills. Preschool children (N = 218) from schools serving low-income families in two U.S. cities were randomly assigned to one of three options delivered in 30 small-group sessions over 6 weeks: Mindfulness + Reflection training; Literacy training; or Business as Usual (BAU). Sessions were conducted by local teachers trained in a literacy curriculum or Mindfulness + Reflection intervention, which involved calming activities and games that provided opportunities to practice reflection in the context of goal-directed problem solving. EF improved in all groups, but planned contrasts indicated that the Mindfulness + Reflection group significantly outperformed the BAU group at Follow-up (4 weeks post-test). No differences in EF were observed between the BAU and Literacy training groups. Results suggest that a brief, small-group, school-based intervention teaching mindfulness and reflection did not improve EF skills more than literacy training but is promising compared to BAU for improving EF in low-income preschool children several weeks following the intervention.

#### Edited by:

Mariëtte Huizinga, VU University Amsterdam, Netherlands

#### Reviewed by:

Anna Ridderinkhof, University of Amsterdam, Netherlands; Kimberly Schonert-Reichl, University of British Columbia, Canada

\*Correspondence: Philip David Zelazo, zelazo@umn.edu

#### Specialty section:

This article was submitted to Developmental Psychology, a section of the journal Frontiers in Psychology

Received: 24 October 2017; Accepted: 07 February 2018; Published: 26 February 2018

#### Citation:

Zelazo PD, Forston JL, Masten AS and Carlson SM (2018) Mindfulness Plus Reflection Training: Effects on Executive Function in Early Childhood. Front. Psychol. 9:208. doi: 10.3389/fpsyg.2018.00208

Keywords: mindfulness, reflection, executive function, intervention, preschool

## INTRODUCTION

Executive function (EF) skills (cognitive flexibility, working memory, inhibitory control) are essential for goal-directed problem solving and classroom learning, and as such, they are important for kindergarten readiness (see Zelazo et al., 2017, for a review). Relations between EF and academic achievement in early childhood are robust. Results of a meta-analysis showed a mean effect size of r = 0.27 across 75 studies of preschool and kindergarten age children, indicating a moderate and statistically significant association (Allan et al., 2014). Children who arrive at school with well-practiced EF skills may find it easier to sit still, pay attention, remember and follow rules, control their impulses, wait their turn, and flexibly consider new ideas and different perspectives. This, in turn, may initiate a cascade of beneficial consequences: Children may learn more easily, gain confidence, enjoy going to school, and get along better with teachers and peers. Moreover, EF skills, and the reflective processes that underlie them, may jointly allow for a more fully engaged, active, and intentional form of learning (Marcovitch et al., 2008; Zimmerman, 2008). Evidence indicates that preschoolers with better EF skills do indeed learn more from a given amount of instruction and practice (Welsh et al., 2010; Benson et al., 2013; Hassinger-Das et al., 2014; Bascandziev et al., 2016).

EF skills may be especially important for children from lower socioeconomic (SES) backgrounds, in part because of the bidirectional relations between EF and stress. Children with lower SES show lower levels of EF skill, even controlling for general cognitive skills (e.g., Mezzacappa, 2004; Noble et al., 2005; Farah et al., 2006; Obradović, 2010; Masten et al., 2012). They also show higher levels of stress and stress hormones, which undermine the use of EF skills and interfere with EF development (e.g., Evans and Schamberg, 2009; Blair et al., 2011; Hostinar et al., 2014). In contrast, strong EF skills may protect against the risks associated with poverty and adversity (Obradović, 2010; Masten et al., 2012). EF skills are instrumental in regulating stress (e.g., Zelazo and Lyons, 2012; Hostinar et al., 2014; Blair and Raver, 2015), so the combination of high stress and low EF skills may pose a substantial and potentially synergistic risk to healthy neurocognitive development and adaptation more generally (Masten, 2014).

A growing body of evidence indicates that EF skills can be fostered by relatively brief interventions that provide children with opportunities to practice their developing EF skills at increasing levels of challenge (e.g., Rueda et al., 2005; Karbach and Kray, 2009; Thorell et al., 2009; Mackey et al., 2011; Tominey and McClelland, 2011; Neville et al., 2013; Weiland and Yoshikawa, 2013; Schmitt et al., 2015; see Diamond and Lee, 2011, for a review). These interventions often require children to pause momentarily and reflect before responding: in other words, to be intentional about their cognition and behavior. The repeated engagement and use of reflection and EF skills in problem solving evidently strengthens those skills, increases the efficiency of the corresponding neural circuitry, and increases the likelihood that the skills will be activated in the future (Zelazo, 2015).

According to the Iterative Reprocessing model (e.g., Cunningham and Zelazo, 2007; Zelazo, 2015), reflection involves noticing challenges, pausing, considering the options, putting things into context prior to responding, and monitoring progress toward a goal. When children respond to situations reactively, without much reflection upon what they are doing, they are more likely to show classic EF failures, such as treating a new situation as if it were an old, familiar one.

Espinet et al. (2013) provided preschool-age children with ∼20 min of "reflection training" in the context of a challenging EF task, the Dimensional Change Card Sort (DCCS). Children who perseverated on this task were taught to pause before responding, reflect on the conflict inherent in the task, and formulate higher-order rules for responding flexibly: "In the color game, if it's a green pig, then it goes here; but in the shape game, that same green pig goes there." Compared to children who received only minimal yes/no feedback (without practice in reflection) and to children who received mere DCCS practice with no feedback at all, children who received reflection training showed significant improvements in performance on a subsequent administration of the DCCS. Improvements were also seen on other tasks, including a measure of flexible perspective taking (a false belief task), and these behavioral changes were accompanied by predictable changes in children's brain activity, specifically a reduction in the amplitude of the N2 component in the ERP.

Moriguchi et al. (2015) also provided 3- to 5-year-old children with practice on the DCCS, but then had children teach the rules to a puppet, which demands consideration and reconsideration of what is being taught. Compared to controls, trained children showed considerable improvement in performance on the DCCS along with increased brain activity (oxygenated hemoglobin) in the left lateral parts of prefrontal cortex.

In general, EF training studies suggest it is possible to train high-level skills like reflection and cognitive flexibility, with corresponding neural changes. A consequence is that trained networks become more efficient (e.g., Hebb, 1949), so reflection and executive function occur more automatically and more quickly, providing more time for thoughtful consideration of options prior to overt action or to decision making. Although there are questions about the extent to which the benefits of EF training transfer to new situations (e.g., Diamond and Lee, 2011, for review), it has been proposed that supplementing direct EF skills training with reflection training facilitates transfer by inducing metacognitive awareness of the skills and their range of application (Zelazo, 2015).

Another, complementary approach to reflection training explicitly addresses stress reduction through mindfulness (for review see Shapiro et al., 2014). Mindfulness is a practice that entails attending to one's moment-to-moment experiences and reflecting on them in a nonjudgmental and nonreactive way. Mindfulness may be cultivated through a variety of attentional exercises, such as those included in Mindfulness Based Stress Reduction training (MBSR; Kabat-Zinn, 2003), and has been applied in a range of contexts (e.g., Segal et al., 2013; Bögels and Restifo, 2014). For example, during mindful practice, adult individuals might initially intend to focus their attention on their breathing. When they notice that their mind has wandered, they simply bring their attention back to their breathing. As with reflection and EF skills, repeated practice in becoming reflectively aware of attentional lapses presumably renders the neural networks involved in attention regulation stronger and more efficient.

A growing literature indicates that repeated engagement in mindfulness practices does indeed improve performance on measures of EF and emotion regulation (e.g., Baer, 2003; Grossman et al., 2004; Ortner et al., 2007; Tang et al., 2007; Chambers et al., 2008; Heeren et al., 2009; Zeidan et al., 2010; Schonert-Reichl et al., 2015; Zoogman et al., 2015; Lyons and DeLange, 2016; Kaunhoven and Dorjee, 2017). Improvements in emotion regulation may mediate observed reductions in social anxiety, depression, and rumination (e.g., Goldin and Gross, 2010). In addition, however, practice being nonjudgmental may promote calmness and well-being, as may focusing on the present moment (e.g., instead of ruminating over a recollected source of anxiety; Kabat-Zinn, 2003).

In children, mindfulness training often includes small group activities designed to promote sustained introspective reflection on various experiences (e.g., Flook et al., 2010). For example, to foster awareness of internal states, children might describe how different parts of their bodies feel from head to toe. Props may scaffold these exercises; for example, holding a hula hoop around their bodies and moving it up and down helps children focus attention on a zone such as their shoulders, and a stuffed animal may be placed on children's abdomens to help them pay attention to their breathing as they lie down on a mat and breathe to lift the animal up and down.

In the current study, we examined the impact of a 6-week intervention for low-income preschoolers that combines reflection training and mindfulness. The combined intervention was delivered by trained teachers during 30 daily small-group sessions over 6 weeks in preschool classrooms. We expected that mindfulness activities and reflection training would provide a synergistic combination for boosting EF skills that would be well suited to this population. Whereas mindfulness training (e.g., belly breathing; body scan) was expected to help children calm down, regulate stress, become aware of moment-to-moment experience, and sustain attention, reflection training in the context of EF games should also help children recognize when they need to "go off autopilot" and instead act deliberately, relying on their EF skills to achieve their goals. Reflection training occurred in the context of 3 EF-challenging games presented with reflection protocols designed to prompt children's explicit consideration of their own thoughts, emotions, and behavioral tendencies in the context of goal-directed problem solving. The EF games were adapted from an EF intervention (Ready? Set. Go!) designed by the authors for use with homeless and highly mobile children (Casey et al., 2014). For each game, reflection protocols were designed to help teachers scaffold children's performance on the game, adjusting the degree of challenge to maintain engagement, and encourage children to notice sources of difficulty in the game and to acquire strategies for pausing, stepping back, and acting deliberately.

The active control condition (Literacy training) allowed for differentiating effects specific to the Mindfulness + Reflection training condition, controlling for receipt of an effective small-group pull-out intervention from a novel instructor for the same amount of time. We expected children in the Mindfulness + Reflection group to show greater improvement in EF at post-test and follow-up, compared to both Business as Usual (BAU) and Literacy children. Children in the Literacy condition were expected to show improvements on a standardized measure of early literacy (the Woodcock-Johnson III Letter-Word Identification subtest), compared to both BAU and Mindfulness + Reflection children. A measure of theory of mind served as a potential marker of improved awareness of self and other, and children in the Mindfulness + Reflection condition were expected to show the largest improvements.

## METHODS

#### Participants

The sample of 218 children (M = 57 months, SD = 3.7, range = 47–63 months) included all preschool children at two schools serving low-income families. One school in Houston, Texas, served children who were primarily Hispanic White: White = 55%; More than one = 32%, African American = 9%, Native American = 3%, Hispanic = 97.4%. The other school, in Washington, DC, served children who were African American (100%). The sample included 101 males and 117 females (53.7%; 50.5% in DC and 56.1% in Houston). The study protocol was approved by the Institutional Review Board for Human Participants at the University of Minnesota, and all parents were provided with written information about the study and received a passive (opt-out) consent form. Parents were invited to fill out a Family Information Questionnaire (FIQ) for a \$10 gift card. In DC, only 28 families (29%) returned a FIQ. In Houston, 91 did so (74%). The median reported family income for both sites was \$25,000–50,000 annually. See **Table 1** for demographic information by location.

#### Design

The sample size was determined based on the effect sizes reported in prior literature (e.g., Blair and Raver, 2014). An a priori power analysis using G∗Power (v. 3.1; Faul et al., 2007) indicated that a sample of 200 children should provide sufficient power (>0.8) to detect a small to moderate interaction effect of time by condition assuming α = 0.05. Within each school, children were randomly assigned to Mindfulness + Reflection (n = 72), Literacy (n = 76), or Business as Usual (n = 68) conditions. Business as Usual involved regular classroom activities at the Houston school, and a Second Step social-emotional learning intervention (Committee for Children, 2011) at the Washington, DC school. Primary dependent measures (executive function, theory of mind, teacher-rated behavior, and academic achievement) were administered at three time points: (1) within 2 weeks prior to the start of the 6-week intervention (Pre-test), (2) within 2 weeks following the intervention (Post-test), and (3) 4–6 weeks following the Post-test (Follow-up). Additional measures (intelligence and school district measures) were obtained at one time point only (Pre-test or Follow-up).

#### Measures

Several direct behavioral assessments were administered at pre-test, post-test, and follow-up. These included three measures of executive function, a measure of theory of mind, and a measure of early literacy. In addition, teacher ratings of children's behavior were obtained at each time point. Children's IQ was assessed at pre-test only. For one school (DC), we had access to additional data collected by the school district following the intervention.

TABLE 1 | Demographic information by location.

C1, Primary caregiver.

#### Executive Function

#### **Head-toes-knees-shoulders (Ponitz et al., 2008)**

Children were invited to play a game like "Simon Says." Following a practice round, in part 1, they were instructed to touch their head whenever the examiner said, "touch your toes" and vice versa. If the child passed this section, then in part 2, they were given the additional instruction to touch their knees whenever the examiner said, "touch your shoulders" and vice versa (10 trials). Each trial was scored as 0 (wrong action), 1 (self-correct), or 2 (correct), with up to 20 trials, for a total possible score of 0–40. This task was designed for ages 4–7 years, has adequate test-retest reliability (0.78; Lipsey et al., 2017), and takes 5–12 min.
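The scoring rule just described can be illustrated with a short sketch. The function name `score_htks` and the input format (a list of per-trial codes) are assumptions made for illustration only; they are not part of the published HTKS protocol or of the study's scoring materials.

```python
def score_htks(trial_codes):
    """Sum HTKS trial codes into a total score (0-40).

    trial_codes: list of per-trial codes, where
      0 = wrong action, 1 = self-correct, 2 = correct.
    A child who does not pass part 1 contributes fewer than 20 trials.
    (Hypothetical helper; the input format is an assumption.)
    """
    if len(trial_codes) > 20:
        raise ValueError("HTKS has at most 20 trials")
    if any(code not in (0, 1, 2) for code in trial_codes):
        raise ValueError("each trial must be coded 0, 1, or 2")
    return sum(trial_codes)


# Example: 10 part-1 trials only, with two self-corrections and one error.
part1_only = [2, 2, 1, 2, 0, 2, 2, 1, 2, 2]
total = score_htks(part1_only)
```

Dividing the total by 40 gives the proportion score used later when the EF tasks are combined into a composite.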

#### **Peg tapping (Diamond and Taylor, 1996)**

Children were given a wooden peg, identical to a peg held by the examiner. They were instructed to tap their peg twice when the examiner tapped his/hers once, and vice versa. Following up to two practice trials per instruction, there were 16 test trials, for a possible final score of 0–16. This task is appropriate for ages 3–5 years, has adequate test-retest reliability (0.80; Lipsey et al., 2017), and takes 5–7 min.

#### **Minnesota executive function scale (MEFS; Carlson and Zelazo, 2014)**

In this standardized computer tablet-based assessment designed for participants age 2 and up, children were instructed to sort virtual cards into one of two boxes on the screen according to an increasingly complex set of rules. The MEFS is nationally normed, has been used with over 30,000 children, and has adequate test-retest reliability (0.86; Carlson, 2017). Past studies have established multiple forms of criterion validity for the MEFS (e.g., Doom et al., 2014; Fuglestad et al., 2014; Hassinger-Das et al., 2014; Prager et al., 2016). Scores are automatically computed using an algorithm that combines accuracy and response time, and can range from 0 to 100. The MEFS is adaptive to children's ability and takes ∼4 min to complete.

#### Theory of Mind

#### **Theory of mind scale (Wellman and Liu, 2004)**

This measure consists of 5 brief vignettes in which children are asked to reason about the mental state of a protagonist, with increasing levels of difficulty (discrepant desire, knowledge/ignorance, discrepant belief, false belief, discrepant emotion). To receive credit for each level, they had to answer both the test and memory control questions correctly. Total scores could range from 0 to 5. The ordinal scale of this measure was confirmed in longitudinal research across the preschool period (Wellman et al., 2011).

#### Literacy

Literacy was assessed at all three time points using the Woodcock-Johnson III (WJ-III) Letter-Word Identification subtest (Woodcock et al., 2001). Items require children to identify and pronounce individual letters and words. Testing followed the standardized procedure with age-appropriate starting points. Raw scores were calculated based on the number of correct responses.

#### Teacher Report Measures

Teachers were invited to complete the Children's Behavior Questionnaire (CBQ; Very Short Form; Putnam and Rothbart, 2006), as well as the Child Behavior Rating Scale (CBRS; Bronson et al., 1990), at each time point (pre-test, post-test, and follow-up). The authors of each measure report adequate test-retest reliability. Teachers were compensated \$10 for each report in the form of gift cards (up to \$60 per child).

The 36-item CBQ-VSF asked parents to rate their child's temperament in a variety of situations and contexts. Twelve items each contributed to three subscales, Surgency, Negative Affect, and Effortful Control, with alphas of 0.75, 0.72, and 0.74, respectively (Putnam and Rothbart, 2006). Surgency reflects positive loadings for Impulsivity, High Intensity Pleasure, and Activity Level items, and negative loadings for Shyness items. Negative Affect reflects positive loadings for Sadness, Fear, Anger/Frustration, and Discomfort items and negative loadings for Falling Reactivity/Soothability items. Finally, Effortful Control reflects positive loadings for items indicating Inhibitory Control, Attentional Control, Low Intensity Pleasure, and Perceptual Sensitivity.

#### Additional Measures

IQ was estimated using the Stanford-Binet Early 5 (Abbreviated IQ; Roid, 2005) at one time point only (pre-test). Standard protocols and scoring methods were used.

In the Washington, DC group only, the school district administered Spring Assessments, including the Peabody Picture Vocabulary Test (PPVT-IV; Dunn and Dunn, 2007); the Devereux Early Childhood Assessment (DECA; LeBuffe and Naglieri, 2012), a teacher-report measure of the child's Attachment/Relationships, Behavioral Concerns, Initiative, and Self-control; the Test of Early Math Abilities (TEMA; Ginsburg and Baroody, 2003); and the Strategic Teaching and Evaluation of Progress (STEP; Kerbow and Bryk, 2005), a direct assessment of reading readiness.

#### Procedure

Four local teachers (two in each city) were recruited to deliver the two active interventions, Mindfulness + Reflection and Literacy. These teachers received a full day of training at the University of Minnesota. Two teachers were trained to administer activities in the 14-lesson mindfulness curriculum (see Appendix in Supplementary Material), as well as three EF-challenging games presented with reflection protocols. Two teachers were trained to administer early literacy lessons from the Opening the World of Learning (OWL) curriculum (see www.pearsonlearning.com/microsites/owl/main.cfm; Schickedanz and Dickinson, 2005).

Children were tested by trained assessors (n = 3 per site) individually at their schools in spare classrooms, staff rooms, or the cafeteria. At the Houston site, the assessors were bilingual in English and Spanish and presented the tasks in the child's preferred language. Pretesting took place prior to the start of the intervention (December–January). The interventions took place in January–February. Post-testing took place in the 2 weeks immediately following the intervention, and again 4–6 weeks later.

Both active interventions, Mindfulness + Reflection and Literacy, were provided to children during 30 small-group (8–12 children) sessions (24 min each; daily for 6 weeks). Children in the Business as Usual (BAU) group remained in the classroom and engaged in regularly scheduled activities and exercises; BAU children in DC received the Second Step intervention during this period. Children in the Mindfulness + Reflection group participated in a variety of brief (e.g., 2 min) mindfulness and relaxation practices adapted for children, along with three EF-challenging games: HTKS, Bear/Dragon/Simon Says, and Mother May I? The mindfulness exercises, often involving small props (e.g., a snow globe), were introduced and repeated across sessions (see Appendix in Supplementary Material for examples). The EF games each had six levels of EF challenge that allowed instructors to challenge children's skills continually to a moderate degree. Instructors encouraged children to notice and discuss their thoughts, emotions, and behavioral tendencies. For example, in Bear/Dragon/Simon Says, children start with a much easier version of Simon Says in which they are shown two puppets and first asked simply to follow the command of one puppet, then to ignore the command of another puppet, then to alternate between them, and so on through increasing levels of EF challenge (see **Table 2**).

Intervention teachers were also given other techniques for adjusting the level of EF challenge so that the activities continued to be challenging for most if not all children in the group. For example, they were told they could use exaggerated "nice" and "mean" voices to help children remember whom to obey, remind children to "use your brain" or adopt a 3rd-person perspective, and, once children became proficient at Bear/Dragon, try playing regular Simon Says.

The Literacy group received lessons taken from the OWL curriculum. This active control condition allowed for the identification of effects that are specific to the Mindfulness + Reflection training by providing control participants with cognitive enrichment activities, interaction with a novel teacher, and involvement in a program outside the classroom.

TABLE 2 | Adaptive levels of difficulty for Bear/Dragon/Simon Says.

| Level | Activity |
|---|---|
| 1 | Follow Bear |
| 2 | Don't Listen to Dragon (sitting on hands) |
| 3 | Don't Listen to Dragon (standing) |
| 4 | Bear and Dragon together with modeling |
| 5 | Bear and Dragon together without modeling |
| 6 | Reverse Bear and Dragon instructions |

## RESULTS

The initial sample included 218 children, and some data were missing from the final data set due to variations in teacher compliance (for teacher reported measures), child absences, or experimenter error. For direct behavioral measures, the final sample sizes ranged from 185 to 216 (mean N = 202). For teacher report measures, the final sample sizes ranged from 92 to 192 (mean N = 149). The majority of missing data were from teacher reports at Time 2, which came at a busy time in the Spring term. We examined how missingness on the key measures was correlated with other variables and discovered the only systematic factor was study location. Participants in DC were more likely to be missing Stanford Binet (r = −0.136), CBQ and CBRS at Time 1 (rs = −0.362), MEFS at Time 1 (r = −0.174), and Peg Tapping at Time 3 (r = −0.136), whereas participants in Houston were more likely to be missing several measures at Time 2, including Letter/Word Knowledge (r = 0.252), HTKS (r = 0.28), Theory of Mind Scale (r = 0.28), MEFS (r = 0.242), and Peg Tapping (r = 0.258) (all ps < 0.05). These patterns appeared to be due to logistical and staffing issues at the sites rather than differences in the children. Nevertheless, we included Location as a factor in the main analyses. Missing data were treated as missing using pairwise deletion in correlations and listwise deletion in repeated measures ANOVAs.
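As an illustration of the missingness check described above, the sketch below correlates a binary "missing" indicator with a binary location code using a plain Pearson correlation (equivalent to a phi coefficient for two binary variables). The coding DC = 0 and Houston = 1 is an assumption inferred from the signs of the reported correlations, the function name `pearson_r` is hypothetical, and the data are invented for the example.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Invented toy data: location coded 0 = DC, 1 = Houston (assumed coding);
# missing coded 1 when the measure is absent for that child.
location = [0, 0, 0, 1, 1, 1]
missing = [1, 1, 0, 0, 0, 0]

r = pearson_r(location, missing)  # negative: missingness is concentrated in DC
```

Under this coding, a negative correlation means the measure was more often missing in DC and a positive correlation means it was more often missing in Houston.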

All analyses were two-tailed with alpha set to 0.05. Children in the three randomly assigned groups did not differ significantly at Pre-test on age, sex, IQ (Stanford-Binet), or any of the pretest measures of literacy (WJ Letter/Word Knowledge), theory of mind (ToM Scale), or EF (HTKS, Peg Tapping, MEFS), all ps > 0.10 (see **Table 3**).

Correlations among all study variables at Pre-test are shown in **Table 4**. IQ was moderately correlated with several measures of EF, ToM, and Literacy, as expected. The three EF measures (HTKS, Peg Tapping, and MEFS) were moderately correlated with one another (showed intra-individual reliability over time), thus we computed composite EF scores for each time point, by averaging the proportion scores on each EF task (proportion out of 40 on HTKS, out of 16 on Peg Tapping, and out of 100 on MEFS), yielding an EF score (0–1.0) for Pre-test, Post-test, and Follow-up for each individual. This method maximized our N for the overall EF analyses by accommodating missing data on a single EF measure. Data on one or more EF tasks were missing for 7% of participants.
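The composite described above can be sketched as follows. This is a minimal illustration and not the authors' analysis code: the names `MAX_SCORES` and `ef_composite` are hypothetical, and averaging over whichever tasks are available is an assumption about how missing data on a single EF measure was accommodated.

```python
# Maximum possible raw score on each EF task, as stated in the Measures section.
MAX_SCORES = {"htks": 40, "peg_tapping": 16, "mefs": 100}


def ef_composite(raw_scores):
    """Average the proportion scores of the EF tasks that were completed.

    raw_scores: dict mapping task name -> raw score, or None if missing.
    Returns a value in [0, 1], or None if no EF task was completed.
    (Assumption: a missing task is dropped from the average.)
    """
    proportions = [
        raw_scores[task] / max_score
        for task, max_score in MAX_SCORES.items()
        if raw_scores.get(task) is not None
    ]
    if not proportions:
        return None
    return sum(proportions) / len(proportions)


# A child with all three tasks, and a child missing Peg Tapping.
complete = ef_composite({"htks": 20, "peg_tapping": 8, "mefs": 50})
partial = ef_composite({"htks": 30, "peg_tapping": None, "mefs": 50})
```

In this sketch the child with complete data has a composite of 0.5, and the child missing Peg Tapping has a composite of 0.625, the mean of 0.75 and 0.5.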

Next, we examined effects of the interventions on EF composite scores. As shown in **Table 5**, there was a highly significant linear effect of time, indicating that most children improved over the course of the study, from Pre-test to Post-test to Follow-up. There was no effect of Condition, and no interaction effect (**Figure 1**). In planned contrasts, however, the Mindfulness + Reflection group outperformed the BAU group (p < 0.05) whereas the Literacy group did not do significantly better than BAU (p = 0.173). Follow-up tests showed this advantage for the Mindfulness + Reflection group was a trend at the immediate post-test but significant at the delayed post-test, 4–6 weeks after the intervention was completed.

TABLE 3 | Descriptive statistics.


BAU, Business as Usual; M + R, Mindfulness + Reflection; SBIQ, Stanford-Binet IQ; HTKS, Head Toes Knees Shoulders; Peg Tap, Peg Tapping; MEFS, Minnesota Executive Function Scale; ToM, Theory of Mind Scale; Literacy, Woodcock-Johnson III Letter-Word Identification subtest; CBQ, Children Behavior Questionnaire; EC, Effortful Control; Srg, Surgency; NA, Negative Affect; CBRS, Child Behavior Rating Scale.

Given the substantial growth in EF shown by the whole preschool sample, we examined the rank order of participants at each time point as a function of group assignment, using z-scores for the EF Composite (which resets the mean to 0 at each time point). As illustrated in **Figure 2**, children's ranks improved considerably for the Mindfulness + Reflection group, whereas they declined for the BAU group and remained stable for the Literacy group. At Follow-up, the difference between Mindfulness + Reflection and BAU was significant, p < 0.05.
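The standardization step can be sketched as below: composite scores are converted to z-scores within a single time point, and the z-scores are then averaged by group. The function names are hypothetical, the use of the sample standard deviation is an assumption (the text does not say which estimator was used), and the numbers are invented.

```python
from statistics import mean, stdev


def zscores(values):
    """Standardize scores within one time point (mean 0, sample SD 1)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def mean_z_by_group(scores, groups):
    """Mean z-score per group at a single time point."""
    by_group = {}
    for group, z in zip(groups, zscores(scores)):
        by_group.setdefault(group, []).append(z)
    return {group: mean(zs) for group, zs in by_group.items()}


# Invented composite scores for four children at one time point.
scores = [0.2, 0.4, 0.6, 0.8]
groups = ["BAU", "BAU", "M+R", "M+R"]
group_means = mean_z_by_group(scores, groups)
```

Because the overall mean is reset to 0 at every time point, a group's mean z-score rises only if its members gain ground relative to the rest of the sample, which is what separates relative standing from the growth shown by all children.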

#### Individual EF Task Analysis

In the HTKS task, there was a significant linear effect of time and a marginally significant effect of condition. Although there was no interaction between time and condition, planned contrasts revealed that the Mindfulness + Reflection group performed significantly better than the BAU control group (**Figure 3**). Post hoc t-tests showed the difference in performance was significant at Follow-up, t(129) = −2.23, p = 0.028. The Literacy training group also trended toward superior performance compared to BAU overall, p = 0.062, but was not significantly different from BAU at any given time point. There was a Time × Location interaction, in which the Houston sample improved more on the HTKS over time than did the DC sample, F(1, 172) = 18.4, p < 0.0001, η<sub>p</sub><sup>2</sup> = 0.10.

On Peg Tapping, there was a significant linear and quadratic effect of time, but no effect of condition and no interaction (**Table 5**). Although the overall difference between Mindfulness + Reflection and BAU was non-significant, there was a trend at Post-test 1, t(123) = −1.71, p = 0.09. There also was a significant quadratic interaction effect of Time × Location, F(1, 174) = 6.46, p = 0.012, η<sub>p</sub><sup>2</sup> = 0.04, such that the Houston sample improved more from Pre-test to Post-test 1 than did the DC sample.

p < 0.10; \*p < 0.05; \*\*p < 0.01; \*\*\*p < 0.0001; SBIQ, Stanford-Binet IQ; HTKS, Head Toes Knees Shoulders; Peg Tap, Peg Tapping; MEFS, Minnesota Executive Function Scale; ToM, Theory of Mind Scale; Literacy, Woodcock-Johnson III Letter-Word Identification subtest; CBQ, Children Behavior Questionnaire; EC, Effortful Control; Srg, Surgency; NA, Negative Affect; CBRS, Child Behavior Rating Scale.

On the MEFS, there was again a significant linear effect of time, no effect of condition or location, and no interactions (**Table 5**). In contrast to HTKS and Peg Tapping, there was no evidence of an advantage for the Mindfulness + Reflection group at any time point.

#### Other Measures

For the Theory of Mind Scale, there was no effect of condition or any interactions involving condition. There was a significant effect of location, however, in which children in the Washington, DC sample performed significantly better overall than children in the Houston sample.

Analysis of the WJ Letter-Word Identification test showed a highly significant linear effect of time, but no effect of condition or Time × Condition interaction. The DC sample had higher literacy scores than the Houston sample overall, as might be expected given the high rate of English Language Learner status in the latter group.

For the Washington DC school only, children were administered standardized assessments by the school district, following completion of the intervention period. A MANOVA with planned contrasts found no significant effects of Condition. Planned contrasts showed a trend for the Mindfulness + Reflection group, M(31) = 0.68, SD = 0.87, doing better than the BAU group, M(31) = 0.29, SD = 0.90, on the STEP (a reading readiness assessment), p = 0.087. (Note that scores on this measure ranged from −1 to +2.)

Teachers reported on children's behavior observed in the classroom at all three time points, although several children did not have complete data. Results for the repeated measures ANOVAs are shown in **Table 5**. On the CBQ Effortful Control subscale, there was a main effect of condition, with the Literacy group being rated higher than the other two groups at all time points. There was no difference between Mindfulness + Reflection and BAU on teacher ratings of Effortful Control. On the CBQ Surgency subscale, ratings generally increased over time, but this did not interact with condition, and there was no difference between Mindfulness + Reflection and BAU. On the CBQ Negative Affect subscale, there was a significant effect of location, with the children in Houston being rated higher in Negative Affect than those in Washington, DC. This did not differ by condition, but it did interact with time, F(1, 84) = 4.56, p = 0.038, η<sub>p</sub><sup>2</sup> = 0.05, such that ratings in the two locations became more similar over time. There was no difference between the Mindfulness + Reflection and BAU conditions. Lastly, on the Child Behavior Rating Scale, there was a marginal effect of time (scores increasing) but this did not interact with condition and there was no difference between the Mindfulness + Reflection and BAU groups.

## DISCUSSION

The aim of this study was to test the effectiveness of an intervention designed to improve EF skills in preschool children at risk for school failure. The 6-week small-group pull-out intervention was composed of mindfulness (to reduce stress and increase sustained attention) and reflection (to increase meta-cognition and verbal self-regulation in the context of goal-directed problem solving). A well-established pre-literacy curriculum served as an active control condition. At Pre-test, there were no differences among conditions on any of the relevant variables (all ps > 0.10).

Teacher ratings of behavior showed few condition differences and no Condition × Time interactions indicating intervention effects. Direct behavioral assessments of EF, however, revealed some intervention effects. All groups showed improvement in EF skills (measured behaviorally) over the 5-month span of the study, which was expected because the preschool period is marked by particularly rapid EF development (Carlson et al., 2013). The Mindfulness + Reflection group did not show larger improvements in EF than children in the Literacy group. However, planned contrasts showed that the Mindfulness + Reflection group (only) significantly outperformed the BAU group, with the differences most pronounced at Follow-up. This effect was most clearly seen when examining the rank order of participants at each time point as a function of group assignment. Children's ranks went up markedly over time for the Mindfulness + Reflection group, whereas they declined for the BAU group and remained stable for the Literacy group. Thus, while all children showed improved EF skills, children in the Mindfulness + Reflection group climbed to the top of the class and those receiving BAU occupied the lowest ranks by the end of the study. In contrast, the Literacy group (active control) did not differ from BAU on EF at any time point. In future research, it will be important to investigate the longer-term stability of intervention effects on EF, as well as how improvements in EF may predict improvements in children's academic achievement.

M + R, Mindfulness + Reflection; BAU, Business as Usual; SBIQ, Stanford-Binet IQ; HTKS, Head Toes Knees Shoulders; Peg Tap, Peg Tapping; MEFS, Minnesota Executive Function Scale; ToM, Theory of Mind; Literacy, Woodcock-Johnson III Letter-Word Identification subtest; CBQ, Children's Behavior Questionnaire; CBRS, Child Behavior Rating Scale.

It is notable that of the three EF outcome measures, HTKS showed the strongest results favoring the Mindfulness + Reflection intervention. This task also bears the strongest resemblance to the reflection activities that were repeated throughout the curriculum (modified HTKS and Bear/Dragon), suggesting a near-transfer effect. Peg Tapping, which also requires children to explicitly do an opposite motor activity, showed positive results for Mindfulness + Reflection in the immediate post-test only. The MEFS could be considered a farther transfer task because it was not directly trained. Similarly, theory of mind, which requires shifting mental perspectives, was not improved by either intervention. Thus, we found a transfer gradient effect in which the activities most similar to the training showed the greatest benefit, consistent with other EF interventions to date (Diamond and Lee, 2011).

Children at the Houston site showed larger improvements on two measures of EF (HTKS and Peg Tapping) than children at the DC site, and the English-speaking DC sample had higher literacy scores than the bilingual Houston sample overall. Location differences are difficult to interpret because the two sites differed in a variety of ways, but these findings highlight the need to consider the range of contexts in which particular interventions are most effective. One possible explanation for the site differences is that parents of children in the Houston site may have been more engaged. Whereas only 29% of the DC families returned a Family Information Questionnaire (FIQ), 74% of the Houston families did so.

Overall, results suggest that a brief small-group school-based intervention that teaches mindfulness and reflection in the context of goal-directed problem solving is promising for improving EF skills in preschool-age, low-income children, and that the effects of this intervention on EF may become more pronounced in the weeks following the intervention. The finding that effects become more pronounced following the intervention, a "sleeper effect," is consistent with the idea that these skills require time for consolidation, independent practice, or generalization to the context of the EF assessments (Hermida et al., 2015).

The importance of EF in early childhood education is increasingly widely recognized, and the participating schools already place a lot of emphasis on self-control. For this reason, it is possible that the baseline rate of EF development in this sample was already very high. The MEFS measure is standardized and, in fact, the children in our study performed at the 47th percentile nationally, whereas low-income children score at the 38th percentile on average (Carlson, 2017). It is possible, therefore, that this RCT subjected the Mindfulness + Reflection intervention to an overly rigorous test, and future research might usefully include a larger and more diverse sample of children, from a wider range of schools. We also do not know how well or faithfully the interventions were implemented because the fidelity of implementation was not assessed in this initial study.

To the extent that the Mindfulness + Reflection group was better than BAU at Follow-up (the delayed post-test), there is support for the idea that combining mindfulness and reflection training may provide children with potentially valuable improvements in their EF skills. We were unable to parse the separate contributions of mindfulness, reflection, and practice with EF games in the present design; however, we hypothesize that reflection, which fosters an internal verbal commentary about one's actions vis-à-vis goals, is an essential ingredient that may be especially important for allowing transfer of trained EF skills to new situations and assessments (Espinet et al., 2013). Moreover, mindfulness may support reflection training by reducing emotional distress, which can interfere with reflection and the top-down control of attention (Zelazo and Lyons, 2012). An important goal for future research will be to reveal the conditions under which interventions of this sort are maximally effective, and for whom. Future research should also address several limitations of the current study that make interpretation difficult. These include the lack of fidelity measures, the low parent participation rate in DC, and the lack of a longer-term follow-up assessment to examine possible positive cascades or fade-out effects.

## CONCLUSION

Interventions designed to reduce stress and increase reflection may have the potential to help children at risk for a wide range of difficulties. Research is growing on the efficacy of interventions designed to interrupt automatic responding and reflect on situations prior to acting, and there is evidence that the processes involved in reflection become more efficient with practice. Results of this study align with other evidence suggesting that it may be possible to target EF skills during the preschool years to improve school readiness. However, it is clear that further study is needed to elucidate optimal strategies for improving EF skills in high-risk preschoolers, as well as the key moderators of response to intervention. Effects were quite modest in this initial trial. Nonetheless, there were signs of positive change, particularly when measured 4 weeks following the end of the 6-week intervention. Further iterative research is needed to improve the curriculum employed here, consolidate and broaden the generalization of EF skills, study the fidelity of implementation and expand the indicators of response to intervention. Results also suggest that children should be followed for a longer period of time.

The preschool years may be a window of opportunity for the development of EF skills due to a combination of brain plasticity, rapid development of the neurocognitive processes supporting EF skills in this developmental window, and the growing prevalence of preschool attendance and scholarships for low-income children to gain access to high quality early childhood education (e.g., Zelazo, 2015). Basic scientific research on EF suggests that these skills may have cascading effects on achievement and well-being (e.g., Carlson et al., 2013, for review). Intervention studies using randomized controlled trials offer the best strategy to test the feasibility and efficacy of initiating a positive cascade to success among very disadvantaged children (Masten and Cicchetti, 2010). This is an important and challenging research agenda that could yield high returns on investment.

## AUTHOR CONTRIBUTIONS

PZ, SC, AM, and JF designed the Mindfulness + Reflection intervention, and JF trained teachers to deliver it; PZ and SC designed the evaluation trial; SC and PZ were responsible for data analysis; PZ, SC, AM, and JF wrote the article.

#### ACKNOWLEDGMENTS

We wish to thank the KIPP School Network, who helped make this study possible, and Catherine Schaefer, who served as project coordinator. Funding for this research was provided by the Character Lab to PZ and SC.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpsyg.2018.00208/full#supplementary-material




**Conflict of Interest Statement:** PZ, SC, and the University of Minnesota are entitled to royalties from the sale of the Minnesota Executive Function Scale (MEFS) by Reflection Sciences, Inc.

The other authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Zelazo, Forston, Masten and Carlson. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

## A Cluster Randomized-Controlled Trial of the Impact of the Tools of the Mind Curriculum on Self-Regulation in Canadian Preschoolers

Tracy Solomon<sup>1</sup> \*, Andre Plamondon<sup>2</sup> , Arland O'Hara<sup>3</sup> , Heather Finch<sup>4</sup> , Geraldine Goco<sup>5</sup> , Peter Chaban<sup>6</sup> , Lorrie Huggins<sup>7</sup> , Bruce Ferguson<sup>1,8,9</sup> and Rosemary Tannock<sup>5,10</sup> \*

<sup>1</sup> Department of Psychiatry, The Hospital for Sick Children, Toronto, ON, Canada, <sup>2</sup> Département des Fondements et Pratiques en Éducation, Université Laval, Quebec City, QC, Canada, <sup>3</sup> Lawrence S. Bloomberg Faculty of Nursing, University of Toronto, Toronto, ON, Canada, <sup>4</sup> School of Social and Community Services, George Brown College, Toronto, ON, Canada, <sup>5</sup> Neurosciences and Mental Health, The Hospital for Sick Children, Toronto, ON, Canada, <sup>6</sup> Institute of Medical Science, Faculty of Medicine, University of Toronto, Toronto, ON, Canada, <sup>7</sup> YMCA of Greater Toronto, Toronto, ON, Canada, <sup>8</sup> Department of Psychology, University of Toronto, Toronto, ON, Canada, <sup>9</sup> Department of Psychiatry, University of Toronto, Toronto, ON, Canada, <sup>10</sup> Applied Psychology and Human Development, Ontario Institute for Studies in Education, University of Toronto, Toronto, ON, Canada

#### Edited by:

Dieter Baeyens, KU Leuven, Belgium

#### Reviewed by:

Antje Von Suchodoletz, New York University Abu Dhabi, United Arab Emirates Regula Neuenschwander, University of Bern, Switzerland

#### \*Correspondence:

Tracy Solomon tracy.solomon@sickkids.ca Rosemary Tannock rosemary.tannock@utoronto.ca

#### Specialty section:

This article was submitted to Developmental Psychology, a section of the journal Frontiers in Psychology

Received: 12 September 2017 Accepted: 27 December 2017 Published: 17 January 2018

#### Citation:

Solomon T, Plamondon A, O'Hara A, Finch H, Goco G, Chaban P, Huggins L, Ferguson B and Tannock R (2018) A Cluster Randomized-Controlled Trial of the Impact of the Tools of the Mind Curriculum on Self-Regulation in Canadian Preschoolers. Front. Psychol. 8:2366. doi: 10.3389/fpsyg.2017.02366

Early self-regulation predicts school readiness, academic success, and quality of life in adulthood. Its development in the preschool years is rapid and also malleable. Thus, preschool curricula that promote the development of self-regulation may help set children on a more positive developmental trajectory. We conducted a cluster-randomized controlled trial of the Tools of the Mind preschool curriculum, a program that targets self-regulation through imaginative play and self-regulatory language (Tools; clinical trials identifier NCT02462733). Previous research with Tools is limited, with mixed evidence of its effectiveness. Moreover, it is unclear whether it would benefit all preschoolers or primarily those with poorly developed cognitive capacities (e.g., language, executive function, attention). The study goals were to ascertain whether the Tools program leads to greater gains in self-regulation compared to Playing to Learn (YMCA PTL), another play-based program that does not target self-regulation specifically, and whether the effects were moderated by children's initial language and hyperactivity/inattention. Two hundred and sixty 3- to 4-year-olds attending 20 largely urban daycares were randomly assigned, at the site level, to receive either Tools or YMCA PTL (the business-as-usual curriculum) for 15 months. We assessed self-regulation at pre-, mid-, and post-intervention, using two executive function tasks and two questionnaires regarding behavior at home and at school, to capture development in cognitive as well as socio-emotional aspects of self-regulation. Fidelity data showed that only the teachers at the Tools sites implemented Tools, and did so with reasonable success. We found that children who received Tools made greater gains on a behavioral measure of executive function than their YMCA PTL peers, but the difference was significant only for those children whose parents rated them high in hyperactivity/inattention initially. The effect of Tools did not vary with children's initial language skills. We suggest that, as both programs promote quality play and the two groups fared similarly well overall, Tools and YMCA PTL may both be effective curriculum choices for a diverse preschool classroom. However, Tools may be advantageous in classrooms with children experiencing greater challenges with self-regulation, at no apparent cost to those less challenged in this regard.

Clinical Trial Registration: ClinicalTrials.gov, identifier NCT02462733.

Keywords: tools of the mind, self-regulation, executive function, preschool, curriculum, intervention

#### INTRODUCTION

We report the results of a cluster-randomized controlled trial of the effectiveness of a preschool curriculum aimed at improving children's self-regulation. Self-regulation refers to the ability to exert control over one's thoughts, feelings and behavior. It is involved in delaying gratification, sustaining focus in the midst of distraction and suppressing strong reactions in provocative situations, opting instead to apply reason. Language plays a central role in self-regulation. According to Vygotsky (1967, 1978), language is not only a cognitive tool for social communication but also permits control over one's own cognitive processes such as memory and attention. Empirical evidence supports this proposition; toddlers' vocabulary predicts the development of self-regulation even after controlling for general cognitive development (Vallotton and Ayoub, 2011).

Self-regulation is related to the construct of executive function, neurocognitive processes that exert a top-down influence on goal-directed behavior. Characterizations of the relation between self-regulation and executive function vary somewhat in the literature, but researchers generally agree that the core processes of executive function – working memory, inhibition and cognitive flexibility – are critical to self-control. Executive function refers to the cognitive aspect of self-control while self-regulation is concerned with behavior, including emotionally laden behavior, in the social context (see e.g., Blair and Ursache, 2011; Hofmann et al., 2012; Zelazo and Carlson, 2012). In the developmental literature, executive function is typically assessed with cognitive tasks administered to children individually, in a controlled setting, while self-regulation is captured through observation or by asking parents and teachers to complete questionnaires regarding children's everyday behavior. Executive function has been linked to the pre-frontal cortex, which undergoes rapid development in the preschool years (Carlson, 2005; Zelazo and Carlson, 2012), although its development is susceptible to experiential and life stress factors. Several studies have shown that children from low socioeconomic status (SES) families consistently lag behind their higher SES peers in performance on executive function tasks (Noble et al., 2005, 2007).

Early challenges with self-regulation have considerable long-term consequences. A recent study following a cohort of children from birth to 32 years of age revealed that early childhood self-control predicted health and psychiatric problems, financial security and even criminality in adulthood, after controlling for intelligence and SES (Moffitt et al., 2011). These challenges are already apparent when children enter school, a critical transition that sets the stage for long-term learning. In a survey, more than half of a representative sample of American kindergarten teachers attributed children's difficulties in kindergarten to challenges with following directions and maintaining attention. Indeed, teachers ranked these skills as more critical to early school success than content knowledge (Rimm-Kaufman et al., 2000). These observations are supported by findings that self-regulation is more strongly associated with school success than IQ or entry level reading and mathematics skills (Vitaro et al., 2005; Blair and Razza, 2007). Several studies have linked self-regulation to academic achievement, with better self-regulation related to better outcomes (Bull and Scerif, 2001; Welsh et al., 2010; Pingault et al., 2011).

Recent evidence suggests that self-regulation and executive function are both malleable, even in early childhood, and that intervention might help to improve their development during this time (Diamond and Lee, 2011; Blakey and Carroll, 2015; Ling et al., 2016). Such findings are promising as Moffitt et al. (2011) have shown that children whose rank on measures of self-control improves between childhood and adolescence fare better in adulthood than their peers whose rank remains relatively stable. This could be because interventions that improve executive function early on might help to close academic achievement gaps down the road, with long-term benefits for employment and overall wellbeing. Indeed, children with the weakest executive function skills, who tend also to struggle more academically, seem to gain the most from interventions that target self-regulation (Diamond and Lee, 2011; Blair and Raver, 2014). With a substantial percentage of 3- and 4-year-olds now attending preschool (Kena et al., 2014) and evidence that preschool curricula can have a positive impact on school readiness (see, e.g., Gormley et al., 2005; Assel et al., 2007; Domitrovich et al., 2007), it seems reasonable to consider whether preschool curricula that target the development of self-regulation might help set children on a more positive trajectory at the start of formal schooling.

#### Tools of the Mind

One program that has garnered increasing attention in recent years for its potential to improve self-regulation is the Tools of the Mind curriculum (Tools; Bodrova and Leong, 2007). Tools is based on Luria's (1966) and Vygotsky's (1967, 1978) theories of cognitive development in which the social context of learning, imaginative play, language, and other cognitive tools play a critical role. Tools aims to improve self-regulation by providing frequent, structured opportunities for children to use these cognitive tools to practice self-regulation in the social context. The Tools daily routine is built around a set of activities carefully scaffolded by teachers that have a clear self-regulatory component. A substantial amount of time is devoted to pretend play. Children work with their teachers to choose a character (e.g., be a 'doctor'), draw a play plan on paper and must then act in accordance with their plan, inhibiting the impulse to act out of character. Teachers refer children back to their play plan should they veer from their designated role. Children are taught to use a variety of cognitive tools, including language (to self and to others), to help regulate their behavior. For example, several activities require that children talk aloud as they complete the appropriate actions such as saying "clap" every time the task requires them to clap their hands. In other activities children are given pictorial cues (e.g., of a pair of ears or a mouth) to help them take turns listening and talking, and to self-regulate inappropriate behavior.

Tools is now used at numerous pre-primary sites in 20 States and in a few sites in Canada. Teachers in the entire country of Chile have been trained in Tools pedagogy (Farran and Wilson, 2014). Tools currently reaches over 30,000 children<sup>1</sup> and continues to receive a fair amount of public attention; in 2001, the United Nations Educational, Scientific and Cultural Organization (UNESCO) added Tools to their list of exemplary instructional innovations<sup>2</sup> and media coverage of the impact of Tools has appeared in major publications such as the New York Times and also on National Public Radio<sup>3</sup>,<sup>4</sup>. Yet systematic efforts to evaluate the effectiveness of Tools for improving self-regulation, socio-emotional and academic outcomes are limited, and the results have been mixed. A summary of this work is shown in **Table 1**.

Seven studies have evaluated the impact of Tools, of which three – those with positive effects – have been published (see **Table 1**). Only one study evaluated the kindergarten version, comparing children who received Tools instruction to those who received the business as usual instruction (BAU, the state curriculum) in Kindergarten. Children were assessed in the fall and spring of kindergarten and the fall of first grade. The authors found significant benefits to the Tools group including greater stress reduction, and greater improvement on cognitive and academic measures, with some effects carrying over to first grade. Of note, the effect sizes in the overall sample were relatively small compared to those in high poverty schools (see **Table 1**; Blair and Raver, 2014).

Two published studies investigated the impact of the preschool version of Tools (Diamond et al., 2007; Barnett et al., 2008). Both studies compared low SES children who received Tools to those who received a literacy-focused curriculum. Diamond et al. (2007) found that children in the Tools group performed significantly better than their non-Tools peers on executive function measures, and that the more demanding the executive function task the more strongly performance was correlated with measures of academic achievement. Barnett et al. (2008) reported that Tools significantly improved classroom quality, but the only student measure with a significant finding (from analyses taking the hierarchical nature of the data into account) was that teachers rated Tools children significantly lower on a brief problem behavior scale (measuring externalizing) compared to their non-Tools peers. However, as no baseline data were collected in either study and achievement data were only available for the Tools group in the Diamond et al. study, it remains unclear whether the apparent benefits to the Tools groups reflect an improvement in functioning and if those benefits were unique to the children who received Tools instruction.

Two unpublished studies investigating the Tools preschool program were reported at the meeting of the Society for Research on Educational Effectiveness in 2012 and 2013. These studies involved large samples and rigorous methodology, including assessments at pre and post. Farran and Wilson (2014; see also Wilson and Farran, 2012; Farran et al., 2013) compared children who received the Tools curriculum to those who received the BAU curriculum, which varied across participating sites. Lonigan and Phillips (2012) compared the effectiveness of four curricula: a skills-focused curriculum, the Tools curriculum, the skills-focused curriculum enhanced by the pretend play component of the Tools curriculum, and the BAU curriculum that varied across participating sites. The results showed no significant benefits to children who received either the Tools program or the skills-focused curriculum enhanced by the pretend play component of Tools. Moreover, both studies found greater advantages to the comparison group children on a range of academic and self-regulation or executive function outcomes (see **Table 1**).

Two additional unpublished studies investigated the impact of adding only the pretend play component of Tools to existing curricula. Clements et al. (2012) compared 4-year-old children who received Building Blocks (BB, a math focused curriculum), to children who received BB plus the pretend play component of Tools, to a control group of children. Morris et al. (2014) investigated the impact of three enhancements to the curricula in preschools in the Head Start program: The Incredible Years (which focuses on teachers' ability to create an organized, positive classroom context), Preschool PATHS (which provides teachers with a set of weekly lessons focused on improving emotion knowledge and problem-solving skills) and the pretend play component of Tools. Neither study found any significant advantages to adding the pretend play element of Tools to the existing curricula. Morris et al. (2014) reported a slightly greater gain in emotion knowledge in the Tools group compared to the group with no enhancements but it did not translate to better problem-solving skills. In sum then, previous research on the effectiveness of Tools has produced some positive evidence for Kindergarten and limited, mixed evidence for the preschool version of the program.

It is difficult to know what to make of the inconsistent pattern of findings in the research to date in part because previous studies have varied considerably in methodology, including sample size and characteristics (such as age and SES), the duration of Tools instruction, and whether children received the Tools program or their regular instruction enhanced by aspects of Tools (see **Table 1**). The research to date has also focused on different key outcomes (measuring only executive function, only self-regulation or both) as well as in the actual measures used to assess them. However, two commonalities amongst previous studies are worthy of closer consideration for their potential to provide further insights regarding why Tools might sometimes be effective. The first commonality is that the impact of Tools has typically been compared to that of curricula focused on academic skills, especially literacy (see e.g., Diamond et al., 2007; Barnett et al., 2008; Blair and Raver, 2014). Hence, it is not clear whether the apparent benefits of Tools might be due to the program's focus on self-regulation or on improving quality play. The second commonality is that positive effects of Tools tended to occur in low SES samples (Diamond et al., 2007; Barnett et al., 2008; Blair and Raver, 2014). Low SES has been linked to greater challenges on a number of measures that are critical for school readiness including language, executive function and hyperactivity/inattention (Noble et al., 2005; Fitzpatrick et al., 2014; Foulon et al., 2015). Thus, it could be that Tools is most effective in children for whom these abilities are relatively undeveloped. To be sure, not all studies with low SES samples have found positive evidence of Tools (e.g., Farran and Wilson, 2014; Morris et al., 2014). However, as previous research has varied widely in how children were assessed, including their cognitive abilities, it is possible that positive effects occurred in samples with especially poor cognitive skills or for whom low cognitive performance might have been more homogeneous (e.g., Diamond et al., 2007; Barnett et al., 2008). We explored these ideas in the present study.

<sup>1</sup>http://toolsofthemind.org/learn/what-is-tools/

<sup>2</sup>http://toolsofthemind.org/about/history/

<sup>3</sup>http://www.nytimes.com/2009/09/27/magazine/27tools-t.html?smid=plshareand\_r=0

<sup>4</sup>http://www.npr.org/templates/story/story.php?storyId=76838288

TABLE 1 | Summary of previous research on the effectiveness of the Tools preschool curriculum.

Tools, Tools of the Mind; −K, Kindergarten; −P, preschool; BAU, Business as Usual; SES, socioeconomic status; EF, effect size; DBL, District Balanced Literacy; LEPCP, Literacy Express Comprehensive Preschool Curriculum; HS, High Scope; CC, Creative Curriculum; BB, Building Blocks; IY, Incredible Years. <sup>∗</sup>See also Wilson and Farran (2012) and Farran et al. (2013).

## Rationale for the Present Study

The goals of the present study were to ascertain whether the Tools program, which targets self-regulation through imaginative play and self-regulatory language, leads to greater gains in self-regulation compared to Playing to Learn (YMCA PTL), another play-based program that does not target self-regulation specifically, and whether the effects were moderated by children's initial language and hyperactivity/inattention. As both programs are play-based, and devote considerable time in their daily routine to improving quality play, but only Tools explicitly targets the development of self-regulation (see the section "Materials and Methods" for program descriptions), the study offered the opportunity to explore whether or not any gains resulting from Tools were related to the program's focus on self-regulation over and above its focus on improving quality play.

It also afforded the opportunity to investigate whether previous findings of positive effects of Tools in low SES children are related to their relatively less developed cognitive skills.<sup>5</sup> At issue was whether children with low language and high hyperactivity/inattention might gain more from Tools instruction. It is possible that the Tools program's emphasis on language may boost language skills in children with low language and thus improve their capacity to use language to regulate their behavior with self-directed speech and also to communicate more effectively with others. Children with high hyperactivity/inattention may benefit from having more frequent opportunities to practice self-regulation integrated into their daily routine because at least some of the opportunities that arise as par for the course in most preschool curricula (e.g., taking turns with a coveted toy or activity) may be lost on them. Accordingly, we analyzed the data to determine if the effectiveness of Tools varied as a function of children's initial levels of language and hyperactivity/inattention.
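The moderation question posed here can be sketched in code. The snippet below is only an illustration of how a curriculum-by-moderator interaction is usually read, assuming a simple linear model of the form gain = b0 + b_curriculum × tools + b_moderator × m + b_interaction × tools × m. The model form, the function name and every coefficient value are hypothetical and are not taken from the study.

```python
def conditional_effect(b_curriculum, b_interaction, moderator_value):
    """Effect of receiving Tools (vs. the comparison curriculum) at a given
    value of a centered moderator, under the assumed linear model
    gain = b0 + b_curriculum*tools + b_moderator*m + b_interaction*tools*m."""
    return b_curriculum + b_interaction * moderator_value


# Hypothetical coefficients, chosen only to make the arithmetic easy to follow.
b_curriculum = 1.0   # assumed Tools effect at the mean of the moderator
b_interaction = 1.0  # assumed change in the Tools effect per unit of moderator

# Moderator expressed in standard-deviation units around its mean.
effect_low = conditional_effect(b_curriculum, b_interaction, -1.0)
effect_high = conditional_effect(b_curriculum, b_interaction, 1.0)

print(f"Tools effect at low hyperactivity/inattention:  {effect_low}")
print(f"Tools effect at high hyperactivity/inattention: {effect_high}")
```

With a nonzero interaction coefficient the curriculum effect differs across levels of the moderator, which is the pattern a moderation analysis is designed to detect.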

We included well-established measures of executive function and also of self-regulation to paint a clearer picture of the impact of the Tools curriculum and to facilitate comparisons to previous studies. To better understand the nature of the impact of Tools on executive function, we included two measures of executive function with similar cognitive demands but different response modalities, one behavioral and one verbal.

Finally, it is also worth mentioning that the study was of considerable practical import. The opportunity to work with preschoolers in the target age range attending a network of childcare sites was particularly appealing at the time because it occurred in the last year before the final school year rollout of a free, full-day kindergarten program (FDK), which would soon be available to all 4-year-old children through the public school system. Interest in the study was intensified by ongoing discussion regarding the type of curriculum that might be best suited for the new program.

## MATERIALS AND METHODS

### Study Overview

The study was a registered cluster-randomized controlled trial, ClinicalTrials.gov identifier NCT02462733, carried out in multiple childcare centers run by the YMCA Canada – a charitable, community-based organization – in a large urban center, in Ontario, Canada. A total of 20 sites, each with one participating class, were randomly assigned to teach either the Tools of the Mind (TOOLS, n = 10 classes; 109 preschoolers) or the YMCA Playing to Learn (YMCA PTL, n = 10 classes; 86 preschoolers) curriculum. The different number of participating preschoolers in the two groups reflects the variation in the number of children in the target age range attending the participating sites, prior to random assignment of sites to curricula. The YMCA PTL preschool curriculum was the business-as-usual curriculum in use throughout the YMCA prior to the study. Each class was led by an early childhood educator (ECE), with an assistant teacher. Teachers received training in their respective curricula and used only the method of instruction assigned to their site for about 15 months, from March 2012 to June 2013, inclusive. We assessed children at three time points: at the start of the study (T1), around 8 months later (T2) and at the end of the study (T3).<sup>6</sup> We also assessed the teachers' fidelity of implementation of the Tools program at three time points: at 7, 11, and 14 months of implementation (F1, F2, and F3, respectively). There were two cohorts of participating children: Cohort A who entered the study at T1 and Cohort B who entered the study at T2 (see the section on "Participants" for further details).
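Site-level (cluster) assignment of this kind can be illustrated with a short sketch. It assumes a simple balanced shuffle of 20 sites into two arms of 10; the site labels, the seed and the function name are hypothetical, and the text does not describe the actual randomization procedure used in the trial.

```python
import random


def assign_sites(site_ids, arms=("TOOLS", "YMCA PTL"), seed=2012):
    """Randomly split whole sites evenly across arms (cluster-level assignment)."""
    if len(site_ids) % len(arms) != 0:
        raise ValueError("number of sites must divide evenly among arms")
    rng = random.Random(seed)  # fixed seed only so the sketch is reproducible
    shuffled = list(site_ids)
    rng.shuffle(shuffled)
    per_arm = len(shuffled) // len(arms)
    # The first block of shuffled sites goes to the first arm, and so on.
    return {site: arms[i // per_arm] for i, site in enumerate(shuffled)}


sites = [f"site_{i:02d}" for i in range(1, 21)]  # hypothetical labels for 20 sites
allocation = assign_sites(sites)
counts = {arm: sum(1 for a in allocation.values() if a == arm)
          for arm in ("TOOLS", "YMCA PTL")}
print(counts)
```

Because the unit of assignment is the site, every child in a class receives the same curriculum, which is why the numbers of children per arm can differ even when the numbers of sites are equal.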

Our primary outcome measures comprised two executive function tasks as well as parent and teacher reports of children's behavior at home and school. The two measures of executive function were the Day/Night task (D/N; Gerstadt et al., 1994) and the Head-To-Toes version of the Head-Shoulders-Knees-Toes task (HTT; Ponitz et al., 2009). The questionnaires were the parent and teacher versions of the Strengths and Difficulties Questionnaire (SDQ-P, SDQ-T, respectively; Goodman, 1997, 1999) and the Social Competence and Behavior Evaluation Scale, which was completed by teachers only (SCBE-30; LaFreniere and Dumas, 1995). We used the total difficulties score from the SDQ-P and SDQ-T (see measures) as outcome measures. For the analyses looking at the effect of Tools as a function of hyperactivity/inattention we used the hyperactivity/inattention subscale from the SDQ-P at baseline, reasoning that parent reports are unbiased relative to those of teachers, who also delivered the curriculum. For the analyses looking at the effect of Tools as a function of initial language, we used the Peabody Picture Vocabulary Test (PPVT-4, Dunn and Dunn, 1997).

<sup>5</sup>The YMCA was only able to provide fee subsidy data by site, not for individual families. Hence, we could not include this proxy for SES in our analyses. However, analyzing the data in terms of the impact of Tools as a function of initial language and hyperactivity/inattention allowed us to go beyond simply replicating earlier findings regarding SES to better understand why Tools might sometimes work for low SES children.

<sup>6</sup>Children attended the daycare year-round, typically with a short break in the summertime for family vacation.

The data were collected as part of a larger investigation of the development of self-regulation that included a number of additional measures. We used these measures in the present study to help characterize the groups at baseline. The additional measures tapped children's expressive language (Expressive Vocabulary Test, EVT-4, Williams, 2007), as well as their early reading and math skills (Get Ready To Read, GRTR, Whitehurst and Lonigan, 2001; Point-to-X, PTX, Wynn, 1992). Teachers also completed a questionnaire on children's overall development (the Early Development Index, EDI, Janus and Offord, 2007).

## Ethics Statement

The study was approved by the research ethics board at the Hospital for Sick Children in Toronto, Canada and also by the YMCA Canada organization. Teachers provided written informed consent and parents provided written informed consent for their participating children. Consent for teachers and for children in cohort A was obtained prior to random assignment of sites to conditions. Consent for children in cohort B was obtained after the randomly assigned curriculum was already in place.

## Participants

The participating sample was drawn from a network of childcare sites operated by YMCA Canada. Resources as well as teacher credentials and professional development were therefore standardized across sites at the study outset. We targeted all of the sites in the network located in areas where local elementary schools were not scheduled to offer the FDK program until after the study period. The 20 eligible sites spanned the city limits and served populations that were ethnically and socioeconomically diverse.<sup>7</sup> The mean percentage of students in the preschool program receiving a fee subsidy was 54% (range 19–100) and 59% (range 1–100) in the Tools and YMCA PTL sites, respectively. Note that as fee subsidy data were not available for individual children, we were unable to use this proxy for SES in our analyses. All of the site directors agreed to participate in the study.

Teachers were recruited to participate in the study if they were accredited Early Childhood Education teachers and were not expecting to take a leave of absence during the study period. Their participation was voluntary.

Children were recruited to participate if they were 3 or 4 years of age, had sufficient grasp of English, and did not have any developmental challenges serious enough to preclude full participation in the curriculum, as judged by their teacher. They were expected to remain at the daycare for the study duration. The participating children at each site were grouped together into a single mixed-age classroom along with other non-participating peers who did not meet the eligibility criteria for participation in the study or whose parents did not return a signed consent form in time for baseline data collection.

Cohort A: **Figure 1** shows the Consolidated Standards of Reporting Trials (CONSORT) diagram of participant flow through the study. The details for Cohort A are shown on the left side of the figure. We received signed consent for 199 children at T1. Three children refused to participate and 1 child who participated exhibited developmental challenges that prohibited a fair administration of the battery of measures. These 4 children were dropped from the study, leaving 195 participants in Cohort A at T1. There were 106 children in the TOOLS group (58 boys and 48 girls, mean age = 45.1 months, range 37.2–55.34 months) and 89 children in the YMCA PTL group (47 boys and 42 girls, mean age = 45.9 months, range 36.5–62.3 months). The two groups did not differ in mean age at the start of the study.
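The participant counts in this paragraph can be cross-checked with a few lines of arithmetic. The sketch below uses only the figures reported above; the variable names are illustrative. Note that the Study Overview lists 109 and 86 preschoolers for the two arms, which also sum to 195; the text does not explain the difference, and the sketch does not attempt to resolve it.

```python
# Counts reported for Cohort A at T1.
consented = 199
refused = 3
could_not_be_assessed = 1
cohort_a_t1 = consented - refused - could_not_be_assessed

# Group composition as reported in this paragraph.
tools = {"boys": 58, "girls": 48}
ymca_ptl = {"boys": 47, "girls": 42}
tools_n = sum(tools.values())
ymca_ptl_n = sum(ymca_ptl.values())

print(f"Cohort A at T1: {cohort_a_t1}")
print(f"TOOLS: {tools_n}, YMCA PTL: {ymca_ptl_n}, total: {tools_n + ymca_ptl_n}")
```

The group totals derived from the boy and girl counts agree with the reported group sizes and with the 195 children remaining after exclusions.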

**Table 2** shows baseline performance on the study measures for cohort A at T1, by curriculum. Independent-samples t-tests indicated that the groups did not differ significantly on the D/N, the HTT, or the total difficulties score on either the SDQ-P or SDQ-T. The only significant difference between the groups was that teachers rated Tools children significantly higher than their YMCA PTL peers on the anxiety-withdrawal scale on the SCBE-30, but the difference between the group means was small (difference = 0.38, p < 0.0001). There were no significant differences between the groups on any of the additional measures. All tests were based on the Bonferroni-adjusted p-level for multiple comparisons, which was p < 0.02.

It is important to note here that there was considerable attrition in the summer of 2012 (about 5 months after the start of the study), due to unforeseen circumstances. Seventy-eight children in Cohort A (Tools; 23 boys and 15 girls, YMCA PTL; 21 boys and 19 girls) left the study before T2. Three children moved to a different classroom in the same daycare, 1 child changed to part-time attendance and 2 children withdrew from the study (1 from each condition), but the remaining 72 children left the daycare altogether. It is unlikely that the attrition was related to either curriculum, since attrition rates were comparable in the two conditions. It is also unlikely that the relatively large number of children who left the study can be fully accounted for by the typical reasons for attrition, such as a change of residence or parental leave to care for a new sibling. Rather, we believe that it was largely due to an unanticipated effect of the rollout of FDK. Although the attrition was distributed across sites, some sites were affected more than others, namely those located in the city core and at suburban transportation hubs that may have served commuter populations. Children may have attended these sites because they were close or en route to a parent's place of employment but resided in neighborhoods where local schools were scheduled to introduce FDK in the fall of 2012. They may therefore have been withdrawn to take advantage of the free program. This notion is supported by the timing of the attrition, the fact that most of the children left the daycare altogether (as opposed to switching to a non-study room), and because the daycare staff suggested that FDK was the most likely reason for leaving.

<sup>7</sup>Ethnicity data are not typically collected as part of daycare enrollment in Canada and we were not able to collect such data systematically as part of the study. Nor could we appeal to census data by daycare postal code as researchers sometimes do because some of our participating sites served commuter populations. However, the sites spanned the geographical limits of a large cosmopolitan city and our research team observed that the sample varied widely in ethnic composition.

To determine the impact on the study of the attrition before T2, we compared children who left the study to those who stayed on the primary measures and also on the additional measures administered at baseline. Independent-samples t-tests revealed that leavers and stayers did not differ significantly in chronological age (mean age was 45.3 and 44.6 months, respectively, at the start of the study). Furthermore, although leavers generally scored somewhat less well than stayers, the differences between group means did not reach statistical significance.<sup>8</sup> A further 11 children in Cohort A (5 TOOLS, 6 YMCA PTL) left the daycare before T3; 1 moved to another classroom and 10 left the daycare altogether, due either to moving residence or to parental leave (based on teacher reports).

Cohort B: Given the considerable investment of resources, the daycare staff's enthusiasm to continue on with the study, and in order to improve statistical power for the data analysis, we recruited an additional cohort of 3- and 4-year-olds (Cohort B) into the study at T2, from the participating sites. Details regarding the flow of participants in Cohort B are shown on the right side of **Figure 1**. Cohort B comprised 61 children; 42 children in the TOOLS group (25 boys and 17 girls, mean age = 42.4 months, range 37.0–50.0 months) and 19 children in the YMCA PTL group (8 boys and 11 girls, mean age = 43.5 months, range = 37.0–57.1 months). These children were already attending the participating sites but were not in the study classrooms at the time of recruiting. They became eligible for the study largely because they had achieved the minimum age criterion by T2. This meant that they were significantly younger than the children in cohort A at T2 (p < 0.0001; mean ages were 54.3 and 42.7 months, for cohorts A and B, respectively). Comparisons of the cohort A and cohort B children at T2 confirmed that, in general, cohort A was also developmentally more mature.<sup>9</sup> After obtaining parental consent, cohort B children joined their cohort A peers in the participating classroom in their daycare. Hence, they received the curriculum that was randomly assigned to the classroom at the start of the study and already in place at T2. Independent-samples t-tests revealed no significant differences on any of the study measures between the Tools and YMCA PTL children who entered the study at T2 (see **Table 2**). Six children in cohort B (5 TOOLS, 1 YMCA PTL) left the daycare before T3, again due to moving or parental leave (based on teacher reports).

<sup>8</sup> See Supplementary Table S1 in the supplementary online materials for baseline data (collected at T1) for the participants in cohort A who withdrew and for those who remained in the study by T2.

#### TABLE 2 | Performance at study entry for cohort A (T1) and cohort B (T2).

SDQ, Strengths and Difficulties Questionnaire; SCBE-30, Social Competence and Behavior Evaluation Scale; PPVT-4, Peabody Picture Vocabulary Test, 4th ed.; EVT, Expressive Vocabulary Test 4th ed.; GRTR, Get Ready to Read; EDI, Early Development Inventory. We compared the group means within each cohort for each measure. P < 0.02 is the critical alpha level after Bonferroni adjustment for multiple comparisons. <sup>∗</sup>denotes a significant difference between the two groups in cohort A, based on the adjusted p-level.

#### Materials and Procedures

#### Program Descriptions, Teacher Training, Tools Fidelity

#### **Program descriptions**

Tools of the Mind (Tools; Bodrova and Leong, 2007) is a play-based, preschool and kindergarten curriculum that emphasizes self-control, language and literacy skills. The present study involved the preschool version of the program. Tools is based on Vygotsky's (1967, 1978) social-cultural theory of child development in which development occurs in the context of the interactions between children and their social environment. These include interactions with peers as well as adults. Play, especially pretend play, is considered essential to propelling development. In pretend play, children adopt various social roles and implicitly agree to act in accordance with those roles, inhibiting the propensity to act out of character. Language is considered a critical tool for the formation of thought. Indeed, children employ a variety of tools to support their thinking. Initially, these tools are external such as a picture or language spoken aloud, but in time they become automatized and internalized as when children remember the significance of a picture or engage in internal self-talk to help regulate their own behavior.

The Tools curriculum comprises a set of explicit, scripted, teacher-directed activities that embody these ideas and that are aimed specifically at improving self-control. A considerable part of every day is devoted to pretend play, which begins with teachers helping children to formulate a play plan drawn on paper. Children are asked to think about the setting, the key roles and who will play them, the language their character might use, as well as the main events that will take place. They then draw – to the best of their ability – a depiction of the scenario. They are also encouraged to make marks, or draw letter-like forms, write letters, words or simple phrases to accompany their drawings, as appropriate for their skill level. During the pretend play sessions, teachers help children to self-monitor by reminding them of, or referring to the actual plan as needed, and suggesting additional activities and language for their character. Children also assist each other by pointing out and redirecting peers when they begin to act out of role.

Teachers provide additional support by integrating pretend play into other activities in the curriculum. For example, when the whole class is gathered on the carpet the teacher may adopt the role of a baker, model the actions of cutting up and distributing a pretend pizza, using relevant vocabulary. Teachers also stimulate children's thinking about how various objects can be appropriated for use in the pretend scenario such as using a rectangular piece of cardboard as a telephone. Children are encouraged to practice the action sequences modeled by the teacher and to integrate them into their pretend play scenarios.

<sup>9</sup> See Supplementary Table S2 in the supplementary online materials for performance on the various measures for cohort A children who remained in the study and for cohort B children entering the study, at T2.

Language is also afforded a central role. It is the primary mechanism for introducing and for participating in the Tools activities. Children employ substantial overt, and eventually covert, speech to guide their actions on a variety of tasks such as when practicing writing. Teachers also help to build children's vocabularies by identifying new words in books selected for classroom reading. During large group time, teachers introduce and model the use of new words, relevant to the theme of the children's pretend play. Throughout the day, a variety of activities require children to articulate their ideas and to begin to make symbolic marks, construct letters, words and then simple phrases to express those thoughts. Hence, in addition to self-directed speech, there is a great deal of verbal exchange between students as well as between the students and the teacher in the Tools classroom.

Opportunities to practice self-regulation are also incorporated into both large and fine motor activities. For example, in the "freeze game," the teacher plays rhythmic music while holding up a card depicting a stick figure in a particular stance. Children dance and when the music stops they must strike the pose in the picture. When the music recommences they begin dancing again and the sequence is repeated with a new card showing a different physical stance. In "pattern movement," children are shown different shapes (e.g., triangle and square) and taught to execute a different movement for each shape (e.g., touch your chin for the triangle, clap for the square). The teacher then reveals a sequence of shapes, one shape at a time (e.g., square, triangle, and triangle), and children must perform the sequence of actions that corresponds with the shape pattern (e.g., clap, touch chin, and touch chin). Children are encouraged to label their actions aloud as they execute them.

Tools academic activities also have a clear self-regulation component. For example, in "buddy reading" children read aloud in pairs, with each child taking a turn as the reader or the listener. They are given pictures (of a mouth or an ear) to help them stay in their role. The listener is encouraged to ask the reader a question about the text when the reader has finished reading. The children then switch roles along with their accompanying pictures. Similarly, for "making collections" children are designated as either the counter or the checker and given pictures (of a hand or a checkmark) to help them stay in role. The counter's role is to place the number of counters into a cup that matches the number of items shown on a "key card." The checker checks and provides feedback so that the counter can make corrections. After several efforts with different quantities of counters, the children switch roles and pictures. In both tasks, the different roles become internalized over time and children no longer require the external symbols to support appropriate behavior.

Detailed manuals of the Tools program have been developed for use in in-service training, which consists of an admixture of workshops and in-class coaching, and also for teachers to use as an ongoing resource throughout the training period<sup>10</sup>.

Playing to Learn (YMCA PTL; Eden and Huggins, 2001; Martin and Huggins, 2015) is also a play-based preschool curriculum. A critical difference between Tools and YMCA PTL is that, whereas Tools is a more teacher-directed, prescribed approach, YMCA PTL is a child-centered, emergent curriculum. The teacher's primary roles are to establish a safe, secure, social environment and to facilitate learning through play, following the child's interest. The setup of the physical environment is seen as essential to encouraging quality play (but not necessarily pretend play). YMCA PTL classrooms resemble home-like environments. A wide variety of materials are available to encourage play that supports children's social, emotional and academic learning. Children who become disengaged may be enticed by different aspects of the environment to re-engage in play.

Teachers keep a flexible daily routine to encourage sustained, uninterrupted periods of play. Play is open-ended, creative and flexible, adapting to children's needs, interests, and ideas as they change. Teachers act as play partners, enthusiastically entering the play scenario but only on the children's invitation. They may modify or add to the experience and help to extend play according to level of interest, but the children continue to guide the play content.

Teachers are trained to observe children's play, to reflect on, and to carefully document their interests. They are encouraged to capitalize on learning opportunities as they arise. For example, teachers may encourage children who are using blocks to build a fort to think about how the size and arrangement of the blocks influences its final structure, to count the number of blocks involved, the number of children the structure can accommodate as well as the structure's affordances. Practicing self-control is an emergent property of these child-initiated activities. For example, teachers may help children to solve the problem of too few blocks for all of the children interested in the block building activity, by organizing themselves into teams and taking turns.

Teachers may also plan for innovative play opportunities but they are rooted in observations of the children's interests. Moreover, play planning is flexible and can be adapted or even abandoned according to children's changing interests and needs. For example, a teacher who observes some children's growing interest in dinosaurs may set out a box containing various dinosaur paraphernalia such as miniature figures, dinosaur eggs, books and so on for children's arrival the next day. She may set up outdoor play so that children can engage in digging for dinosaur "bones" (Martin and Huggins, 2015). The following day, the teacher will partner with the children on these dinosaur activities if they show an interest, but if children redirect their interests – such as spontaneously pretending to be riding on a bus – the teacher is flexible enough to abandon the dinosaur idea and to apply efforts to the new scenario.

Opportunities for social, emotional and academic learning are embedded in play. For example, teachers encourage co-operation to solve problems, they model empathy, and they may introduce new vocabulary in developing play on a particular topic or incorporate number and creative problem-solving into play, such as in the fort building activity described above. Similarly, opportunities to practice self-regulation in play might occur when a teacher suggests turn-taking as a solution to sharing a highly desirable object or helps children to generate alternative solutions such as stopping to think about a conflict and using words instead of impulsive actions to express their dissatisfaction. Children are not routinely required to represent their play ideas visually, via drawing, symbolic mark-making or letter construction, but drawing materials would be made available if they demonstrated an interest.

<sup>10</sup>www.toolsofthemind.org

As with Tools, a YMCA PTL manual has been developed to guide classroom practice. In-service training consists of professional development sessions and in-class coaching.

#### **Teacher training**

Prior to the study, all of the teachers, who were all ECE accredited, were fully trained in implementing the YMCA PTL curriculum. Upon joining the YMCA, they received an orientation to PTL followed by 4 further training sessions within the first 6 months and then 2 additional sessions in each subsequent year with the YMCA. Teachers were assessed for implementation fidelity by the participating organization, as part of an annual, general site evaluation. Regional supervisory staff provided ongoing coaching to help sustain implementation fidelity to organizational standards. For the YMCA PTL teachers only, YMCA PTL training, coaching support and evaluation continued as usual while the study was underway.

For teachers in the Tools classrooms, the YMCA PTL booster training, support and evaluation was suspended for the study duration. Instead, teachers received training in the Tools preschool curriculum by professional trainers from the Tools organization. Training was delivered incrementally, in five sessions, roughly evenly distributed throughout the study period. Teachers were trained in the core Tools activities (those essential to program implementation) at the beginning of the study and while data collection at T1 was underway, and the last session occurred about 2–3 months before data collection at T3. In between sessions, teachers continued to implement the core Tools program integrating any additional skills acquired at the most recent training. Two coaches, who received the same training as the teachers, as well as additional coaching training, provided ongoing support throughout the study. Each site received equal amounts of support from the two coaches during the study period. The Tools trainers also visited the participating sites following each training session and made recommendations to help support implementation fidelity.

#### **Tools fidelity**

Following similar published studies (see e.g., Diamond et al., 2007; Barnett et al., 2008), we focused on the implementation fidelity of Tools. Our primary aim was to establish that the Tools curriculum was in use in the Tools classrooms and not in the YMCA PTL classrooms. As reported above, ongoing coaching in YMCA PTL classrooms helped to ensure fidelity of implementation of the YMCA PTL curriculum to standards acceptable to the YMCA organization.

To capture fidelity of implementation of the Tools program, members of our research team completed the Tools Implementation Checklist (TIC) we created specifically for the present study. The TIC comprised a list of the Tools activities that would be expected to take place in the classroom based on teacher training. Each activity was broken into its essential elements laid out in the Tools manual and observers checked whether or not they observed each element. The 21 core activities teachers were expected to implement throughout the study (following the first training session) comprised 119 observable elements. Five additional activities were added at F2 and 1 further activity was added at F3 as teachers progressed in their training and the students in their learning. These additional activities comprised 35 observable elements at F2 and an additional 6 observable elements at F3. Hence, the total number of possible observable elements or items on the TIC was 119, 154, and 160 at F1, F2, and F3, respectively.

We assessed fidelity of implementation of the Tools program at all 20 participating sites, at three time points (as explained above). At each time point, a pair of observers attended each site for a full day and completed the TIC. The observers were graduate students in a combined early childhood education and elementary/junior teacher accreditation program nearing the end of their studies. A different pair of observers completed the observations at each time point. The observers were blind to the study hypotheses and to the assignment of sites to the two curricula. They attended the same site, on the same day, but completed their own copy of the TIC, independently without conferring. We report inter-rater reliability and percent implementation of the Tools activities, at the beginning of the Section "Results."

#### Measures

#### **Primary measures**

The D/N and HTT tasks are well-established measures of executive function, widely used in developmental research and suitable for children as young as 3 years of age (Gerstadt et al., 1994; Ponitz et al., 2009). For D/N, children are presented with two kinds of cards; either a white card depicting a yellow sun (day card) or a black card depicting a white moon and stars (night card). They are instructed to play a "silly" game in which they must say "day" when they see a night card and "night" when they see a day card. Hence, they must inhibit the pre-potent response to say the word that is associated with the picture, and say the opposite word. Children receive 16 cards presented in one of two predetermined pseudorandom orders. They are allowed to self-correct after an initial response, before the next card is presented. Only the last response is scored. One point was awarded for each correct trial, for a maximum score of 16. For HTT, children are instructed to touch their toes when told "touch your head," or to touch their head when told "touch your toes." Children receive four practice trials followed by 10 test trials, comprising a mix of "head" and "toes" instructions given in 1 of 2 pre-determined random orders. Conventional scoring awards 2 points for a correct response on the first attempt, 1 point for a correct response on the last attempt (i.e., for self-correcting before the next command was given) and 0 points for an incorrect response, for a maximum score of 20 (HTT20). To bring the data for HTT more in line with the data for D/N, we also scored HTT awarding 1 point for a correct response (whether on the first attempt or after self-correction) and 0 points for an incorrect response, for a maximum score out of 10 (HTT10). We report the results for both methods of scoring the HTT task.
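The scoring rules for the two tasks can be summarized in a short Python sketch. This is an illustration only, not the scoring software used in the study; the function names and the trial outcome labels (`"first"`, `"self_corrected"`, `"incorrect"`) are hypothetical.

```python
from typing import Sequence

# Hypothetical labels for the outcome of a single HTT test trial.
FIRST = "first"                    # correct on the first attempt
SELF_CORRECTED = "self_corrected"  # correct only after self-correction
INCORRECT = "incorrect"            # final response incorrect


def score_dn(last_response_correct: Sequence[bool]) -> int:
    """D/N: 1 point per card whose last response was correct (max 16)."""
    return sum(1 for ok in last_response_correct if ok)


def score_htt20(trials: Sequence[str]) -> int:
    """Conventional HTT scoring: 2, 1 or 0 points per trial (max 20)."""
    points = {FIRST: 2, SELF_CORRECTED: 1, INCORRECT: 0}
    return sum(points[t] for t in trials)


def score_htt10(trials: Sequence[str]) -> int:
    """Alternative HTT scoring: 1 point for any correct final response (max 10)."""
    return sum(1 for t in trials if t in (FIRST, SELF_CORRECTED))


# A hypothetical child: 6 first-attempt correct, 3 self-corrected, 1 incorrect.
example = [FIRST] * 6 + [SELF_CORRECTED] * 3 + [INCORRECT]
print(score_htt20(example), score_htt10(example))
```

Under HTT10 a self-corrected trial counts the same as a first-attempt success, which mirrors the D/N rule that only the last response is scored.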

The SDQ (Goodman, 1997, 1999) is a widely used screening measure for parents and teachers of children aged 3–16 years. We used the American preschool version (for ages 3–4 years) for parents (SDQ-P) and the analogous version for teachers (SDQ-T). Respondents indicate the extent to which each of 25 attributes, some positive (e.g., "Has at least one good friend") and some negative (e.g., "Often loses temper"), applies to the child on a 3-point Likert scale (not true, somewhat true, certainly true). The 25 attributes are divided equally between 5 subscales: emotional symptoms, conduct problems, hyperactivity/inattention, peer problems and pro-social behavior. Scores for each subscale range from 0 to 10 and scores on the first four scales are summed to form a total difficulties score out of 40, with higher scores indicating greater challenges<sup>11</sup>. The hyperactivity/inattention subscale (which we used as a moderator in our analyses) includes items such as "Restless, overactive, cannot stay still for long" and "Constantly fidgeting or squirming."
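As a minimal sketch of how the total difficulties score is assembled from the subscales, assuming subscale scores have already been derived from the item ratings (the dictionary keys and the function name are hypothetical, not part of the SDQ materials):

```python
from typing import Mapping

# The four difficulty subscales; pro-social behavior is excluded from the total.
DIFFICULTY_SUBSCALES = (
    "emotional_symptoms",
    "conduct_problems",
    "hyperactivity_inattention",
    "peer_problems",
)


def total_difficulties(subscale_scores: Mapping[str, int]) -> int:
    """Sum the four difficulty subscales (each 0-10) into a 0-40 total."""
    total = 0
    for name in DIFFICULTY_SUBSCALES:
        score = subscale_scores[name]
        if not 0 <= score <= 10:
            raise ValueError(f"{name} must be between 0 and 10, got {score}")
        total += score
    return total


# Hypothetical ratings for one child.
child = {
    "emotional_symptoms": 3,
    "conduct_problems": 2,
    "hyperactivity_inattention": 6,
    "peer_problems": 1,
    "prosocial_behavior": 8,  # reported separately, not added to the total
}
print(total_difficulties(child))
```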

The SCBE-30 (LaFreniere and Dumas, 1995) comprises a list of 30 behaviors; 10 positive (e.g., "cooperates with other children") and 20 negative (e.g., "hits, bites or kicks other children," "inactive, watches other children play"). Teachers indicate the frequency of observing each behavior on a 6-point Likert scale (1 = never, 2 or 3 = sometimes, 4 or 5 = often, 6 = always). The SCBE-30 yields three subscales – social competence, anger/aggression, and anxiety/withdrawal. Scores for each subscale represent the mean of the teacher's ratings on the 10 items that contribute to the scale. Scores for the positive items – those that contribute to the social competence subscale – are reversed such that, for all three subscales, higher scores indicate greater challenges.
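The subscale computation described above can be sketched as follows. The reversal formula (1 + 6 minus the rating) is an assumption about how ratings on a 1–6 scale would typically be reversed; the text states only that the positive items are reversed. The function name is hypothetical.

```python
from typing import Sequence


def scbe_subscale(ratings: Sequence[int], reverse: bool = False,
                  scale_min: int = 1, scale_max: int = 6) -> float:
    """Mean of the 10 teacher ratings for one SCBE-30 subscale.

    With reverse=True (assumed for the social competence items), each rating r
    is replaced by scale_min + scale_max - r, so that higher scores indicate
    greater challenges on every subscale.
    """
    if len(ratings) != 10:
        raise ValueError("each SCBE-30 subscale is based on 10 items")
    if any(not scale_min <= r <= scale_max for r in ratings):
        raise ValueError("ratings must lie on the 6-point scale")
    if reverse:
        ratings = [scale_min + scale_max - r for r in ratings]
    return sum(ratings) / len(ratings)


# A child rated "always" on every positive item scores 1.0 after reversal.
print(scbe_subscale([6] * 10, reverse=True))
```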

The PPVT-4 (Dunn and Dunn, 1997) is a standardized vocabulary measure with excellent reliability and validity and is widely used in the developmental literature. Children are presented with a matrix of four pictures and required to point to the picture that corresponds with a word the experimenter said aloud. We used the PPVT-4 as a moderator in our analyses. Administration followed standardized instructions.

#### **Additional measures**

The EVT-4 (Williams, 2007) is also standardized, has excellent reliability and validity, and is widely used. Children are presented with a picture (e.g., a key), the experimenter poses a prompting question (e.g., "What is this?") and children respond verbally. Administration followed standardized instructions.

The GRTR (Whitehurst and Lonigan, 2001) is a screening tool that assesses progress in developing early literacy skills in preschoolers. Children are presented with a matrix of four items (symbols, text, or pictures) and asked to indicate, e.g., which picture contains letters or which letter corresponds to a particular sound. Performance on the GRTR is significantly correlated with other measures of language and letter knowledge (Whitehurst and Lonigan, 2001). Children are awarded 1 point for each correct response for each of twenty-two trials to yield a maximum score of 22.

<sup>11</sup>www.sdqinfo.com

PTX (Wynn, 1992) taps children's understanding of counting principles (one-to-one correspondence, stable order and cardinality), essential to helping get mathematical skills off the ground. Participants were presented with two arrays of black squares simultaneously, and required to point to the array with the number of squares that corresponds to a number the experimenter said aloud. The quantity of squares in the arrays ranged from 1 to 9. When the quantity depicted exceeds three (the majority of the trials) children cannot simply subitize (know by looking) and must count the squares in each array to be able to respond correctly. Children received 1 point for each correct response on each of 16 trials for a maximum score of 16.

Finally, the EDI (Janus and Offord, 2007) is a developmental checklist completed by teachers to assess overall development in young children. It was designed as a community or population measure rather than for individual diagnosis or screening. Researchers submit their raw data to the developers for the derivation of summary scores that are shared with the investigators and added to a central database to further enhance neighborhood, regional and national representation. The EDI has good psychometric properties and has been used in research internationally to inform regional and national policy on early childhood care and education<sup>12</sup>. The checklist comprises 103 items that probe observable behavior and competencies in 5 domains: physical well-being, language and cognitive development, social competence, emotional maturity, communication and general knowledge. Example items for each scale, respectively, include: "proficiency at holding a pen, crayon or brush," "is able to attach sounds to letters" and "remembers things easily," "is able to play with various children," "is nervous, high-strung or stressed," "ability to tell a story," and "answers questions about the world." Teachers rate the child in question on each item on Likert scales that vary across the instrument sections. Higher scale scores indicate greater maturity.

#### **Procedures**

Children were tested individually, in a quiet location in their preschool by a trained experimenter during regular preschool hours. The battery of measures was divided into two test sessions of about 30 min each. To help maintain motivation, each session included a variety of measures and breaks were given as needed. Parents and teachers completed their assigned questionnaires on a schedule to roughly coincide with the student data collection. The same teacher completed the teacher questionnaires at all data collection time points.

<sup>12</sup>https://edi.offordcentre.com/resources/bibliography-of-the-edi

## RESULTS

#### Overview

We first report the results for Tools implementation fidelity, essential for interpretation of the results for the student outcomes. We then report the results from analyses of the student outcomes addressing each of our three research questions in turn: (1) Was there a main effect of curriculum? (2) Was there a main effect of curriculum moderated by initial language skills? and (3) Was there a main effect of curriculum moderated by initial level of hyperactivity/inattention? For each question, we report the results from the analyses of the data for Cohort A only, followed by the results from the analyses for the data from Cohorts A and B combined.

## Tools Implementation Fidelity

We derived inter-rater reliability by adding the number of elements that the observers agreed were either present (both said yes) or absent (both said no) and then converting the total to a percentage of 119 (the number of possible observable core elements) at each time point, for each site. We focused on the elements of the core activities because they were essential for the Tools program to be considered "in place." We then calculated the mean percentage of inter-rater agreement for sites in the two groups, at F1, F2, and F3. Occasionally one or both observers were unable to observe an activity or to visit a site on the designated day (e.g., due to transit disruption in severe weather). Activities captured by only one observer were omitted from inter-rater analyses; when both observers missed activities, inter-rater agreement was based on the remaining observed activities; and in the rare case of a missed site, mean inter-rater agreement was based only on the sites attended. We used the same procedure to derive inter-rater reliability on the elements associated with the additional activities at F2 and F3 and report these results separately.
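The percent-agreement calculation described above can be illustrated with a short Python sketch. It assumes each observer's checklist is stored as a list with one entry per element (`True` = observed, `False` = not observed, `None` = the observer could not observe the activity); this data layout and the function name are hypothetical.

```python
from typing import Optional, Sequence


def percent_agreement(obs_a: Sequence[Optional[bool]],
                      obs_b: Sequence[Optional[bool]]) -> Optional[float]:
    """Percentage of elements on which two observers agree (both yes or both no).

    Elements that either observer could not observe (None) are omitted, so
    agreement is based only on elements captured by both observers.
    """
    if len(obs_a) != len(obs_b):
        raise ValueError("both checklists must cover the same elements")
    pairs = [(a, b) for a, b in zip(obs_a, obs_b)
             if a is not None and b is not None]
    if not pairs:
        return None  # nothing was observed by both observers
    agreed = sum(1 for a, b in pairs if a == b)
    return 100.0 * agreed / len(pairs)


# Five hypothetical elements; the last one was missed by observer B.
site_a = [True, True, False, False, True]
site_b = [True, False, False, False, None]
print(percent_agreement(site_a, site_b))
```

Agreement on absence counts as agreement, which is why sites where no Tools activity occurred can reach values close to 100%.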

In general, inter-rater reliability for the 119 core elements was very high; at F1 it was 99.7% (range 99.2–100) and 97.4% (range 94.5–99.3); at F2, it was 98.4% (range 97.7–98.8) and 93% (range 81.2–98.9); and at F3, it was 87.7% (range 79.7–95.3) and 83% (range 75.6–94.3), for the YMCA PTL and Tools groups, respectively. For the 35 additional elements assessed at F2, inter-rater reliability was 99.6% (range 97.8–100) and 93.12% (range 86.7–98.2), and for the 41 additional elements assessed at F3, it was 100 and 92.7% (range 87.1–97.8), for the YMCA PTL and Tools groups, respectively. The somewhat higher (and less variable) agreement in the YMCA PTL group compared to the Tools group reflects the fact that the observers simply had to agree that the Tools activity (and therefore all of its elements) never occurred.

We calculated implementation fidelity by tallying the number of elements that were present and converting the total to a percentage of the 119 core elements, adjusting for missed activities, at each site, and for each time point. An element was counted as present if both observers agreed that it took place. It was also counted as present on the rare occasion that only one observer was in attendance (see above) and indicated that an element occurred, because inter-rater agreement was very high (see the Section "Results"). We used the same procedure to derive the mean percentage of additional elements implemented at F2 and F3 and report these results separately.
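The fidelity tally can be sketched in the same way. The code below is an illustrative reading of the rule described above, not the study's own scoring script. The names and the `None` coding for a missed observation are hypothetical, and treating an element on which the two observers disagreed as not present is an assumption of this sketch.

```python
from typing import Optional, Sequence


def element_present(a: Optional[bool], b: Optional[bool]) -> Optional[bool]:
    """Combine two observers' ratings of a single element.

    Returns None when neither observer captured the element (a missed activity).
    """
    if a is None and b is None:
        return None          # missed by both: excluded from the denominator
    if a is None:
        return b             # only observer B attended: accept B's rating
    if b is None:
        return a             # only observer A attended: accept A's rating
    return a and b           # assumption: present only if both said yes


def fidelity_percent(
    obs_a: Sequence[Optional[bool]], obs_b: Sequence[Optional[bool]]
) -> float:
    """Percentage of observable elements scored as present at one site visit."""
    scored = [element_present(a, b) for a, b in zip(obs_a, obs_b)]
    observed = [s for s in scored if s is not None]
    if not observed:
        raise ValueError("no elements were observed at this visit")
    return 100.0 * sum(observed) / len(observed)


# Hypothetical ratings for five elements.
observer_1 = [True, True, None, False, None]
observer_2 = [True, False, True, False, None]
print(fidelity_percent(observer_1, observer_2))  # 2 of 4 observable elements present
```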

**Figure 2** shows the results for implementation fidelity of the Tools program. The mean percentage of core elements implemented, based on the sites visited, was 0.15% (range 0–0.4) and 58.4% (range 44.8–69.2) at F1, 1.9% (range 0–5.8) and 54.7% (range 44.9–63.7) at F2, and 0.35% (range 0–0.8) and 48.9% (range 31.3–65.6) at F3, for YMCA PTL and Tools, respectively. As the figure reveals, the type of instruction occurring in the two groups was clearly different. Whereas the Tools activities were virtually absent from the YMCA PTL classrooms, teachers were moderately successful at implementing them in the Tools classrooms. As expected for implementing a new program, there was also considerable cross-site variability in Tools implementation fidelity.

The mean percentage of additional elements implemented was 0 and 0.6% (range 0–2.8) at F2 and 0 and 11.48% (range 0–37.83) at F3, for YMCA PTL and Tools, respectively. As expected, the additional activities were completely absent from the YMCA PTL group. For the Tools group, we found virtually no evidence of implementation at F2 and only modest evidence of implementation of the additional elements at F3, again with cross-site variability.

#### Analysis of Student Outcomes

#### Method of Analysis

In all main analyses, we used a multilevel model to deal with the nesting of the data: children (level-1) who are nested within site (level-2). Multilevel analyses have been deemed ideal to investigate the effects of randomized controlled trials (Wears, 2002). Because the number of sites was relatively low (n = 20), we used Bayesian estimation, which has been shown to have adequate performance under this condition (Hox et al., 2012). This estimation method allowed us to include participants with missing data, including those lost to attrition, based on the missing-at-random assumption (Asparouhov and Muthén, 2010). Accordingly, and because we omit treatment fidelity here, these analyses may be interpreted as intent-to-treat (Gupta, 2011). With Bayesian estimation, significance is assessed based on the 95% credible interval (95% CI). A parameter is significant if the 95% CI does not include zero (i.e., it is significantly different from zero).

Models control for baseline score on each outcome, child sex and child age at T3 (when the outcome was measured). The effect was not significant for any of the outcome measures.

We ran three sets of analyses to address our three research questions. First, we tested for a main effect of curriculum on child outcomes by looking at the effect of curriculum on the average level for child outcomes at the site-level (i.e., at level-2 because randomization occurred at the site-level) while controlling for initial levels of the outcome being tested (i.e., T1 score for each child). The following covariates were included: child sex, child age at the final assessment, initial levels of the outcome being tested (level-1) and curriculum assignment (level-2).<sup>13</sup> Second, we asked whether the curriculum effect varied as a function of children's initial language ability. We did this by adding an interaction term at level-1 between the curriculum assignment and language at T1. And third, we asked whether the curriculum effect varied as a function of children's initial level of hyperactivity/inattention. Accordingly, we included an interaction term at level-1 between the curriculum assignment and parent ratings of hyperactivity/inattention at T1.
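One way to write out the models just described is given below. The notation is assumed here for illustration and is not taken from the original analysis: $Y_{ij}$ is the outcome at the final assessment for child $i$ in site $j$, $Y^{T1}_{ij}$ is the same child's baseline score, $\mathrm{Tools}_{j}$ is a site-level indicator of curriculum assignment, and the normal distributions for the residual terms are the conventional assumption for a two-level random-intercept model.

$$
\begin{aligned}
\text{Level 1 (child):}\quad & Y_{ij} = \beta_{0j} + \beta_{1}\,Y^{T1}_{ij} + \beta_{2}\,\mathrm{Sex}_{ij} + \beta_{3}\,\mathrm{Age}_{ij} + e_{ij}, \qquad e_{ij} \sim N(0, \sigma^{2}) \\
\text{Level 2 (site):}\quad & \beta_{0j} = \gamma_{00} + \gamma_{01}\,\mathrm{Tools}_{j} + u_{0j}, \qquad u_{0j} \sim N(0, \tau^{2})
\end{aligned}
$$

Under this sketch, the main effect of curriculum is $\gamma_{01}$. The two moderation analyses add a child-level moderator $M_{ij}$ (T1 language or T1 hyperactivity/inattention) and its product with curriculum assignment to the level-1 equation, i.e., the terms $\beta_{4}\,M_{ij} + \beta_{5}\,(\mathrm{Tools}_{j} \times M_{ij})$, so that $\beta_{5}$ is the interaction estimate of interest.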

#### Main Effect of Curriculum

The results from the analyses of the main effect of curriculum are shown in **Table 3**. Specifically, we tested whether there were differences in the average level of children's outcome in each site as a function of curriculum assignment. Because there were a large number of analyses, we only report the main effect of curriculum on each outcome for each analysis. We found no evidence that curriculum had an effect on the primary outcomes, either for cohort A only (see **Table 3A**) or for cohorts A and B combined (see **Table 3B**). That is, there were no differences on any of the primary outcomes between children in sites that received Tools instruction and children in sites that received YMCA PTL instruction.

#### Effect of Curriculum Moderated by Initial Language

The results from the analyses addressing whether or not there was an effect of curriculum moderated by initial language ability are shown in **Table 4**. Specifically, we tested whether or not the effect of curriculum varied systematically as a function of a child's initial language ability, i.e., a significant curriculum by T1 language ability interaction. We report only the estimate pertaining to the interaction term, which was not significant, either for Cohort A only (**Table 4A**) or when the two cohorts were combined (**Table 4B**). Thus, we found no evidence that the effect of the Tools curriculum varied as a function of children's initial language ability.

#### Effect of Curriculum Moderated by Initial Hyperactivity/Inattention

Finally, the results from the analyses addressing whether or not there was an effect of curriculum moderated by initial levels of hyperactivity/inattention are shown in **Table 5**. Specifically, we tested whether or not the effect of curriculum varied systematically as a function of initial level of hyperactivity/inattention, i.e., a significant curriculum by T1 level of hyperactivity/inattention (as indicated on the SDQ-P) interaction. The pattern of results was the same for cohort A (**Table 5A**) as for cohorts A and B combined (**Table 5B**), although the results were somewhat stronger for the larger combined sample.

There were significant interactions for HTT10 (i.e., when scoring only as correct or incorrect) and HTT20 (i.e., using the conventional scoring system for HTT) only. To interpret these findings, we report the standardized parameter estimate as a measure of the effect size of curriculum at different levels of child hyperactivity/inattention (1 SD below average, average, 1 SD above average).
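Probing an interaction at 1 SD below average, average, and 1 SD above average amounts to evaluating the curriculum effect at three values of the standardized moderator. The sketch below shows the arithmetic only. The function name and the two coefficient values are hypothetical and chosen for illustration; they are not estimates from this study.

```python
def conditional_effect(b_curriculum: float, b_interaction: float, z: float) -> float:
    """Curriculum effect at moderator value z (z is in SD units from the mean).

    b_curriculum is the curriculum effect when the moderator is at its mean
    (z = 0); b_interaction is the change in that effect per 1 SD of the moderator.
    """
    return b_curriculum + b_interaction * z


# Hypothetical coefficients, for illustration only.
b_curriculum = 0.5
b_interaction = 0.25

for label, z in [("1 SD below average", -1.0), ("average", 0.0), ("1 SD above average", 1.0)]:
    print(label, conditional_effect(b_curriculum, b_interaction, z))
```

With a positive interaction coefficient, the curriculum effect grows as the moderator increases, which is the pattern examined in the subgroup estimates reported here.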

<sup>13</sup>There were no significant effects of sex in any of the models with D/N or HTT as outcome variables. See Supplementary Table S3 in the supplementary online materials.

Unstandardized estimates pertain to the interaction term only. Model controls for baseline score on each outcome, child sex, child age at T3 (when the outcome was measured) and children's initial language ability (standardized PPVT score). The effect was not significant for any of the outcome measures.

TABLE 5 | Results for effect of curriculum moderated by initial level of hyperactivity/inattention.

<sup>∗</sup>p < 0.05. Unstandardized estimates pertain to the interaction term only. Model controls for baseline score on each outcome, child sex, child age at T3 (when the outcome was measured) and initial parent rating of hyperactivity/inattention.

For HTT10, the effect of curriculum was significant at trend level for cohort A and significant at the conventional level for cohorts A and B combined, for children with above average, but not average or below average hyperactivity/inattention. The values for cohort A were β = 0.308, 95% CI [−0.018, 0.641] and 90% CI [0.037, 0.582], β = 0.063, 95% CI [−0.190, 0.319] and β = −0.179, 95% CI [−0.499, 0.127], and the values for the combined cohorts were β = 0.483, 95% CI [0.068, 0.846], β = 0.141, 95% CI [−0.146, 0.395] and β = −0.202, 95% CI [−0.570, 0.166], for above average, average and below average hyperactivity/inattention, respectively.

For HTT20, the effect of curriculum was not significant for any of the hyperactivity subgroups, either for cohort A or for the two cohorts combined. The values for cohort A were β = 0.250, 95% CI [−0.153, 0.662], β = −0.007, 95% CI [−0.345, 0.328] and β = −0.262, 95% CI [−0.650, 0.128], and the values for the combined cohorts were β = 0.338, 95% CI [−0.115, 0.755], β = 0.039, 95% CI [−0.293, 0.355] and β = −0.261, 95% CI [−0.655, 0.158], for above average, average and below average hyperactivity/inattention, respectively.
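The decision rule used throughout (a parameter is significant if its credible interval does not include zero) can be applied mechanically to the intervals reported above. The short sketch below does so. The interval values are those given in the text; the dictionary layout and function name are merely one convenient way to organize them.

```python
# 95% credible intervals for the standardized curriculum effect, as reported in the text.
# Keys: (measure, sample, level of hyperactivity/inattention).
REPORTED_95_CI = {
    ("HTT10", "cohort A", "above average"): (-0.018, 0.641),
    ("HTT10", "cohort A", "average"): (-0.190, 0.319),
    ("HTT10", "cohort A", "below average"): (-0.499, 0.127),
    ("HTT10", "combined", "above average"): (0.068, 0.846),
    ("HTT10", "combined", "average"): (-0.146, 0.395),
    ("HTT10", "combined", "below average"): (-0.570, 0.166),
    ("HTT20", "cohort A", "above average"): (-0.153, 0.662),
    ("HTT20", "cohort A", "average"): (-0.345, 0.328),
    ("HTT20", "cohort A", "below average"): (-0.650, 0.128),
    ("HTT20", "combined", "above average"): (-0.115, 0.755),
    ("HTT20", "combined", "average"): (-0.293, 0.355),
    ("HTT20", "combined", "below average"): (-0.655, 0.158),
}


def excludes_zero(lower: float, upper: float) -> bool:
    """True when zero lies outside the interval [lower, upper]."""
    return lower > 0 or upper < 0


significant = [key for key, (lo, hi) in REPORTED_95_CI.items() if excludes_zero(lo, hi)]
print(significant)  # [('HTT10', 'combined', 'above average')]

# The trend-level result for cohort A rests on the reported 90% interval instead.
print(excludes_zero(0.037, 0.582))  # True
```

Only the HTT10 estimate for children with above average hyperactivity/inattention in the combined cohorts has a 95% interval that excludes zero, which matches the pattern described in the text.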

Thus, we found that amongst children with high levels of initial hyperactivity/inattention, those who received Tools instruction showed significantly greater improvement on our behavioral measure of executive function, one of our key outcome measures, than their peers who received YMCA PTL instruction.

## DISCUSSION

We investigated the effectiveness of the Tools of the Mind preschool curriculum for improving self-regulation in a diverse sample of Canadian preschoolers. We were primarily interested in whether or not Tools instruction would lead to greater improvement on measures of self-regulation and executive function compared to YMCA PTL instruction (the business-as-usual approach) and in whether the effects of curriculum might be moderated by children's initial language and hyperactivity/inattention.

We did not find a main effect of curriculum or a significant interaction between curriculum and children's initial language skills, on any of our outcome measures. However, we found a significant interaction between curriculum and children's initial level of hyperactivity/inattention on one of our executive function tasks. Amongst children with high levels of hyperactivity/inattention, those who received Tools instruction showed significantly greater improvement than those who received YMCA PTL instruction on HTT. The interaction was not significant for D/N, or for any of the scales derived from parents' and teachers' responses on the questionnaires. In keeping with some previous research, then, we found a benefit of Tools instruction for children experiencing the greatest challenges in the development of self-regulation (Diamond et al., 2007; Barnett et al., 2008; Blair and Raver, 2014).

The absence of a main effect of curriculum in the present study contributes to the mixed results from five previous studies evaluating the impact of the full Tools curriculum, three of which found positive effects of Tools and two of which reported positive effects of the comparison curricula. Notably, two of the studies with positive effects of Tools were conducted with homogeneous, low SES, preschool samples (Diamond et al., 2007; Barnett et al., 2008), and in the third study with a larger more variable kindergarten sample, the effects of Tools were considerably stronger in high poverty schools (Blair and Raver, 2014). Our sample of preschoolers varied considerably in SES, perhaps obscuring the benefit of Tools to the low SES children (we were not able to test for SES effects directly as these data were not available for individual children). At least one of the other studies using the full Tools curriculum also had a large, low SES sample (Farran and Wilson, 2014; SES details for the study by Lonigan and Phillips, 2012, which is unpublished, were not available), but the lack of positive effects of Tools in that study could be due to specific features of the comparison group instruction which differed from the comparison curricula in the preschool studies that found positive effects of Tools (see **Table 1**).

The lack of a significant interaction between initial language and curriculum in the present study was somewhat surprising because language plays such a central role in the Tools curriculum. In Tools, language is not only important for interpersonal communication, but many activities in the Tools daily routine effectively instruct children, and provide plenty of practice, in using language to regulate their behavior. One would therefore expect that boosting these capacities in children with relatively poor language skills might lead to greater gains in self-regulation. On the other hand, that we did not find a significant interaction between curriculum and initial language skills indicates that both types of instruction are suitable for children at all levels of language development (i.e., provided they have sufficient capacity to understand the teacher's instructions, which was a criterion for participation in the present study).

However, we found a significant interaction between initial hyperactivity/inattention and curriculum, indicating significantly greater gains from Tools for children with high hyperactivity/inattention. It may be crucial for these children to have numerous, routinized opportunities, distributed throughout the day, to practice self-regulation because their high level of activity and inattentiveness may impede their ability to profit from the opportunities to practice self-regulation that are par for the course of the typical preschool day. That the Tools curriculum is built around a set of self-regulating activities may increase the likelihood that struggling children will have more frequent self-regulating experiences and reap the associated rewards.

It is noteworthy that the interaction between curriculum and initial level of hyperactivity/inattention was significant at trend level for cohort A and significant at the conventional alpha level for the two cohorts combined (likely due to greater statistical power to detect significant effects). The consistent pattern of findings suggests that the benefit of Tools was already beginning to take hold in cohort B, who entered the study later and thus had a shorter duration of exposure to Tools. These children may have benefited from entering a classroom where the Tools program was already underway. This is in line with the Tools theoretical framework that emphasizes the social context of learning. Moreover, the finding that high hyperactive/inattentive children benefited more from Tools, but that children with hyperactivity/inattention in the average range fared similarly well in the two programs, shows that Tools can help students who struggle with self-regulation at no apparent cost to those who are less challenged in this regard. These findings are useful for educators in daycare settings where enrollment is ongoing and who may be concerned about integrating new students struggling with self-regulation and the impact on other students.

It is interesting that although the interaction between curriculum and hyperactivity/inattention held for HTT whether we scored children's responses as right or wrong or gave more credit for responding correctly on the first attempt (the conventional scoring method), the greater impact of Tools for high hyperactivity/inattention children only held for the first approach to scoring. It is possible that another approach to subgrouping children's hyperactivity/inattention scores based on the conventional scoring method might yield greater insights regarding the children for whom Tools is most effective, which may have scientific merit regarding nuances in the development of self-regulation. However, for present purposes, the results for the abbreviated scoring system may be more meaningful. In the interests of improving behavior, and ultimately school readiness, the critical issue was whether Tools instruction could bring about an improvement in children's ability to rein themselves in at all.

It is also interesting that the observed benefits to the Tools children held for HTT but not for D/N, tasks with highly similar cognitive demands but different response modalities. Highly active or inattentive children may enjoy HTT more as it capitalizes on their propensity to move and may find it more challenging to sit still and attend to the cognitive demands of D/N. Young children may also be more familiar with action-based inhibitory control games like Simon Says and musical chairs that are similar to HTT. Finally, the fewer test trials for HTT (10) vs. D/N (16) may also reduce task demands enough for struggling children to demonstrate the extent of their development in self-regulation. It is possible that Tools instruction may eventually yield sufficient improvement in behavioral regulation for better performance on D/N, and these improvements may eventually translate to better behavior at home and at school. In other words, HTT may be optimally suited to tap early, subtle improvements in self-regulation in high hyperactivity/inattention children that were only beginning to occur.

A strength of the present study is that we compared the effectiveness of Tools, a program that targets pretend play, language, and self-regulation, to YMCA PTL, another preschool program that promotes quality play but does not target self-regulation specifically. The benefit of Tools to growth in executive function, at least to some children, suggests that the Tools self-regulatory activities may be an important ingredient of the program. In other words, Tools instruction may benefit learners through its emphasis on self-regulation over and above its emphasis on quality play. Indeed, another reason for the greater sensitivity of HTT compared to D/N for revealing improvements in high hyperactivity/inattention children may be that Tools actually includes activities that are highly similar to HTT (like the freeze game).

Other strengths include random assignment at the site level to the two curricula at the study outset; standardized resources, overall quality of care and teacher preparation across all sites in the participating organization; and that both groups of teachers received professional development and ongoing coaching support in their respective curricula to help sustain acceptable quality in program delivery. Implementation fidelity data confirmed that only the Tools teachers implemented the Tools curriculum and that they did so with moderate success.

The present research also faced challenges beyond our control that could have impacted the study outcome. The participating sample was somewhat limited, perhaps constraining our power to detect further significant effects if present. That said, it was comparable to that in two similar previous studies that also reported significant effects of Tools on selected outcomes (Diamond et al., 2007; Barnett et al., 2008), one of which also included two cohorts of children entering the study at different times (Diamond et al., 2007). On the other hand, studies with larger samples have failed to turn up any significant effects of Tools (see Lonigan and Phillips, 2012; Farran and Wilson, 2014).

Other factors may have posed challenges to the Tools teachers' ability to reach and to sustain a high level of program fidelity. To be sure, teachers' efforts to implement Tools were likely affected by the attrition that occurred just 5 months after the study got underway and the influx of a new cohort of younger children. The disruption likely posed greater challenges for the Tools teachers, who faced a considerable adjustment in pedagogical mindset. This notion is supported by the fact that implementation fidelity in the Tools classrooms was only moderately high, even with ongoing coaching support to help teachers stay on track. Another consideration is that since we did not assess implementation fidelity of the YMCA PTL program, it remains possible that teachers in the Tools group continued to implement some aspects of YMCA PTL, in which they were fully trained prior to participating in the study. However, we believe that this is unlikely. The Tools daily schedule comprises a set of prescribed activities that leave little room for other practices. Indeed, further inspection of the fidelity data showed that the moderate level of success reflected missing elements of the core Tools activities rather than neglecting to implement those activities at all, which may also be a consequence of adapting to the considerable change that occurred to the class composition. Thus, it remains possible that without serious disruptions to attendance, Tools teachers may be able to sustain a sufficiently high level of program fidelity for further gains to manifest in observable behavior.

The focus in previous studies of Tools effectiveness on low SES samples is understandable in light of the well-established link between poverty and school readiness (see e.g., Duncan et al., 1994). However, many preschools serve diverse populations. Moreover, Moffitt et al. (2011) have shown that low self-control is related to poorer outcomes in adulthood – after controlling for SES – and that improving self-control between childhood and adolescence can lead to better outcomes in adulthood. Since SES disparities in school readiness appear to be mediated by children's cognitive skills (e.g., Fitzpatrick et al., 2014), it may be more useful to early childhood educators to understand the cognitive and behavioral characteristics (such as language and hyperactivity/inattention) of the children for whom Tools is most effective, in determining the curriculum best suited to the populations they serve.

That both programs promote quality play and children in the two groups made similar gains on most of the study measures suggests that the Tools and YMCA PTL curricula both have merits for meeting the needs of children for whom self-regulation is in the typical range. However, Tools may offer an advantage in classrooms with children experiencing greater challenges to self-regulation. Importantly, our finding that Tools did not have a negative impact on children with hyperactivity/inattention in the average range suggests that the positive impact of Tools on children with high hyperactivity/inattention can occur without disadvantage to children less challenged in this regard. Given the considerable, long-term consequences of poor self-regulation and its impact on developing a healthy quality of life, the Tools program may be a worthy investment.

## AUTHOR CONTRIBUTIONS

RT, BF, and PC conceptualized the study with input from LH. TS and RT designed the study. TS operationalized and executed it. AO, HF, PC, and GG contributed to data collection and coding. AP conceptualized and carried out the analytic plan with input from TS and RT. TS wrote the manuscript with input from all of the authors.

## FUNDING

This study was supported in part by internal funding at the Hospital for Sick Children (BF, RT), the Canada Research Chairs Program (RT), and by support in kind by the YMCA of Canada (LH).

## ACKNOWLEDGMENTS

We are grateful to the daycare directors, staff and students who agreed to participate in the study, and also to Tricia Holzworth at the YMCA for support with the study execution. We extend our sincere thanks to the Tools developers Elena Bodrova and Debra Leong, as well as to Ann DeCico and the other Tools trainers for their support throughout the study.

## SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpsyg.2017.02366/full#supplementary-material

## REFERENCES


Luria, A. R. (1966). Higher Cortical Functions in Man. New York, NY: Basic Books.



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Solomon, Plamondon, O'Hara, Finch, Goco, Chaban, Huggins, Ferguson and Tannock. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

fpsyg-08-01187 July 11, 2017 Time: 15:44 # 1

## CanDiD: A Framework for Linking Executive Function and Education

Niki H. Kamkar\* and J. B. Morton\*

Department of Psychology, University of Western Ontario, London, ON, Canada

The close association between executive functions (EFs) and educational achievement has led to the idea that targeted EF training might facilitate learning and goal-directed behavior in the classroom. The evidence that training interventions have long-lasting and transferable effects is, however, decidedly mixed (Melby-Lervåg and Hulme, 2013; Simons et al., 2016). The goal of the current paper is to propose a new CanDiD framework for re-thinking EF and its links to education. Based on findings from basic EF research, the proposed CanDiD framework highlights dynamic and contextual influences on EF and emphasizes the importance of development and individual differences for understanding these effects. Implications for remedial interventions and curriculum design are discussed.

#### Edited by:

Jacob A. Burack, McGill University, Canada

#### Reviewed by:

Ayelet Lahat, McMaster University, Canada

Ruth Ford, Anglia Ruskin University, United Kingdom

#### \*Correspondence:

Niki H. Kamkar nhossei@uwo.ca

J. B. Morton bmorton3@uwo.ca

#### Specialty section:

This article was submitted to Developmental Psychology, a section of the journal Frontiers in Psychology

Received: 15 March 2017 Accepted: 29 June 2017 Published: 13 July 2017

#### Citation:

Kamkar NH and Morton JB (2017) CanDiD: A Framework for Linking Executive Function and Education. Front. Psychol. 8:1187. doi: 10.3389/fpsyg.2017.01187

Keywords: executive function, education, dynamics, development, individual differences, CanDiD

Executive functions (EFs) are a set of processes that are critical for organizing thought and behavior in the service of achieving goals. Although there is no consensus on the specific processes that comprise the "set," there is general agreement that:


Therefore, thoughts and actions governed by EF can be distinguished from habits, or crystalized forms of mental activity that are acquired gradually through repeated practice and that provide fixed automatic solutions to well-defined problems.

Understanding the underlying causes of EF development remains a fundamental challenge. One influential position highlights the importance of experience by characterizing the development of EF as a form of skill-learning (Klingberg, 2014). On this view, everyday experience provides opportunities to maintain small amounts of information, filter out salient distractors, and examine situations from multiple vantage points. These experiences are important as they provide children with opportunities to exercise nascent executive skills, and drive functional and anatomical re-organization of associated brain networks. Over time, cognitive and neurophysiological mechanisms underlying EF become more practiced, and by extension, increasingly adult-like. Consistent with this account, targeted practice of working-memory (Klingberg et al., 2002) and task-switching (Karbach and Kray, 2009) paradigms begets measurable changes in cognitive and neurophysiological measures of higher-order cognition (Olesen et al., 2004).

## EF AND EDUCATION

To the extent that EF lends intelligence to thinking, there has been a long-standing interest in the connection between EF and education. Do individual differences in EF predict success in educational contexts? If so, why? And can interventions that target EF facilitate learning and behavior in the classroom? What we have learned to date is that there is a close connection between EF and achievement in academic settings. Although the reasons are manifold, one is that EF is critical for learning (Bull et al., 2008; Clark et al., 2010). Acquiring new skills in the classroom has much to do with how students organize, seek, and evaluate information, aspects of thinking that depend on EF. EF is also important for managing challenges, be they purely intellectual or socio-emotional in nature. For example, EF predicts not only SAT scores, but also a capacity to cope with stress, uncertainty, and conflict (Mischel et al., 1989).

One implication is that interventions that target EF can facilitate focused behavior in the classroom, especially among students prone to distraction. Of available approaches, working-memory training, in which participants mentally maintain and manipulate small amounts of information over a short delay, is perhaps the most widely recognized. The general approach involves assigning a child a daily regimen of computer-based tasks that demand short-term maintenance and manipulation of small amounts of information. As proficiency improves, the tasks become incrementally more difficult. On some accounts, working-memory training is highly effective not only at remediating EF-related problems, such as short-term memory difficulties among children with ADHD (Klingberg et al., 2002, 2005), but also in promoting "general cognitive enhancement" (Morrison and Chein, 2011), evident in abilities beyond those specifically practiced (Holmes et al., 2009; Bergman-Nutley and Klingberg, 2014).

## Summary and Challenges

One challenge is that evidence for the effectiveness of "EF-training" programs is highly inconsistent. Evaluating working-memory training focuses on the issue of far-transfer effects, namely evidence that training working-memory generalizes to tasks that are different from the task trained on. While evidence of far-transfer effects is arguably the most important for evaluating the utility of working-memory training for use in educational contexts, it also proves to be the least reliable (Melby-Lervåg and Hulme, 2013; Simons et al., 2016). For example, a meta-analysis revealed that there is no convincing evidence that training on working-memory would generalize to other skills including inhibitory control (Melby-Lervåg and Hulme, 2013). More recently, Simons et al. (2016) reported that while there is a body of evidence in support of brain-training interventions improving performance on the trained tasks, there is little evidence of far-transfer to distantly related tasks or everyday cognitive performance. Indeed, these recent reviews call attention to the weakness in available data, and draw dim conclusions regarding the utility of targeted working-memory training (Melby-Lervåg and Hulme, 2013; Simons et al., 2016).

## CanDiD: AN ALTERNATIVE FRAMEWORK FOR LINKING EF AND EDUCATION

In light of this, we propose a new framework for considering links between EF and the classroom. Termed CanDiD, the framework emphasizes Contextual and Dynamic aspects of EF (CanD), and the importance of Development and Individual Differences (DiD). It is based on three assumptions. First, EF is dynamic and subject to contextual influences. Second, development is more than practice, insofar as development constrains the emergent dynamics and contextual influences governing EF. And third, individual differences are fundamental to EF. These assumptions are based on cognitive and neurophysiological studies of EF and its development and have unique implications for thinking about the relationship between EF and the classroom.

## Contextual and Dynamic Aspects of EF

Underemphasized in most cognitive and neurophysiological models is the fact that EF is by its very nature dynamic. Interference suppression, working-memory, and mental flexibility are all subject to a variety of intrinsic (i.e., internal to the child) and extrinsic (i.e., external to the child) influences that lead to continuous and patterned change in the efficacy of these processes over short periods of time. Even cortical networks putatively linked to EF dynamically vary over short timescales, with the nature and complexity of this variability intrinsic to the function of these networks (Hutchison and Morton, 2015; Medaglia et al., 2015; Nomi et al., 2016). Indeed, dynamic variation appears to be a fundamental characteristic of brain function that constrains even elementary aspects of behavior and cognition (McIntosh et al., 2008; Busch et al., 2009; Kucyi and Davis, 2014).

Intrinsic influences that lead to dynamic variability in the efficacy of EF include the body's natural circadian rhythm. The circadian rhythm is an evolutionarily ancient 24-h cycle of arousal governed by a neuroendocrine circadian clock. Although endogenous, or self-regulating, the circadian rhythm is entrained to the external world through the influence of external cues including light and temperature. Diurnal variations in arousal linked to the circadian rhythm impact EF (Hahn et al., 2012). These effects appear to be specific to effortful forms of cognition such as EF. Indeed, explicit – or effortful – forms of memory retrieval operate best during optimal times of an individual's circadian cycle, whereas implicit – or effortless – forms of memory retrieval operate best during non-optimal periods of an individual's circadian cycle. Taken together, these findings point to endogenously governed dynamic changes in thinking styles that evolve over a 24-h period, with effortful and automatic forms of thinking predominating during "optimal" and "non-optimal" circadian periods respectively.

Extrinsic influences on EF are manifold, and contribute to dynamic variation in the efficacy of EF that plays out on multiple time scales. One example, referred to as the Gratton effect (Gratton et al., 1992), is driven by the statistics of a task environment, such that tasks saturated with incongruent stimuli show smaller interference effects than do tasks saturated with congruent trials. These effects can be highly localized in time such that the magnitude of an interference effect is markedly attenuated following a single incongruent trial relative to when the same interference effect is measured following a single congruent trial. Varying task contexts are associated with distinct profiles of activity in the brain (Wilk et al., 2012), underscoring the idea that neurocognitive processes that manage conflict are not isomorphic, but subject to dynamic and contextual variability.
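The trial-by-trial form of this effect can be illustrated with a short calculation. The sketch below is not taken from any of the cited studies: the function names, the `(congruent, rt)` trial format, and the reaction times are all hypothetical, chosen only to show how an interference effect can be computed separately for trials that follow a congruent trial and trials that follow an incongruent one.

```python
from statistics import mean

def interference(trials):
    """Mean RT on incongruent trials minus mean RT on congruent trials."""
    incongruent = [rt for congruent, rt in trials if not congruent]
    congruent = [rt for is_congruent, rt in trials if is_congruent]
    return mean(incongruent) - mean(congruent)

def interference_by_previous_trial(trials):
    """Split trials by the congruency of the preceding trial.

    Returns (interference after congruent, interference after incongruent).
    The first trial has no predecessor and is therefore skipped.
    """
    pairs = list(zip(trials, trials[1:]))
    after_congruent = [cur for prev, cur in pairs if prev[0]]
    after_incongruent = [cur for prev, cur in pairs if not prev[0]]
    return interference(after_congruent), interference(after_incongruent)

# Hypothetical data: (is_congruent, reaction time in ms), in presentation order.
trials = [
    (True, 500), (True, 510), (False, 600), (False, 560), (True, 520),
    (False, 590), (True, 530), (True, 505), (False, 610), (False, 565),
]

after_con, after_inc = interference_by_previous_trial(trials)
print(f"Interference after congruent trial:   {after_con} ms")
print(f"Interference after incongruent trial: {after_inc} ms")
```

With these made-up values the interference effect is smaller after an incongruent trial than after a congruent one, which is the pattern the paragraph above describes. Real analyses would typically also exclude errors and outliers, which this sketch omits.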

Other extrinsic influences that lead to dynamic variations in the efficacy of EF include stress and sleep. Acute stress causes a shift in an organism's learning style, away from an effortful construction of an allocentric model of the world toward an automatic reward-driven shaping of behavior (Shafiei et al., 2012). For reasons that are not well understood, sleep duration and quality are linked to the efficacy of EF, with these associations potentially stronger in children than in adults (for review, see Turnbull et al., 2013).

In summary, EF is by its very nature dynamic. Core processes, be they cognitively or neurophysiologically conceived, are subject to a variety of intrinsic and extrinsic influences that lead to change in the nature and efficacy of these processes over short periods of time.

## Development: Dynamic and Contextual Constraints

The proposed CanDiD framework assumes that development qualitatively transforms the cognitive, neurophysiological, and neuroanatomical foundations of EF, and therefore at any point in time, fundamentally constrains the dynamical nature of, and contextual influences on, EF.

As one illustration, consider dynamic variation in cortical networks putatively linked to EF. Dynamic variations in cortical network connectivity are an emergent property of highly connected and highly interactive systems such as the brain. Key structural properties of the brain, including path length (i.e., the distance traveled by signals in the brain), conduction velocity (i.e., how rapidly signals travel along pathways in the brain), and signal integrity (i.e., the signal to noise ratio) constrain emergent dynamics (Deco et al., 2011), but also change with development owing to changes in brain size (affecting path length), white matter myelination (affecting conduction velocity), and neurotransmitter availability and receptor density (affecting signal to noise ratio). Thus, development constrains emergent brain dynamics, with potential consequences for the efficacy of EF (McIntosh et al., 2008; Dajani and Uddin, 2015; Medaglia et al., 2015; Hutchison and Morton, 2016; Marusak et al., 2017).

In a similar vein, development constrains how contextual factors impact EF. Throughout development, there are profound changes in sleep duration, onset, and architecture owing in part to changes in the circadian regulation of the sleep-wake cycle. Consequently, optimal periods of the day for effortful goal-directed cognition can be quite different for toddlers, children, and adolescents. Similarly, the proximal and long-term effects of sleep restriction on EF also likely differ for toddlers, children, and adolescents (Bernier et al., 2010; Turnbull et al., 2013). Even the contextual modulation of working-memory and conflict processing efficacy is constrained by development. Whereas older participants retain information about prior processing context and carry this information forward in anticipation of forthcoming cognitive challenges, younger participants treat individual trials as separate instances. Age-related differences in the dynamic adaptation of EF are evident not only in behavior (Chatham et al., 2009), but also in patterns of evoked brain activity (Waxer and Morton, 2011; Wilk and Morton, 2012).

In summary, while the importance of experience for the development of EF is undeniable, it is also the case that development constrains how contextual factors impact EF. Furthermore, age-related differences in EF are likely not reducible to differences in practice, as cognitive, neurophysiological, and neuroanatomical foundations of EF are subject to qualitative transformation over time. Therefore, the CanDiD framework emphasizes the importance of development for understanding contextual influences on emerging EF.

#### Individual Differences

Despite the evidence demonstrating typical age-related changes in EF (De Luca et al., 2003), there are substantial inter-individual differences in EF at all developmental stages. The proposed CanDiD model assumes that inter-individual differences are a central characteristic of EF, and that they are not reducible to variation between good and poor EF but reflect a diversity of strategies or approaches to organizing goal-directed behavior and cognition.

As a cognitive trait, EF varies from individual to individual as a consequence of both environmental and genetic factors. Aspects of the early environment such as parental sensitivity (Bernier et al., 2010; Blair et al., 2011, 2014; Hammond et al., 2012) and exposure to adversity (Kamkar et al., 2017) impact EF at a population level by influencing mean EF scores of large groups. At the same time, individual variation around the population mean can be largely accounted for by genetic differences between individuals, given that EF and associated networks are highly heritable (Polderman et al., 2007; Friedman et al., 2008; Lenroot et al., 2009; Anokhin et al., 2011; Miyake and Friedman, 2012).
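The distinction between influences on the population mean and influences on variation around that mean can be made concrete with the standard variance decomposition from quantitative genetics. This is a simplified sketch rather than a model proposed in the studies cited above; it assumes additive, independent genetic and environmental contributions, and the symbols $V_P$, $V_G$, $V_E$, and $c$ are introduced here for illustration only.

$$
V_P = V_G + V_E, \qquad H^2 = \frac{V_G}{V_P}
$$

Here $V_P$ is the variance in EF scores across individuals, $V_G$ and $V_E$ are the portions attributable to genetic and environmental differences, and $H^2$ is heritability. An environmental influence that shifts every individual's score by the same amount $c$ changes the group mean but not the variance, since $\mathrm{Var}(P + c) = \mathrm{Var}(P)$. Under these assumptions, an environment can move mean EF scores of a group while differences between individuals within that group remain largely genetic in origin.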

The close connection between environmental and genetic influences suggests variation in EF does not follow a continuum of good to poor, but reflects a principled relationship between EF and the nature of a child's early environment. One example is gene-environment correlation, whereby individuals select environments that match their own genetic propensities (Scarr and McCartney, 1983; Plomin and Deary, 2015). This is best reflected in age-related increases in heritability estimates of EF and related constructs like intelligence (Deary et al., 2009, 2010, 2012; Haworth et al., 2010; Tucker-Drob et al., 2013; Tucker-Drob and Briley, 2014; Plomin and Deary, 2015; Plomin et al., 2016). Another example is gene-environment interactions in which certain genetic variants bestow phenotypic stability while others bestow phenotypic plasticity (Bennett et al., 2002; Belsky and Pluess, 2009). Gene-environment interactions are evident in selected aspects of EF such as self-regulation and decision-making (Carré et al., 2012).

Taken together then, there is evidence that diversity is not only the starting point of development, but is also evident in developmental trajectories. Young children differ in the way they strategically organize their thoughts and actions, and will consequently seek out environments that complement their preferred approach to self-regulation as they grow older. In light of this important inter-individual variability, the CanDiD framework emphasizes differences between children in terms of the development of EF.

## SPEAKING "CanDiD-ly" ABOUT EF AND EDUCATION

## CanD: Context and Dynamics

Recognizing the contextual and dynamic nature of EF casts a new light on the relationship between EF and the classroom. Re-thinking this relationship has implications for how we understand and manage student behavior in educational settings. Consider, as an example, inattentiveness and distractibility in the classroom. If we approach the analysis of this style of thinking from the standpoint of EF as a stable cognitive trait that can be trained through targeted practice, we isolate this style of thinking from the context in which it evolves and overlook important intra-individual variability in intellectual focus that might serve as a critical building-block for remediation. Thinking "CanDiD-ly," on the other hand, shifts priorities toward cataloging potential contextual influences on EF and identifying variability in attentiveness over time. For instance, are the child's inattentiveness and distractibility related to unhealthy sleep routines? Is the child acutely (or chronically) stressed, either at home or amongst their peers? Is the child more attentive at certain times of the day than others? Thinking "CanDiD-ly" about inattentiveness gives priority to contextual influences on and dynamic variation in EF-related behavior. It also underscores the importance of working closely with students and parents to identify factors that influence a child's ability to concentrate, or times of the day when a student's focus and readiness to learn are optimal. Consistent with the spirit of this suggestion is the American Academy of Pediatrics recommendation that middle and high schools start at 8:30 am or later so that students can obtain the 8.5–9.5 h of sleep they require (Adolescent Sleep Working Group, 2014). Implicit in this approach is the notion that variation in inattentiveness can be part of a larger cycle of arousal (diurnal or otherwise). Thinking "CanDiD-ly," the priority becomes adjusting the child's environment or daily routine to maximize the likelihood that the instructor is teaching when the children are ready to learn.

## DiD: Development and Individual Differences

Thinking "CanDiD-ly" about EF, we need to recognize there are qualitatively different styles of learning that are deeply rooted in the nature of individual children. One implication is a need to move from passive modes of instruction to active modes in which children are granted more active roles in selecting, modifying, and creating their own educational experiences. On this view, an "equal" educational system is not one in which all children are exposed to exactly the same environments, but one in which all children are given an opportunity to select learning environments that accommodate their learning strengths (Asbury and Plomin, 2013).

Furthermore, rather than using computerized tasks that train a narrow range of cognitive processes, programs that allow for broad practice in EF-promoting activities may be more successful. Aerobic exercise, pretend play, yoga, and mindfulness meditation have all been implicated in improving EF (Diamond and Lee, 2011). A curriculum that targets broad activities and has shown promise in improving EF is the Tools of the Mind (Tools) curriculum, which comprises activities including pretend play, self-regulatory private speech, and dramatic arts. These activities are said to promote EF because they require inhibitory control. For example, in dramatic arts, children must inhibit acting out of character (Diamond and Lee, 2011). When compared against the District's version of Balanced Literacy curriculum (dBL), participants in the Tools curriculum significantly outperformed those in the dBL curriculum on measures of inhibitory control (Diamond et al., 2007). Tools differs from the working-memory training discussed previously because it allows for practice in a broad range of EF-promoting activities, rather than training on a specific task and expecting gains on a construct as broad as EF; thus, Tools may not suffer from as many issues related to far-transfer effects.

## CONCLUDING REMARKS

The present paper offers a new framework for thinking about EF and its links with education, one that is informed by basic research into the nature of EF and its development, and departs from more conventional approaches to these issues. With this framework, researchers are at liberty to conduct studies that assess what contextual and dynamic factors might constrain EF, and how individual differences at the genetic and environmental levels might be related to the development of EF. In the context of education, using a CanDiD approach may allow teachers to take note of these contextual, dynamic, and individual factors and to use this knowledge in tailoring an educational curriculum that considers the needs of the child rather than expecting children to adjust to a one-size-fits-all education system.

## AUTHOR CONTRIBUTIONS

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

## FUNDING

This study was funded by the Natural Sciences and Engineering Research Council of Canada (http://dx.doi.org/10.13039/501100000038).

## REFERENCES


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Kamkar and Morton. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.