EXECUTIVE FUNCTION(S): CONDUCTOR, ORCHESTRA OR SYMPHONY? TOWARDS A TRANS-DISCIPLINARY UNIFICATION OF THEORY AND PRACTICE ACROSS DEVELOPMENT, IN NORMAL AND ATYPICAL GROUPS

EDITED BY: Lynne A. Barker and Nicholas Morton

PUBLISHED IN: Frontiers in Behavioral Neuroscience

#### Frontiers Copyright Statement

© Copyright 2007-2018 Frontiers Media SA. All rights reserved. All content included on this site, such as text, graphics, logos, button icons, images, video/audio clips, downloads, data compilations and software, is the property of or is licensed to Frontiers Media SA ("Frontiers") or its licensees and/or subcontractors. The copyright in the text of individual articles is the property of their respective authors, subject to a license granted to Frontiers.

The compilation of articles constituting this e-book, wherever published, as well as the compilation of all other content on this site, is the exclusive property of Frontiers. For the conditions for downloading and copying of e-books from Frontiers' website, please see the Terms for Website Use. If purchasing Frontiers e-books from other websites or sources, the conditions of the website concerned apply.

Images and graphics not forming part of user-contributed materials may not be downloaded or copied without permission.

Individual articles may be downloaded and reproduced in accordance with the principles of the CC-BY licence subject to any copyright or other notices. They may not be re-sold as an e-book.

As author or other contributor you grant a CC-BY licence to others to reproduce your articles, including any graphics and third-party materials supplied by you, in accordance with the Conditions for Website Use and subject to any copyright notices which you include in connection with your articles and materials.

All copyright, and all rights therein, are protected by national and international copyright laws.

The above represents a summary only. For the full conditions see the Conditions for Authors and the Conditions for Website Use.

ISSN 1664-8714; ISBN 978-2-88945-555-3; DOI 10.3389/978-2-88945-555-3

# EXECUTIVE FUNCTION(S): CONDUCTOR, ORCHESTRA OR SYMPHONY? TOWARDS A TRANS-DISCIPLINARY UNIFICATION OF THEORY AND PRACTICE ACROSS DEVELOPMENT, IN NORMAL AND ATYPICAL GROUPS

Topic Editors:

Lynne A. Barker, Sheffield Hallam University, United Kingdom

Nicholas Morton, Tickhill Road Hospital, United Kingdom

Sagittal view of brain and eye - drawn and illustrated by Dr L. A. Barker.

There are several theories of executive function(s) that share some theoretical overlap yet are conceptually distinct, each bolstered by empirical data (Norman & Shallice, 1986; Shallice & Burgess, 1991; Stuss & Alexander, 2007; Burgess, Gilbert & Dumontheil, 2007; Burgess & Shallice, 1996; Miyake et al., 2000). The notion that executive processes are supervisory, and most in demand in novel situations, was an early conceptualization of executive function that has been adapted and refined over time (Norman & Shallice, 1986; Shallice, 2001; Burgess, Gilbert & Dumontheil, 2007). Presently there is general consensus that executive functions are multicomponential (Shallice, 2001), and are supervisory only in the sense that attention in one form or another is key to the co-ordination of other hierarchically organized 'lower' cognitive processes. Attention in this sense is defined as (i) independent but interrelated attentional control processes (Stuss & Alexander, 2007); (ii) automatic orientation towards stimuli in the environment or internally driven thought (Burgess, Gilbert & Dumontheil, 2007); (iii) the automatically generated interface between tacit processes and strategic conscious thought (Barker, Andrade, Romanowski, Morton & Wasti, 2006; Morton & Barker, 2010); and (iv) distinct but interrelated executive processes that maintain, update and switch across different sources of information (Miyake et al., 2000).

One problem is that executive dysfunction, or dysexecutive syndrome (Baddeley & Wilson, 1988), after brain injury typically produces a constellation of deficits across social, cognitive, emotional and motivational domains that rarely map neatly onto theoretical frameworks (Barker, Andrade & Romanowski, 2004). As a consequence, there is debate about whether conceptual theories of executive function correspond well to the clinical picture (Manchester, Priestley & Jackson, 2004). Several studies have reported cases of individuals with frontal lobe pathology and impaired daily functioning despite little detectable impairment on traditional tests of executive function (Shallice & Burgess, 1991; Eslinger & Damasio, 1985; Barker, Andrade & Romanowski, 2004; Andrés & Van der Linden, 2002; Chevignard et al., 2000; Cripe, 1998; Fortin, Godbout & Braun, 2003). There is also some suggestion that weak ecological validity limits the predictive and clinical utility of many traditional measures of executive function (Burgess et al., 2006; Lamberts, Evans & Spikman, 2010; Barker, Morton, Morrison & McGuire, 2011). Complete elimination of environmental confounds runs the risk of generating results that cannot be generalized beyond the constrained circumstances of the test environment (Barker, Andrade & Romanowski, 2004). Several researchers have concluded that a new approach is needed that is mindful of the needs of the clinician yet also informed by academic debate and progress within the discipline (McFarquhar & Barker, 2012; Burgess et al., 2006). Finally, translational issues also confound executive function research across different disciplines (psychiatry, cognitive science, and developmental psychology) and across typically developing and clinical populations (including Autism Spectrum Disorders, Head Injury and Schizophrenia; Blakemore & Choudhury, 2006; Taylor, Barker, Heavey & McHale, 2013). Consequently, there is a need for unification of executive function approaches across disciplines and populations, and for narrowing of the conceptual gap between theoretical positions, clinical symptoms and measurement.

Citation: Barker, L. A., Morton, N., eds (2018). Executive Function(s): Conductor, Orchestra or Symphony? Towards a Trans-Disciplinary Unification of Theory and Practice Across Development, in Normal and Atypical Groups. Lausanne: Frontiers Media. doi: 10.3389/978-2-88945-555-3

# Table of Contents

*Editorial: Executive Function(s): Conductor, Orchestra or Symphony? Towards a Trans-Disciplinary Unification of Theory and Practice Across Development, in Normal and Atypical Groups*

Lynne A. Barker and Nicholas Morton

# CHAPTER 1

# EXECUTIVE FUNCTIONS ACROSS THE LIFESPAN IN CLINICAL AND NORMATIVE COHORTS

# CHAPTER 1.1: EXECUTIVE FUNCTIONS AND DEVELOPMENT

*The Longitudinal Development of Social and Executive Functions in Late Adolescence and Early Adulthood*

Sophie J. Taylor, Lynne A. Barker, Lisa Heavey and Sue McHale


# CHAPTER 1.2: EXECUTIVE FUNCTION/ATTENTION AND INTELLIGENCE

*"Executive Functions" Cannot be Distinguished From General Intelligence: Two Variations on a Single Theme Within a Symphony of Latent Variance*

Donald R. Royall and Raymond F. Palmer

*The Relationship Between Executive Functions and Fluid Intelligence in Schizophrenia*

María Roca, Facundo Manes, Marcelo Cetkovich, Diana Bruno, Agustín Ibáñez, Teresa Torralva and John Duncan


*Neurons: Effects on Attention*

Ines Villano, Antonietta Messina, Anna Valenzano, Fiorenzo Moscatelli, Teresa Esposito, Vincenzo Monda, Maria Esposito, Francesco Precenzano, Marco Carotenuto, Andrea Viggiano, Sergio Chieffi, Giuseppe Cibelli, Marcellino Monda and Giovanni Messina

# CHAPTER 1.3: FUNCTIONAL CONNECTIVITY, EXECUTIVE FUNCTION AND WORKING MEMORY

*Event-Related Potentials Altered in Patients With Borderline Personality Disorder During Working Memory Tasks*

Ying Liu, Mingtian Zhong, Chang Xi, Xinhu Jin, Xiongzhao Zhu, Shuqiao Yao and Jinyao Yi


Xiaojing Fang, Yuanchao Zhang, Yuan Zhou, Luqi Cheng, Jin Li, Yulin Wang, Karl J. Friston and Tianzi Jiang

# CHAPTER 1.4: EXECUTIVE FUNCTION AND BILINGUALISM

*Executive Function and Bilingualism in Young and Older Adults*

Shanna Kousaie, Christine Sheppard, Maude Lemieux, Laura Monetta and Vanessa Taler

# CHAPTER 1.5: EXECUTIVE FUNCTION AND NEUROPATHOLOGY

*Neurobehavioral Abnormalities Associated With Executive Dysfunction After Traumatic Brain Injury*

Rodger Ll. Wood and Andrew Worthington

*Metacognitive Aspects of Executive Function are Highly Associated With Social Functioning on Parent-Rated Measures in Children With Autism Spectrum Disorder*

Tonje Torske, Terje Nærland, Merete G. Øie, Nina Stenberg and Ole A. Andreassen

# CHAPTER 1.6: EXECUTIVE FUNCTION AND INHIBITORY CONTROL

*Automatic Inhibition and Habitual Control: Alternative Views in Neuroscience Research on Response Inhibition and Inhibitory Control*

Agnes J. Jasinska

# CHAPTER 1.7: EXECUTIVE FUNCTION AND EXPERIENCE OF PAIN

*Cognitive Impairment in Patients With Chronic Neuropathic or Radicular Pain: An Interaction of Pain and Age*

Orla Moriarty, Nancy Ruane, David O'Gorman, Chris H. Maharaj, Caroline Mitchell, Kiran M. Sarma, David P. Finn and Brian E. McGuire

# CHAPTER 2

# EXECUTIVE FUNCTIONS, MEASUREMENT, MAINTENANCE AND TRAINING

# CHAPTER 2.1: EXECUTIVE FUNCTIONS: MEASUREMENT


Brian E. McGuire


# CHAPTER 2.2: EXERCISE AND EXECUTIVE FUNCTIONS

*"Neural Efficiency" of Athletes' Brain During Visuo-Spatial Task: An fMRI Study on Table Tennis Players*

Zhiping Guo, Anmin Li and Lin Yu

*Executive Function and Endocrinological Responses to Acute Resistance Exercise*

Chia-Liang Tsai, Chun-Hao Wang, Chien-Yu Pan, Fu-Chen Chen, Tsang-Hai Huang and Feng-Ying Chou


Chia-Liang Tsai, Chun-Hao Wang, Chien-Yu Pan and Fu-Chen Chen

# Editorial: Executive Function(s): Conductor, Orchestra or Symphony? Towards a Trans-Disciplinary Unification of Theory and Practice Across Development, in Normal and Atypical Groups

#### Lynne A. Barker<sup>1</sup>\* and Nicholas Morton<sup>2</sup>

<sup>1</sup> Reader in Cognitive Neuroscience, Brain, Behaviour and Cognition Group, Department of Psychology, Sociology and Politics, Sheffield Hallam University, Sheffield, United Kingdom, <sup>2</sup> Consultant Clinical Neuropsychologist, Neuro-Rehabilitation Services, Rotherham & South Humber NHS Trust, Tickhill Road Hospital, Doncaster, United Kingdom

Keywords: executive functions, development, psychometrics, imaging, three-dimensional, neuropathology

#### **Editorial on the Research Topic**

#### Edited and reviewed by:

Nuno Sousa, Instituto de Pesquisa em Ciências da Vida e da Saúde (ICVS), Portugal

> \*Correspondence: Lynne A. Barker l.barker@shu.ac.uk

Received: 29 March 2018 Accepted: 19 April 2018 Published: 08 May 2018

#### Citation:

Barker LA and Morton N (2018) Editorial: Executive Function(s): Conductor, Orchestra or Symphony? Towards a Trans-Disciplinary Unification of Theory and Practice Across Development, in Normal and Atypical Groups. Front. Behav. Neurosci. 12:85. doi: 10.3389/fnbeh.2018.00085

# **Executive Function(s): Conductor, Orchestra or Symphony? Towards a Trans-Disciplinary Unification of Theory and Practice Across Development, in Normal and Atypical Groups**

One problem with well-established executive function theories is that developmental disorders, brain injury, neuropathology, psychiatric conditions, and cognitive decline typically produce cross-cutting problems in social, cognitive, and emotional domains that seldom correspond to executive function models. Consequently, there is an argument that conceptual theories of executive function do not accord with clinical presentation (Manchester et al., 2004), and that executive function tests have limited predictive clinical utility (Barker et al., 2004; Burgess et al., 2006). Currently, there is a need to unify executive function approaches across disciplines, populations, and the life span, and to narrow the conceptual gap between theoretical positions, clinical symptoms, and measurement.

This research topic includes findings on the development of executive functions in childhood, adolescence, and early adulthood. Taylor et al. found that executive functions developed non-linearly in late adolescence and early adulthood, with peaks and troughs in executive ability corresponding to morphological brain change at these age ranges. These findings have ramifications for understanding the normal and abnormal development of executive functions. The reviewed evidence also indicates that working memory, attention, and inhibitory control develop alongside timekeeping skills and may depend upon shared but distinct neural substrates (Vicario).

One possibility is that cognitively controlled timing skills make a unique contribution to the development of working memory and executive functions. Hsu et al. reviewed studies on the development and malleability of Executive Control (EC), which is defined as the capacity to regulate cognitive processes for successful goal attainment. They concluded that targeted EC training interventions would likely benefit children from low socio-economic status backgrounds and those with attention-deficit disorders, although early findings are insufficient to warrant firm conclusions. The use of new neuroimaging techniques and a better understanding of the mechanisms underpinning EC training could further inform developmental interventions for targeted populations.

This special edition includes evidence that performance on supposed executive function and memory measures may depend upon some shared process defined as fluid intelligence. Royall and Palmer used a latent variable structural equation modeling approach to distinguish domain-specific variance in executive function and memory measures from shared cognitive variance defined by Spearman's g (where g represents general intelligence). When variance was accounted for across several memory and executive function measures, executive function ability overlapped with intelligence scores in a healthy elderly cohort. These findings have implications for the classification and specificity of executive functions, for measurement and assessment, and for purported neural substrates. Similarly, impaired performance on a battery of executive function tests by schizophrenic patients was mostly explained by deficits in fluid intelligence (Roca et al.). Importantly, when fluid intelligence was partialled out, multitasking and decision-making performance deficits remained, indicating selective executive deficits in addition to general cognitive deficits.
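As a rough illustration of what "partialling out" fluid intelligence involves, the following Python sketch estimates a patient-control difference on two tasks before and after adjusting for a fluid intelligence score. It is a minimal sketch on synthetic data and is not the analysis reported by Roca et al.; the function names, sample size, and effect sizes are all hypothetical. On one simulated task the group deficit is carried entirely by fluid intelligence, and on the other an additional, selective deficit is built in.

```python
import random
from statistics import fmean


def residualize(values, covariate):
    """Residuals from a simple least-squares fit of values on covariate (with intercept)."""
    mean_x, mean_y = fmean(covariate), fmean(values)
    sxx = sum((x - mean_x) ** 2 for x in covariate)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(covariate, values))
    slope = sxy / sxx
    return [(y - mean_y) - slope * (x - mean_x) for x, y in zip(covariate, values)]


def adjusted_group_effect(score, group, covariate):
    """Group difference in score with the covariate partialled out of both
    the score and group membership. This equals the group coefficient in a
    regression of score on group and covariate."""
    score_res = residualize(score, covariate)
    group_res = residualize(group, covariate)
    return sum(g * s for g, s in zip(group_res, score_res)) / sum(g * g for g in group_res)


# Synthetic data (hypothetical values): 0 = control, 1 = patient.
rng = random.Random(0)
n = 500
group = [0.0] * n + [1.0] * n
# Patients score one unit lower on fluid intelligence on average.
fluid = [rng.gauss(0.0, 1.0) - g for g in group]
# Task A depends on fluid intelligence only.
task_a = [0.8 * f + rng.gauss(0.0, 0.5) for f in fluid]
# Task B carries an extra patient deficit beyond fluid intelligence.
task_b = [0.8 * f - g + rng.gauss(0.0, 0.5) for f, g in zip(fluid, group)]


def raw_gap(score):
    """Unadjusted patient-minus-control difference in mean score."""
    return fmean(score[n:]) - fmean(score[:n])


raw_a, raw_b = raw_gap(task_a), raw_gap(task_b)
adj_a = adjusted_group_effect(task_a, group, fluid)
adj_b = adjusted_group_effect(task_b, group, fluid)
print(f"Task A: raw gap {raw_a:.2f}, adjusted gap {adj_a:.2f}")
print(f"Task B: raw gap {raw_b:.2f}, adjusted gap {adj_b:.2f}")
```

In this simulation both tasks show a raw group difference, but only Task B retains a sizeable difference after adjustment, which is the pattern the text describes as a selective deficit over and above a general one.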

Executive attentional control and working memory functions have been investigated in a range of psychiatric and neuropathological conditions. Drabble et al. investigated the potential role of attentional control in self-harm in people with borderline personality disorder (BPD). The findings were surprising: high attentional focusing predicted self-harm history in those with high BPD features. In contrast, good attentional switching ability reduced the likelihood of self-harm. The notion of a potential moderating effect of attentional control on negative affect and self-harm in BPD individuals constitutes a new conceptualization of the condition. A review of extant evidence indicated that the basal forebrain cholinergic system is a potential neural basis of executive attention (Villano et al.). Collectively, the data supported the notion that neuropeptide regulatory orexin neurons stimulate cholinergic frontal pathways and may provide the mechanism of executive attention, but further detailed work is needed.

Working memory capacity was also investigated in BPD individuals using an event-related potentials (ERP) paradigm. Liu et al. found that BPD patients had lower P3 amplitudes and longer N2 latencies than controls, independent of working memory load, likely indicating working memory dysfunction. Working memory abnormalities were also found in patients with mild cognitive impairment (MCI), based on distant synchronization of the background network at rest and during working memory task performance (Wang et al.). There was no significant difference between the rest and working memory states in MCI patients when compared to controls, indicating inefficient organization of the background network associated with cognitive impairment in patients. The neural locus of working memory networks was also explored in bilateral dorsolateral prefrontal cortices using a resting-state functional connectivity approach mapped to working memory task accuracy in healthy controls (Fang et al.). Using spectral dynamic causal modeling, the findings revealed functional connectivity between the dorsolateral prefrontal cortex and anterior cingulate cortex, and between the right dorsolateral prefrontal cortex and fronto-insular cortex. The connectivity of these regions governed working memory ability, and differences in resting-state effective connectivity might explain individual differences in working memory ability.

It has been suggested that speaking more than one language protects executive functions in ageing. The assumption is that inhibiting one language whilst engaged in the other confers an interference suppression advantage in bilingual individuals. However, available evidence also indicates that any executive control advantage is offset by poorer ability on language-specific tasks in bilinguals when compared to monolinguals. Kousaie et al. investigated the purported bilingual advantage on inhibitory control tasks in monolingual Anglophones and Francophones, and French/English bilingual young and older adults. Their findings did not show the expected bilingual advantage for executive task performance (except for a slight bilingual advantage on one measure), or a bilingual disadvantage on language tasks when compared to monolinguals. One possible explanation of the findings is the mediating effect of context and frequency of exposure to both languages on executive and language skills in bilinguals, indicating that the purported cognitive advantage and disadvantage warrant further investigation, particularly in relation to resilience associated with ageing.

There is an extensive literature on the role of executive function deficits in socio-cognitive behavioral problems, yet there is no overarching theoretical framework that delineates how executive deficits impact social functioning. Wood and Worthington reviewed the literature on the purported link between executive function and socio-emotional functions in healthy ageing and post-traumatic brain injury (TBI) populations. They concluded that intact executive ability is crucial for appraisal and evaluation of social stimuli, and that the distinction between cognitive and socio-emotional sequelae of TBI is no longer tenable based on current evidence. Torske et al. investigated whether social function problems were associated with specific executive impairments in children and adolescents with a diagnosis of Autism Spectrum Disorder (ASD) using parent-rated measures. Metacognitive executive functions contributed to social ability in young people with ASD, and impaired social functioning potentially reflected poor behavioral regulation.

Jasinska argued that it is time to reconsider whether inhibitory executive processes are conceptually different from response selection and execution. The evidence from neuroscience accounts of inhibitory control mechanisms indicates that response inhibition could be simultaneously classified as a control process and a prepotent response tendency. This conceptualization provides a potential new approach to treating disorders of inhibitory control. Inhibitory control was also poorer in those with chronic neuropathic or radicular pain than in controls, which may have been caused by the chronic experience of pain or by a secondary tendency towards higher anxiety and depression levels than controls (Moriarty et al.). Problem-solving executive ability was not affected by chronic pain in this cohort, indicating selective effects of pain on cognition.

The measurement of executive functions remains problematic because (i) current standardized tests are not process pure: supervisory, attentional, and control executive functions invariably operate across other lower-level functions; and (ii) there are abiding issues of sensitivity and ecological validity with current standardized tests. To address these issues, a computerized cooking task was developed to evaluate whether an analogue of real-world behavior requiring multiple executive functions reliably indexed ability in a control group when compared to standard neuropsychological measures (Doherty et al.). Task parameters distinguished executive ability from overall IQ, unlike several currently widely used measures, and difficult task levels tapped different executive and memory functions when compared to easier levels. Test analogues of real-world tasks are potential candidates for next-generation executive function measures. Tanguay et al. investigated executive function ability using a Breakfast Task, an Activities of Daily Living scale, and real cooking activity with acquired brain injury patients and matched controls. Patients had significant problems with all aspects of the Breakfast Task when compared to controls, although real cooking activity did not correlate with task performance, indicating that purported 'real-world' tasks may not isolate and capture the same functions used in everyday behavior. McGuire commented on the importance of developing ecologically valid tests of executive function for clinical assessment but also emphasized the multi-sensory context of real-world behavior, which at present cannot be captured in immersive or computerized tasks but remains a possibility for the future. Cipresso et al. measured executive function ability in Parkinson's disease patients with and without cognitive impairments and in controls on a virtual version of the Multiple Errands Test (VMET) and on standardized executive tests. Patients made more errors on the VMET than controls, and the task was more sensitive to early executive deficits than standardized executive measures. Sensitivity and reliability were also problematic on a widely used Self and Other rating scale of executive ability. McGuire et al. compared Self, Other, and clinician ratings on the dysexecutive questionnaire to investigate factor structure and inter-rater reliability of patients' ratings of deficits when compared to others' ratings of their problems. There was poor agreement between clinician and Other ratings on the measure, indicating that accurate reporting of patients' post-injury deficits was essential to maintain the reliability and usefulness of the scale. Other raters should be selected with caution when asked to rate patients' problems on the dysexecutive questionnaire. Overall, the findings presented in this research topic were promising for new ecologically valid measures of executive function but were less encouraging for current measures of executive function and their clinical utility.

Finally, new approaches to enhancing executive function ability are emerging, associated with exercise and athleticism. The neural efficiency theory hypothesizes that brain activity is attenuated in experts when compared to non-experts due to processing efficiency. When athletes and non-athletes were compared on a visuospatial executive task, athletes had faster reaction times but were not more accurate than non-athletes (Guo et al.). fMRI data showed that athletes had reduced activation in multiple frontal, temporal, and cerebellar regions during task performance when compared to non-athletes, indicating task-specific neural reorganization in experts. Young healthy males were assigned to a high-intensity, moderate-intensity, or no-exercise group, and cortisol levels, EEG activity and executive function performance were measured (Tsai et al.). Changes were seen in cortisol and P3 amplitude levels after resistance exercise, along with enhanced executive function performance. The neural bases of these changes need further investigation, but effects may be associated with physiological arousal levels. Active participants also had better verbal working memory capacity, dual-task performance, and inhibitory ability when compared to a sedentary group (Padilla et al.). The authors concluded that chronic aerobic exercise can benefit cognitive and physical health across the lifespan.

In other work, Tsai et al. investigated the effects of resistance exercise on executive function performance in healthy elderly males and controls, and measured insulin, growth hormone and homocysteine levels at baseline and after a 1-year intervention period. Performance improvements in the exercise group were associated with an increase in growth factor levels, improved P3 amplitudes (indicating better attention), and decreased serum homocysteine. Aside from the physical benefits, regular consistent exercise contributed to improved executive function and general cognitive health in a well elderly cohort.

Collectively, the papers comprising this special topic reveal that executive function research is theoretically and methodologically diverse, cross-cutting atypical, typical, and ageing populations. The neural bases of executive functions and working memory are receiving renewed interest using innovative and advanced imaging and modeling approaches. Importantly, new-generation executive tests are emerging that, whilst in the early stages, offer promise for speedy, sensitive, process-specific clinical assessment. Finally, the relationship between cognitive and physical health reveals that healthy ageing need not be a process of gradual decline, and that executive functions, like cognition generally, can be improved by enhanced physical health. Together these findings will hopefully stimulate new theoretical approaches and advances in the field.

# AUTHOR CONTRIBUTIONS

LB and NM wrote the editorial and were editors of the research topic.

# REFERENCES


Brain Injury 18:1067–1068. doi: 10.1080/02699050410001672387

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Barker and Morton. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# The longitudinal development of social and executive functions in late adolescence and early adulthood

Sophie J. Taylor\*, Lynne A. Barker, Lisa Heavey and Sue McHale

*Department of Psychology, Sociology and Politics, Sheffield Hallam University, Sheffield, UK*

Our earlier work suggests that executive functions and social cognition show protracted development into late adolescence and early adulthood (Taylor et al., 2013). However, it remains unknown whether these functions develop linearly or non-linearly, corresponding to dynamic changes to white matter density at these age ranges. Executive functions are particularly in demand during the transition to independence and autonomy associated with this age range (Ahmed and Miller, 2011). Previous research examining executive function (Romine and Reynolds, 2005) and social cognition (Dumontheil et al., 2010a) in late adolescence has utilized a cross-sectional design. The current study employed a longitudinal design with 58 participants aged 17, 18, and 19 years completing social cognition and executive function tasks, the Wechsler Abbreviated Scale of Intelligence (Wechsler, 1999), the Positive and Negative Affect Schedule (Watson et al., 1988), and the Hospital Anxiety and Depression Scale (Zigmond and Snaith, 1983) at Time 1, with follow-up testing 12–16 months later. Inhibition, rule detection, strategy generation and planning executive functions, and emotion recognition with dynamic stimuli, showed longitudinal development between time points. Self-report empathy and emotion recognition functions using visual static and auditory stimuli were stable by age 17, whereas concept formation declined between time points. The protracted development of some functions may reflect continued brain maturation into late adolescence and early adulthood, including synaptic pruning (Sowell et al., 2001) and changes to functional connectivity (Stevens et al., 2007), and/or environmental change. Clinical implications, such as assessing the effectiveness of rehabilitation following Head Injury, are discussed.

#### Keywords: adolescence, longitudinal, developmental trajectory, social cognition, executive function

#### Edited by:

*Niels Birbaumer, University of Tuebingen, Germany*

#### Reviewed by:

*Hans-Joachim Bischof, University of Bielefeld, Germany*

*Lilian Konicar, Institute for Medical Psychology and Behavioural Neurobiology, Germany*

#### \*Correspondence:

*Sophie J. Taylor, Department of Psychology, Sociology and Politics, Sheffield Hallam University, 2.05 Heart of the Campus, Collegiate Crescent, Sheffield S10 2BQ, UK; s.j.taylor@shu.ac.uk*

Received: *02 June 2015* Accepted: *31 August 2015* Published: *15 September 2015*

#### Citation:

*Taylor SJ, Barker LA, Heavey L and McHale S (2015) The longitudinal development of social and executive functions in late adolescence and early adulthood. Front. Behav. Neurosci. 9:252. doi: 10.3389/fnbeh.2015.00252*

# Introduction

Adolescence is a critical period of development with dynamic brain maturation characterized by psychological, behavioral and social change (Steinberg and Morris, 2001) indicative of the transition to autonomy and independence. Executive functions and socio-emotional development are key to adaptive functioning in this stage of development (Ahmed and Miller, 2011). Executive functions initiate, co-ordinate, maintain, and inhibit other cognitive functions (Miyake et al., 2000) and are recruited in novel or demanding situations to perform goal-directed behavior when routine behavior is inadequate. Social cognition incorporates a range of functions including emotion recognition, empathy, perspective taking, and Theory of Mind (ToM), the ability to impute a range of mental states including beliefs, desires, and intentions to self and others (Frith, 2007; Carrington and Bailey, 2009). During adolescence some cognitive functions show protracted development including updating and switching (Magar et al., 2010), verbal fluency and planning (Romine and Reynolds, 2005), emotion recognition (Thomas et al., 2007), perspective taking (Choudhury et al., 2006; Dumontheil et al., 2010a), and empathy (Mestre et al., 2009). However, these data predominantly focus on younger age ranges with less known about late adolescent and early adulthood development. Furthermore, these studies indicate linear development of cognitive functions whereas there is also contrasting evidence of non-linear development (Taylor et al., 2013).

Findings from imaging studies indicate that brain maturation is dynamic across development including both progressive (myelination) and regressive (synaptic pruning) processes (Sowell et al., 2001) with protracted development of frontal networks into late adolescence and early adulthood (Schmithorst and Yuan, 2010). The continued development of frontal networks is particularly pertinent because they are thought to play an important role in executive functions (Barker et al., 2010) and some aspects of social cognition (Carrington and Bailey, 2009) that are crucial to adaptive goal-oriented behavior. Diffusion Tensor Imaging (DTI) data show that protracted maturation of frontal networks (Schmithorst and Yuan, 2010) is associated with the executive function of strategy generation (Delis et al., 2001) in participants aged between 16.2 and 20.6 years at Time 1, with follow-up testing 16 months later (Bava et al., 2010). These findings indicate that white matter maturation in late adolescence and early adulthood leads to an improvement of executive functions and provides the neural basis of developmental change in certain aspects of cognition.

# Executive Functions

Late adolescence is characterized by linear and non-linear brain maturation that may correspond behaviorally to functions showing linear or non-linear development, for example troughs and peaks in development (Fischer and Kennedy, 1997). Behavioral studies provide evidence of linear and non-linear executive function development in late adolescence. In a meta-analysis of cross-sectional executive function studies, Romine and Reynolds (2005) reported that executive functions show divergent developmental trajectories with planning and verbal fluency continuing to develop linearly between late adolescence and early adulthood. Magar et al. (2010) provided further support for linear executive function development with updating, assessed with the n-back task (Cohen et al., 1997), and switching, assessed with the number-letter switching task (Rogers and Monsell, 1995), improving between ages 11 and 17.

There is also some evidence of non-linear development with poorer performance on executive function tasks in late adolescence compared to early/middle adolescence, possibly due to neural re-organization (Uhlhaas et al., 2009; Taylor et al., 2013). The use of broad age ranges in previous studies may decrease sensitivity (De Luca et al., 2003) and mask non-linear development due to the short time-frame when frontal pathways undergo steep maturational change around ages 17–25 (Barker et al., 2010). To address this issue and measure executive function ability across later development, we (Taylor et al., 2013) employed a design with fine-grained age groups (17 years 0 months–17 years 8 months, 18 years 0 months–18 years 8 months, and 19 years 0 months–19 years 8 months) and found non-linear executive function development for strategy generation and concept formation, assessed with D-KEFS Letter Fluency and Sorting Tests (Delis et al., 2001). Seventeen year olds scored significantly higher, indicating better performance, than 18 year olds on strategy generation and four indices of concept formation (number of correct free sorts, free sort description score, sort recognition description score, and description score for perceptual sorts). Seventeen year olds also scored significantly higher, indicating more accurate concept formation, than 19 year olds. These findings indicate non-linear executive function development likely reflecting dynamic brain maturation (Lebel et al., 2008; Uhlhaas et al., 2009). Similarly, Dumontheil et al. (2010b) reported non-linear development on a relational reasoning task requiring inhibition and cognitive flexibility (Diamond, 2013) with a dip in accuracy in middle adolescence. Overall these findings indicate that executive functions show linear and non-linear development during adolescence and early adulthood corresponding to linear and non-linear morphological brain changes. Previous studies are limited by their cross-sectional design, so there is a need for longitudinal data to better inform knowledge of cognitive development in late adolescence and early adulthood.

# Social Cognition

Imaging studies have consistently implicated a mentalizing network comprised of the medial prefrontal networks, superior temporal sulci, and temporal poles in social cognition task performance (Carrington and Bailey, 2009). Behavioral studies report ongoing social cognition development between adolescence and early adulthood. Vetter et al. (2013) found adolescents aged 12–15 years scored significantly lower than young adults aged 18–22 years on the Story Comprehension Test (Channon and Crawford, 2000), a measure of ToM, and the German version of the Reading the Mind in the Eyes Test (Bölte, 2005), a measure of visual emotion recognition. The development of social cognition was independent of more basic cognitive abilities such as working memory, speed of processing, and verbal ability (Vetter et al., 2013) providing evidence for social cognition being domain specific (Apperly et al., 2005). However, a limitation of the Eyes Test is the use of static stimuli (Baron-Cohen et al., 2001) because they do not fully capture the dynamic and transitory nature of mental states in real life social situations (Vetter et al., 2013). In a longitudinal study, Davis and Franzoi (1991) assessed participants aged 15 and 16 years at 1-year intervals over three consecutive years on the Interpersonal Reactivity Index (IRI; Davis, 1983), a self-report measure of empathy. Perspective Taking, the tendency to consider another person's point of view, and Empathic Concern, the tendency to experience compassion and sympathy toward others, significantly increased, whereas ratings of Personal Distress, the tendency to experience uneasiness in tense social situations, significantly decreased between time points. In contrast, Taylor et al.'s (2013) study showed no group differences in 17, 18, and 19 year olds on social cognition tasks. 
Previous studies have assessed a narrow range of social cognition so a comprehensive assessment was included in the present study including emotion recognition in visual static stimuli (Reading the Mind in the Eyes Test; Baron-Cohen et al., 2001), auditory stimuli (Reading the Mind in the Voices Test; Golan et al., 2007), dynamic visual and auditory stimuli (Movie for the Assessment of Social Cognition; MASC; Dziobek et al., 2006), and self-report empathy (IRI; Davis, 1983). It is possible that as social cognitive functions are associated with multiple networks (Wolf et al., 2010) that mature earlier than frontal networks, social cognitive functions may be more resistant to change across later development compared to executive functions. Overall, these results highlight the multidimensional nature of social cognition, with different aspects of social cognition showing different developmental trajectories in late adolescence.

To summarize, there is evidence of linear (Romine and Reynolds, 2005; Magar et al., 2010; Vetter et al., 2013) and non-linear development (Dumontheil et al., 2010b; Taylor et al., 2013) of executive functions and social cognition during adolescence. The majority of previous studies examining executive function (Kalkut et al., 2009; Magar et al., 2010) and social cognition (Tonks et al., 2007; Dumontheil et al., 2010a,b; Vetter et al., 2013) in late adolescence and early adulthood have utilized a cross-sectional design. This type of design is easier and less time consuming to conduct compared to longitudinal designs (Moriguchi and Hiraki, 2011), although no data on developmental change are collected (Kraemer et al., 2000). The aim of the present study was to investigate the developmental trajectory of executive and social cognitive functions using a longitudinal design with 17 (Younger age group), 18 (Middle age group), and 19 year olds (Older age group) at Time 1 and follow-up testing 12–16 months later. Previous research has recommended a longitudinal design to identify whether abilities improve or decline (are linear or non-linear) over time (Romine and Reynolds, 2005; Kalkut et al., 2009). We predicted that executive functions of strategy generation, planning, inhibition, and rule detection would improve and concept formation would decline, based on previous findings, whereas social cognition would be relatively stable between time points.

# Method

All participants gave written informed consent and parental consent was gained for 17 year olds. This research received approval from the Sheffield Hallam University Ethics Committee. A time frame of 12–16 months between testing sessions enabled the identification of any subtle linear and non-linear changes. A minimum 12-month interval between testing sessions conforms to neuropsychological assessment procedure for repeat testing (Lezak et al., 2004) and minimizes memory contributing to practice effects (Hausknecht et al., 2007). Head Injury and Autism Spectrum Disorders were exclusion criteria because of their influence on executive function and social cognition (Dziobek et al., 2006; Robinson et al., 2009; Muller et al., 2010). Participants were recruited from local schools, colleges, youth organizations, and university.

# Participants

Fifty-eight participants took part at both time points. Ages of participants in Younger, Middle, and Older groups are presented in **Table 1**.

Participants were asked to report their current education and any changes to living arrangements and friendship groups in the previous 12 months. Seventeen year olds were studying for AS Levels (47%), A2 Levels (47%), and BTEC (Business and Technology Education Council; 6%). Eighteen year olds were studying for A2 Levels (11%), BTEC (11%), and a degree (78%), and all 19 year olds were university students.

Changes to living arrangements were highest for 18 and 19 year olds (72 and 74%, respectively) compared to 17 year olds (16%). A higher percentage of 19 year olds (67%) reported making new friends relative to 17 and 18 year olds (37 and 33%, respectively). These data indicate that 18 and 19 year olds had undergone greater change in their living and social environment compared to 17 year olds.

# Procedure

Participants first completed the Positive and Negative Affect Schedule (Watson et al., 1988) to assess mood state and the Hospital Anxiety and Depression Scale (Zigmond and Snaith, 1983) self-report measures to assess anxiety and depression. Participants then completed the Wechsler Abbreviated Scale of Intelligence (Wechsler, 1999) followed by executive function and social cognition tasks which were counterbalanced across testing sessions lasting approximately 3 h. Rest breaks were participant determined. Alternate versions of the D-KEFS Letter Fluency and Sorting Tests were used to ameliorate any testing effects.

# Executive Function Measures

The executive function battery comprised the D-KEFS (Delis et al., 2001) Letter Fluency Test measure of strategy generation, the Sorting Test measure of concept formation, and Tower Test measure of planning. The Hayling and Brixton Tests (Burgess and Shallice, 1997) provided measures of inhibition and rule detection. The D-KEFS Letter Fluency and Tower Tests were selected because Romine and Reynolds (2005) reported that strategy generation and planning continue to develop between 17 and 22 years. Romine and Reynolds (2005) suggested that future research investigating the development of executive functions should use alternative measures. The D-KEFS Sorting Test was selected as an alternative to the Wisconsin Card Sorting Test (WCST; Heaton et al., 1993) to assess concept formation because there are 16 sorting rules in the D-KEFS version, compared to only 3 in the WCST, increasing task sensitivity and minimizing ceiling effects (Delis et al., 2001). The Hayling Test was selected to assess inhibition because lack of inhibition has been attributed to increased risk taking in this age range (Luna and Sweeney, 2004). The Brixton Test was included to assess rule detection in a spatial format to further explore how this function develops during late adolescence and early adulthood.

TABLE 1 | Means and standard deviations of age for Younger, Middle, and Older groups at Time 1 and Time 2.

# Social Cognition Measures

Previous studies often focus on one area of social cognition (Vetter et al., 2013) such as empathy (Davis and Franzoi, 1991) or perspective taking (Choudhury et al., 2006; Dumontheil et al., 2010a). The present study assessed various aspects of social cognition using different formats e.g., visual static (Reading the Mind in the Eyes Test; Baron-Cohen et al., 2001), auditory (Reading the Mind in the Voices Test; Golan et al., 2007), dynamic (MASC; Dziobek et al., 2006), and self-report empathy (IRI; Davis, 1983). Tager-Flusberg (2001) conceptualized social cognition as consisting of social-perceptual and social-cognitive processes. The selected tasks support this conceptual framework with the Reading the Mind in the Eyes Test and Reading the Mind in the Voices Test providing measures of social-perceptual processes, whereas the MASC assessed both social-perceptual and social-cognitive processes. The selected tasks support the conceptualization of social cognition as involving processes for understanding others (Eyes Test, Voices Test, and MASC) and understanding the self by including a self-report empathy measure (Beer and Ochsner, 2006).

**Table 2** summarizes the executive function and social cognition tasks included in the study.

# Data Analyses

Data were assessed for normal parametric assumptions. Mixed ANOVAs were conducted on IQ, mood, executive function, and social cognition task scores with a between-group factor of age group at Time 1 and a within-subjects factor of time (Time 1 and Time 2). Younger, Middle and Older groups refer to participants who were originally in the 17, 18, and 19 year old groups at Time 1. Raw scores were analyzed, with the exception of Hayling and Brixton Tests, for ease of comparison across tests because some measures do not have standardized score equivalents. Scaled scores were analyzed for Hayling and Brixton Tests because these are reported extensively in the literature.
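As a reading aid for the statistics reported in the Results, the following minimal sketch works out the degrees of freedom implied by a 3 (age group) × 2 (time) mixed design. The group sizes (19, 18, and 21) are an assumption inferred from the degrees of freedom of the paired t-tests reported in the Results, not figures stated in this section, and the function name `mixed_anova_df` is hypothetical.

```python
# Degrees of freedom for a mixed ANOVA with one between-subjects factor
# (age group) and one within-subjects factor (time).
# ASSUMPTION: group sizes are inferred from the paired t-test df reported
# in the Results (t(18), t(17), t(20)); they are not stated in the text.

def mixed_anova_df(group_sizes, n_time_points):
    """Return (effect df, error df) for each effect in the mixed design."""
    n_total = sum(group_sizes)
    n_groups = len(group_sizes)
    between_error = n_total - n_groups                 # subjects within groups
    within_error = between_error * (n_time_points - 1)
    return {
        "group": (n_groups - 1, between_error),
        "time": (n_time_points - 1, within_error),
        "group_x_time": ((n_groups - 1) * (n_time_points - 1), within_error),
    }

group_sizes = {"Younger": 19, "Middle": 18, "Older": 21}
df = mixed_anova_df(list(group_sizes.values()), n_time_points=2)

# Follow-up paired-samples t-tests within each group have n - 1 df.
paired_df = {name: n - 1 for name, n in group_sizes.items()}

print(df)
print(paired_df)
```

Under these assumptions the main effect of time has 1 and 55 degrees of freedom and the main effect of group has 2 and 55, which is the pattern seen in most of the F statistics reported for the full sample of 58.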

# Results

# Participant IQ and Mood Data

Descriptive statistics for Verbal, Performance and Full Scale IQ, and mood data are presented in **Table 3** followed by mixed ANOVAs with age group (Younger, Middle, and Older) as the between group factor and Time 1 and Time 2 as the within group factor.



TABLE 3 | Means and standard deviations for WASI Verbal IQ, Performance IQ, and Full Scale IQ in Younger, Middle, and Older groups at Time 1 and Time 2.

*Key:* ↑ *represents significantly better performance at Time 2 relative to Time 1 and* ≈ *represents no significant change in task scores between Time 1 and Time 2.*

# IQ

Participants varied between Time 1 and Time 2 by −18 to +20 on Verbal IQ, −8 to +25 on Performance IQ and −8 to +18 on Full Scale IQ, supporting other reports of variation in IQ during adolescence (Ramsden et al., 2011). All group means fell within the Average range indicating no shift in Verbal IQ category across groups. There was a significant main effect of time [F(1, 55) = 5.95, p = 0.018] for Verbal IQ score with the Younger [t(18) = 2.69, p = 0.015] and Middle [t(17) = 2.74, p = 0.014] groups scoring significantly higher on Verbal IQ at Time 2 compared to Time 1. There was no change for the Older group suggesting that Verbal IQ levels may have stabilized by age 19.

For Performance IQ score there was a significant main effect of time [F(1, 55) = 100.25, p < 0.001] with Younger [t(18) = 5.80, p < 0.001], Middle [t(17) = 3.71, p = 0.002], and Older groups [t(20) = 9.09, p < 0.001] scoring significantly higher at Time 2 compared to Time 1 on Performance IQ. The mean Performance IQ scores for the Younger and Older groups shifted from Average IQ category at Time 1 to High Average at Time 2.

For Full Scale IQ score there was a significant effect of time [F(1, 55) = 61.75, p < 0.001] with Younger [t(18) = 5.97, p < 0.001], Middle [t(17) = 4.34, p < 0.001] and Older groups [t(20) = 3.09, p = 0.006] attaining a significantly higher IQ score at Time 2 relative to Time 1. The mean Full Scale IQ score for the Younger group changed from an Average IQ category at Time 1 to High Average at Time 2. A regression was conducted with Performance IQ change score (Time 2 minus Time 1 score) as a predictor variable and Full Scale IQ change score as the dependent variable to examine how much of the increase in Full Scale IQ was accounted for by improved Performance IQ. This resulted in a significant model [F(1, 56) = 24.35, p < 0.001] that accounted for 29% of variance (Adjusted R<sup>2</sup> = 0.29) in Full Scale IQ change scores (β = 0.55, t = 4.94, p < 0.001). Overall, IQ findings indicate linear developmental change in Verbal IQ in the Younger and Middle groups and a linear Performance IQ increase across all age groups. This suggests that Performance IQ remains dynamic up to age 20 years, which may reflect improved motor skills due to more efficient white matter pathways.
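The regression statistics above can be cross-checked against one another using standard identities for a simple (one-predictor) regression. The sketch below is illustrative only: it uses the values reported in the text, and the helper names `r2_from_f` and `adjusted_r2` are hypothetical.

```python
# Consistency check for a simple linear regression, using only the
# statistics reported in the text (F, its degrees of freedom, beta, t).

def r2_from_f(f_value, df_model, df_error):
    """Recover R^2 from the overall F statistic of a regression model."""
    return (f_value * df_model) / (f_value * df_model + df_error)

def adjusted_r2(r2, n_obs, n_predictors):
    """Adjusted R^2 penalises R^2 for the number of predictors."""
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)

f_value, df_model, df_error = 24.35, 1, 56   # reported F(1, 56) = 24.35
beta, t_value = 0.55, 4.94                   # reported standardized beta and t

n_obs = df_model + df_error + 1              # observations implied by the df
r2 = r2_from_f(f_value, df_model, df_error)
adj_r2 = adjusted_r2(r2, n_obs, df_model)

print("n =", n_obs)
print("R^2 from F =", round(r2, 3))
print("adjusted R^2 =", round(adj_r2, 3))
# With a single predictor, F equals t squared and R^2 equals beta squared,
# up to rounding in the reported values.
print("t^2 =", round(t_value ** 2, 2), "beta^2 =", round(beta ** 2, 3))
```

The implied sample size matches the 58 participants, and the adjusted value recovered from F rounds to the reported 0.29.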

# Mood

There was no significant main effect of time for Positive Affect scores [F(1, 55) = 0.07, p = 0.788] or Negative Affect scores [F(1, 55) = 1.40, p = 0.241] of the PANAS (Watson et al., 1988) indicating that mood state was relatively stable across time points for all age ranges. There was no significant main effect of time on Anxiety scores [F(1, 49) = 1.53, p = 0.222] or Depression scores [F(1, 49) = 0.11, p = 0.737] from the HADS (Zigmond and Snaith, 1983) for all age ranges indicating that changes in mood did not account for change to other cognitive variables.

# Executive Function Measures

Descriptive statistics for executive function task scores are presented in **Tables 4** and **5**.

# Response Inhibition and Rule Detection (Hayling and Brixton Tests)

ANOVA results showed a significant main effect of time on the Hayling Test scores [F(1, 55) = 20.65, p < 0.001]. Results of paired samples t-tests showed the Middle [t(17) = 3.22, p = 0.005] and Older groups [t(20) = 3.01, p = 0.007] performed better at Time 2, indicating better inhibition, compared to Time 1. There were no other effects. A significant main effect of time was evident on Brixton Test scores [F(1, 55) = 28.54, p < 0.001] indicating developmental change. Middle [t(17) = 3.56, p = 0.002] and Older age groups [t(20) = 4.36, p < 0.001] scored significantly higher at Time 2 compared to Time 1, indicating better rule detection and linear development in these age groups, whereas for the Younger group these functions remained stable. This suggests ongoing change to these functions may occur later than for Full Scale IQ scores, corresponding well to morphology data and steep maturational peaks at a later age.

# Strategy Generation (D-KEFS Letter Fluency Test)

A significant main effect of time was found on the Letter Fluency Test indicating developmental change between time points [F(1, 55) = 9.25, p = 0.004]. The Younger group scored significantly higher at Time 2 compared to Time 1, indicating better strategy generation and linear development [t(18) = 2.19, p = 0.042]. A significant main effect of age group showed that the Younger group scored significantly higher than the Middle age group [t(35) = 2.50, p = 0.017], indicating non-linear development and supporting previous findings (Taylor et al., 2013). The Older group showed no developmental change between time points indicating that strategy generation matures earlier than other executive functions assessed here and is stable by age 18.

TABLE 4 | Means and standard deviations for Younger, Middle, and Older age groups at Time 1 and Time 2 on executive function tasks of inhibition, rule detection, strategy generation, and concept formation.

*Key:* ↑ *represents significantly better performance at Time 2 relative to Time 1,* ↓ *represents significantly poorer performance at Time 2 compared to Time 1, and* ≈ *represents no significant change in task scores between Time 1 and Time 2.*

TABLE 5 | Means and standard deviations for Younger, Middle, and Older age groups at Time 1 and Time 2 on an executive function task of planning.

*Key:* ↑ *represents significantly better performance at Time 2 relative to Time 1 and* ≈ *represents no significant change in task scores between Time 1 and Time 2.*

# Concept Formation (D-KEFS Sorting Test)

A significant main effect of time was found on free sort description score [F(1, 55) = 9.91, p = 0.003], a measure of concept formation, with the Younger group scoring significantly lower at Time 2 relative to Time 1, indicating poorer concept formation and indicative of non-linear development of this function [t(18) = 3.68, p = 0.002]. The Middle and Older groups showed no developmental change between time points indicating that concept formation, assessed with free sort description score, stabilizes by age 18. No other effects were evident.

Developmental change was evident on the sort recognition description score between time points [F(1, 55) = 21.11, p < 0.001] with the Younger group scoring lower at Time 2 following a non-linear pattern and indicating poorer concept formation compared to Time 1 [t(18) = 4.73, p < 0.001]. Similarly, the Older group scored significantly lower on sort recognition description score at Time 2 compared to Time 1 [t(20) = 3.15, p = 0.005]. There were no other effects.

Description score for perceptual sorts showed developmental change [F(1, 55) = 62.96, p < 0.001] with the Younger [t(15) = 7.51, p < 0.001], Middle [t(13) = 4.49, p = 0.001], and Older groups [t(20) = 3.71, p = 0.001] scoring significantly lower at Time 2 compared to Time 1, indicating poorer performance and non-linear development. There were no other effects. To summarize, results of analyses indicated developmental change on description score for free sorts, sort recognition and perceptual sorts. These require several executive functions including concept formation, the ability to group cards into categories reflecting a common feature, cognitive flexibility to search for new sorts, and inhibition of repeated sorts (Delis et al., 2001). Overall findings indicate that particular aspects of concept formation are less stable at these age ranges than other executive functions.

#### Planning (D-KEFS Tower Test)

For number of towers completed there was a significant effect of time indicating developmental change in planning ability [F(1, 55) = 12.09, p = 0.001]. The Middle age group completed significantly more towers at Time 2 relative to Time 1, indicating better planning and linear development [t(17) = 3.34, p = 0.004]. There were no other effects. This suggests a potential spurt in this ability between ages 18 and 19 years that is not seen at younger or older ages.

There was also a significant effect of time on Tower achievement score [F(1, 55) = 6.28, p = 0.015]. The Older group attained a significantly higher achievement score at Time 2 compared to Time 1 [t(20) = 2.16, p = 0.043] indicating linear functional development, but there were no other effects. Achievement score takes into account whether towers are completed and the number of moves, indicating that the Older group employed a better planning strategy at Time 2 relative to Time 1.

There was also an effect of time on mean first move on the Tower Test [F(1, 55) = 18.74, p < 0.001] with the Younger group significantly quicker on first move at Time 2 relative to Time 1 [t(18) = 2.47, p = 0.024] indicating linear development. A similar pattern was found in the Older group with a significantly shorter mean first move time at Time 2 compared to Time 1 [t(20) = 3.73, p = 0.001]. No developmental change was found in the Middle group [t(17) = 1.73, p = 0.102] indicating mean first move time may improve between ages 17 and 18, stabilize between ages 18 and 19, followed by further improvement. There was a significant effect of group for mean first move time [F(2, 55) = 3.25, p = 0.046] that was investigated further with post-hoc t-tests. The Younger group scored significantly lower, showing a faster mean first move time than the Older group [t(31.73) = 2.37, p = 0.024], indicating non-linear development, with no other group differences evident.

Additionally, there was a significant effect of time on mean time per move scores on the Tower task [F(1, 55) = 78.06, p < 0.001]. Younger [t(18) = 4.74, p < 0.001], Middle [t(17) = 5.32, p < 0.001], and Older groups [t(20) = 5.45, p < 0.001] showed significantly shorter time per move at the second time point compared to Time 1 indicating linear development of this index of planning.

#### Social Cognition Measures

Descriptive statistics for social cognition task scores are presented in **Table 6**.

# Emotion Recognition in Visual Static and Auditory Stimuli (Reading the Mind in the Eyes and Voices Tests)

There was no significant effect of time [F(1, 55) = 0.01, p = 0.915] or group [F(2, 55) = 1.75, p = 0.183] on the Reading the Mind in the Eyes Test. Similarly, there was no effect of time [F(1, 55) = 0.57, p = 0.454] or group [F(2, 55) = 1.03, p = 0.362] on the Reading the Mind in the Voices Test, indicating that emotion recognition in visual static and auditory stimuli shows no developmental change beyond age 17.

# Dynamic Visual and Auditory Stimuli with Social Interaction (Movie for the Assessment of Social Cognition)

There was a significant effect of time on total MASC score indicating developmental change [F(1, 55) = 5.29, p = 0.025], with Middle [t(17) = 2.22, p = 0.041] and Older [t(20) = 3.20, p = 0.005] groups scoring significantly higher at Time 2, indicating better social cognition, relative to Time 1, following a linear direction. Similarly, there was an effect of time on MASC excessive mental state inference errors [F(1, 55) = 9.73, p = 0.003] with the Middle [t(17) = 2.38, p = 0.029] and Older groups [t(20) = 2.36, p = 0.029] making significantly fewer errors at Time 2 compared to Time 1 indicating linear improvements and a reduction in over-attribution of mental state content. Finally, there was no effect of time on MASC insufficient mental state inference errors [F(1, 55) = 1.15, p = 0.288] and MASC no Theory of Mind errors [F(1, 55) = 0.13, p = 0.718] with no other effects. The finding of Middle and Older groups scoring higher at Time 2 due to fewer excessive mental state inference errors may indicate that social cognition develops between ages 18 and 20 years when assessed with naturalistic, dynamic and auditory stimuli.

TABLE 6 | Means and standard deviations for Younger, Middle, and Older age groups at Time 1 and Time 2 on social cognition tasks.

*Key:* ↑ *represents significantly better performance at Time 2 relative to Time 1,* ↓ *represents significantly poorer performance at Time 2 compared to Time 1, and* ≈ *represents no significant change in task scores between Time 1 and Time 2.*

# Self-report Empathy (Interpersonal Reactivity Index)

There was no effect of time on the IRI Fantasy [F(1, 55) = 0.25, p = 0.618], Perspective Taking [F(1, 55) = 0.06, p = 0.810], Empathic Concern [F(1, 55) < 0.01, p = 0.924], or Personal Distress [F(1, 55) = 2.52, p = 0.118] scales. These findings indicate that self-report empathy is relatively stable by age 17 years.

# Gender Comparisons

Age groups at Time 1 were collapsed and Mann-Whitney U tests were conducted to examine possible gender differences between females (n = 47) and males (n = 11). Results showed a significant difference on two indices of concept formation. Males (Mdn = 48.0, range = 18.0) scored higher than females (Mdn = 40.0, range = 42.0) on free sorts description score (U = 158.50, z = 1.99, p = 0.047), requiring participants to sort and describe cards. Similarly, males (Mdn = 64.0, range = 18.0) scored higher than females (Mdn = 59.0, range = 46.0) on description score for perceptual sorts (U = 157.00, z = 2.02, p = 0.044), requiring participants to describe sorts based on visuo-spatial features of cards. There were no other gender group differences on executive function indices (all other ps > 0.08).

There were gender group differences on self-report empathy indices of Empathic Concern, sympathetic feelings toward other people's misfortune, and Personal Distress, feelings of apprehension in stressful situations. Females (Mdn = 21.0, range = 12.0) scored higher than males (Mdn = 19.0, range = 9.0) on Empathic Concern (U = 133.00, z = 2.50, p = 0.012). Similarly, females (Mdn = 14.0, range = 22.0) scored higher than males (Mdn = 10.0, range = 12.0) on Personal Distress (U = 101.00, z = 3.13, p = 0.002). There were no other gender group differences on social cognition tasks (all other ps > 0.05). Overall gender analyses indicate that males outperformed females on two indices of concept formation and females outperformed males on two indices of self-report empathy.
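For readers who wish to relate the reported U values to their z scores, the following minimal sketch applies the standard large-sample normal approximation to the Mann-Whitney U statistic. It assumes no correction for tied scores and no continuity correction, so its output is expected to be slightly smaller in magnitude than the tie-corrected values a statistics package would report; the function names are hypothetical.

```python
import math

def mann_whitney_z(u, n1, n2):
    """Normal approximation to U (no tie or continuity correction)."""
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (u - mean_u) / sd_u

def two_tailed_p(z):
    """Two-tailed p value for a standard normal z score."""
    return math.erfc(abs(z) / math.sqrt(2))

n_females, n_males = 47, 11  # group sizes reported in the text

reported_u = {
    "free sorts description score": 158.5,
    "description score for perceptual sorts": 157.0,
    "Empathic Concern": 133.0,
    "Personal Distress": 101.0,
}

for measure, u in reported_u.items():
    z = mann_whitney_z(u, n_females, n_males)
    print(f"{measure}: |z| = {abs(z):.2f}, p = {two_tailed_p(z):.3f}")
```

The sign of z depends only on which group's U is entered, so the sketch prints its absolute value, as the text does.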

Overall, results of longitudinal analyses indicate that executive functions and social cognition follow divergent trajectories. Strategy generation (Letter Fluency Test) improved between ages 17 and 18 followed by no developmental change, whereas inhibition (Hayling Test) and rule detection (Brixton Test) showed later improvement between ages 18 and 20 years. Concept formation (Sorting Test) was less stable than other executive functions with some indices showing non-linear development between time points. Planning (Tower Test) showed evidence of improvements between time points continuing into early adulthood with achievement score, mean first move time and time per move developing between ages 19 and 20. Emotion recognition with static visual stimuli (Eyes Test) and auditory stimuli (Voices Test) and self-report empathy (IRI) showed no development beyond age 17. Social cognition assessed with dynamic stimuli (MASC) showed improvements into early adulthood between ages 18 and 20 years.

# Discussion

The present study extends previous executive function and social cognition research by employing a longitudinal design across peak maturational periods of brain development with narrow age ranges allowing developmental changes to be identified. Participants aged 17, 18, and 19 years at Time 1 completed IQ, executive function, and social cognition tasks at Time 1 and again 12–16 months later (interval between testing M = 14.81 months, SD = 4.01). We predicted that executive functions of strategy generation, planning, inhibition, and rule detection would improve and concept formation would decline, whereas social cognition would be relatively stable between time points. Results supported the hypotheses with strategy generation improving between ages 17 and 18 years and inhibition and rule detection developing between ages 18 and 20 years. Improvements in planning were evident across age groups on several indices (towers completed improved between ages 18 and 19 years, achievement score improved between ages 19 and 20 years, mean first move time reduced between ages 17 and 18 years and between ages 19 and 20 years, and time per move reduced between time points for all age groups). The hypothesis of concept formation declining was supported by description scores for free sorts, sort recognition, and perceptual sorts declining between time points, indicating non-linear development. The hypothesis of social cognition being relatively stable was partially supported with no development of emotion recognition in visual static and auditory stimuli and self-report empathy beyond age 17 years. Social cognition with dynamic stimuli showed functional improvement between ages 18 and 20 years. Overall these findings indicate that socio-cognitive and executive functions follow divergent developmental trajectories corresponding to divergent brain change based on neural topography. The finding of functions showing improvement or decline at specific ages would not have been captured with broader age ranges as used in other studies. Thus the longitudinal design with fine-grained age groups provided more specific detail about functional developmental change at these ages.

The protracted development of functions into late adolescence and early adulthood may reflect ongoing brain maturation although that is not measured here. Middle and Older age groups scored significantly higher on the Hayling Test at Time 2 compared to Time 1, indicating better inhibition. Section two of the Hayling Test requires inhibition of prepotent responses associated with activation of dorsolateral prefrontal networks (Nathaniel-James and Frith, 2002). Loss of gray matter (via pruning of obsolete cell bodies) commences in dorsolateral prefrontal networks in late adolescence (Gogtay et al., 2004), so the development of cognitive inhibition may reflect synaptic pruning resulting in more efficient neural networks (Sowell et al., 2001). In the present study, the Younger group scored significantly higher at Time 2 on the Letter Fluency Task, indicating better strategy generation, compared to Time 1. There was no developmental change on this measure in the Middle and Older groups indicating that strategy generation stabilizes by age 18. Improved strategy generation in the Younger group may reflect white matter maturation in the Posterior Limb of the Internal Capsule (Bava et al., 2010) specific to this age due to mean diffusivity, an index of white matter integrity, reaching 90% maturation in this brain region by age 18 (Lebel et al., 2008), a similar age to the Younger group at Time 2. All age groups achieved a faster time per move at Time 2 relative to Time 1 on the Tower Test measure of planning. The faster time per move could be explained by ongoing axonal myelination into early adulthood increasing transmission speed (Sowell et al., 2001). Planning tasks require widespread neural networks including frontal, parietal and premotor areas (Wagner et al., 2006), and rapid integration of different neural regions. 
Greater functional connectivity between these areas could result in more efficient, accurate, and automatic processing (Stevens et al., 2007) evidenced by improved planning indices at Time 2. The finding of divergent executive function developmental trajectories supports the notion of a fractionated executive function system (Miyake et al., 2000).

Present findings showed developmental change on description scores for free sorts, sort recognition and perceptual sorts on the D-KEFS Sorting Test measures of concept formation. Successful performance on these tasks requires participants to consider verbal and perceptual information on sorting cards and to form two groups with common attributes whilst concurrently inhibiting previous sorts. According to the manual, higher marks are awarded for more abstract (e.g., warm things and cool things) compared to concrete descriptions (e.g., "you like these on a cold day" and "you like these on a hot day"). Description scores for perceptual sorts were significantly lower at Time 2 compared to Time 1 across all age groups, indicating poorer performance and non-linear development. This description score is an index of participants' descriptions of visuo-spatial features of the cards (e.g., concave shape vs. convex shape). In addition to the executive functions of concept formation, cognitive flexibility and inhibition, non-executive functions are also measured by the Sorting Test, such as perceiving visual features of the cards, use of language and memory. This is a potential problem often highlighted in standardized executive function measures relating to task impurity (Burgess, 1997) because non-executive functions such as language, memory, and visuo-spatial processing are also measured in executive function tasks, since these higher-level functions operate across and integrate other lower-level functions (Gioia and Isquith, 2004). The present study showed that the Younger group scored significantly lower at Time 2 compared to Time 1 on free sort description score on the D-KEFS Sorting Test, supporting the notion of non-linear development in concept formation (Kalkut et al., 2009; Taylor et al., 2013).
This is an example of a transitory destabilization of functions during late adolescence / early adulthood due to functional network reorganization (Uhlhaas et al., 2009).

Our findings showed that performance on the Eyes and Voices Tests was not different across time points, indicating that emotion recognition of visual static and auditory stimuli is relatively stable across late adolescence and early adulthood, supporting previous cross-sectional findings (Taylor et al., 2013). At Time 2, the Middle and Older groups scored significantly higher on the MASC due to fewer excessive mental state inference errors compared to Time 1, indicating that social cognition may develop linearly in late adolescence / early adulthood when assessed with naturalistic, dynamic stimuli. The Eyes and Voices Tests assess social-perceptual aspects of social cognition (Tager-Flusberg, 2001) requiring understanding and interpretation of information from faces, voices, and body posture and mental state attribution. In addition to social-perceptual processes, the MASC is considered to assess social-cognitive processes, the use of information over time and events in the attribution of mental states. The present findings indicate that social-cognitive processes show more protracted development compared to social-perceptual processes, supporting the notion of social-perceptual and social-cognitive components showing different developmental trajectories (Tager-Flusberg, 2001). A possible explanation for the development in MASC scores across time points is the decrease in functional connectivity between adolescence and early adulthood (Burnett and Blakemore, 2009) that could reflect synaptic pruning (Boersma et al., 2011) of unused connections and strengthening of frequently used synapses, resulting in more efficient networks, with a developmental shift from diffuse, extensive activation to focal activation (Durston and Casey, 2006).
Imaging studies indicate that performance on the MASC is associated with diverse neural networks including occipito-parieto-temporal, temporal and prefrontal networks (Wolf et al., 2010) whereas performance on the Eyes Test in adulthood is associated with activity in the posterior temporal sulcus and inferior frontal gyrus (Moor et al., 2012). Dynamic stimuli are associated with more widespread activation than static stimuli (Trautmann et al., 2009) so it is possible that improvements on the MASC were due to the development of more efficient neural networks and myelination resulting in improved neural transmission (Sowell et al., 2001) between widespread regions.

Present findings of Verbal IQ developing between ages 17 and 19 years and Performance IQ developing between ages 17 and 20 years support the notion that IQ continues to develop into late adolescence and early adulthood (Wechsler, 1981; Ramsden et al., 2011). Verbal IQ means were within the average range so Verbal IQ change cannot account for any other developmental change on executive function and social cognition tasks. The decline in free sort description score, a measure of concept formation, between ages 17 and 18 years contrasts with developments in Verbal IQ, indicating that concept formation shows a different developmental trajectory to IQ. All groups scored significantly higher at Time 2 compared to Time 1 on Performance IQ, possibly reflecting an improvement in speed of processing due to increased neural transmission and white matter integrity (Sowell et al., 2001).

One issue with longitudinal research is practice effects, better performance on tests due to previous completion and becoming accustomed to the study in general (Jønsson et al., 2006). All groups had a significantly faster mean time per move at Time 2 relative to Time 1 on the Tower Test measure of planning possibly due to participants having completed the task before and already having a strategy to complete the towers. However, practice effects were reduced in the present study by giving participants no feedback about whether answers were correct. An interval of a year between testing minimized memory contributing to practice effects (Hausknecht et al., 2007) and alternative forms of the Letter and Sorting Tests were used at Time 1 and Time 2.

The present study extends previous research by employing a longitudinal design to identify whether abilities improve, decline or stabilize over time (De Luca et al., 2003; Romine and Reynolds, 2005; Waber et al., 2007). It is important to understand the developmental trajectory of functions and whether they show linear or non-linear development. It is of note that the cross sectional data with 17, 18, and 19 year olds (Taylor et al., 2013) and longitudinal analyses are not consistent. For example, no cross sectional group differences were evident on the MASC whereas longitudinal analyses showed that the Middle and Older groups scored significantly higher at Time 2, indicating better social cognition, compared to Time 1. Longitudinal and cross sectional findings are sometimes not consistent because cross sectional analyses show inter-individual (group) differences whereas longitudinal analyses show intra-individual change (Schaie, 2005). Longitudinal analyses may be considered more reliable because in the cross sectional study participants reported considerable changes to living arrangements and friendship groups (Taylor et al., 2013) and Schaie (2005) suggested cross sectional age group comparisons are only appropriate in a stable environment.
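The distinction between inter-individual (group) differences and intra-individual change can be illustrated with a small sketch. The scores below are made up purely for illustration and are not data from the present study; the group labels mirror the Younger, Middle, and Older groups, and the function names are hypothetical.

```python
from statistics import mean

# Made-up scores for illustration only; these are NOT data from the study.
# Each list holds the same three hypothetical participants at Time 1 and Time 2.
scores = {
    "Younger": {"time1": [30, 32, 31], "time2": [30, 33, 31]},
    "Middle": {"time1": [31, 30, 32], "time2": [34, 33, 35]},
    "Older": {"time1": [32, 31, 30], "time2": [35, 33, 34]},
}


def cross_sectional_means(data):
    """Inter-individual view: compare group means at a single time point."""
    return {group: mean(values["time1"]) for group, values in data.items()}


def longitudinal_change(data):
    """Intra-individual view: average each participant's own change score."""
    return {
        group: mean([after - before
                     for before, after in zip(values["time1"], values["time2"])])
        for group, values in data.items()
    }


print("Time 1 group means:", cross_sectional_means(scores))
print("Mean within-person change:", longitudinal_change(scores))
```

In this toy example the group means at Time 1 are identical, so a cross-sectional comparison would suggest no age effect, whereas the within-person change scores differ between groups. The sketch only demonstrates the logic of the two comparisons and says nothing about statistical significance.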

The development of social and executive functions may reflect brain maturation and environmental change (Hughes and Ensor, 2009) such as changes to living arrangements and friendship groups (Taylor et al., 2013). Tuvblad et al. (2013) reported that non-shared environmental factors contributed to 54% of variance in Iowa Gambling Test scores at age 16 to 18, indicating that environmental factors influence individual differences in decision making during late adolescence.

There was a gender imbalance with more females taking part in the study than males. Males scored higher than females on two indices of concept formation, free sort description score and description score for perceptual sorts. Females scored higher than males on two indices of self-report empathy, empathic concern, and personal distress, possibly due to social desirability (Laurent and Hodges, 2009). Importantly, results of gender analyses showed relatively few group differences at these age ranges suggesting that development (time) plays a much more important role in the emergence/stability of cognitive functions at these age ranges.

The present findings have educational and clinical implications. Blakemore (2010) proposed that adolescence is a sensitive period for teaching due to protracted neural re-organization and that education should focus on cognitive functions that are still developing. The present results suggest that late adolescence/early adulthood continues to be a sensitive period because some functions show longitudinal development. Sensitive periods can inform educational policy by suggesting at what ages particular skills should be included in the curriculum to optimize learning (Thomas and Knowland, 2009). In a longitudinal study, Miller and Hinshaw (2010) found that executive functions contribute to academic achievement. As longitudinal developmental change was evident on concept formation (Sorting Test), rule detection (Brixton Test), inhibition (Hayling Test), planning (Tower Test), and strategy generation (Letter Fluency Test), perhaps these executive functions could be incorporated more into sixth form and university curricula. Understanding developmental trajectories of functions is important because they have implications for early identification of cognitive dysfunction and treatment outcomes (Kar et al., 2011). Furthermore, the normative longitudinal social and executive function data is relevant in assessing the effectiveness of rehabilitation following Head Injury (Reynolds and Horton, 2008) or in the diagnosis of individuals with Autistic Spectrum Disorders (Brent et al., 2004) and assessing the effectiveness of interventions.

Whilst imaging data is used to explain behavioral changes, future research could combine behavioral and imaging data to map linear and non-linear development of functions onto neural networks. Future research could examine other indices of social cognition task performance such as reaction times. Faster reaction times on tasks with age may reflect increased myelination (Sowell et al., 2001). Appropriate tasks would have dynamic stimuli that show emotional expressions for a short time (Vetter et al., 2013) such as the Movie for the Assessment of Social Cognition or the Cambridge Mindreading Face Voice Battery (Golan et al., 2006). As environmental changes are common in late adolescence and early adulthood, another avenue for future research is to compare social cognition and executive function task scores in participants with constant living arrangements and friendship groups with participants who experience environmental changes.

To conclude, the present longitudinal findings provide further evidence of divergent development of social and executive functions in late adolescence and early adulthood with some functions improving, whilst others decline or stabilize. The protracted development of functions may be attributed to brain maturation including synaptic pruning (Sowell et al., 2001) and functional connectivity (Stevens et al., 2007) and environmental changes (Tuvblad et al., 2013) specific to this age group such as changing friendship groups and living arrangements.



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Taylor, Barker, Heavey and McHale. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Cognitively controlled timing and executive functions develop in parallel? A glimpse on childhood research

# *Carmelo M. Vicario\**

*School of Psychology, University of Queensland, Brisbane, QLD, Australia*

*\*Correspondence: carmelo.vicario@uniroma1.it*

*Edited by: Lynne A. Barker, Sheffield Hallam University, UK*

**Keywords: executive functions, time processing, childhood development, attention, working memory, impulsivity control**

Accurate temporal estimations are essential in order to face the variety of everyday situations (Vicario et al., 2013a). Executive functions (EF) seem strongly involved in timing ability, allowing us to codify temporal intervals, reproduce durations and/or recall them after a previous encoding phase. In particular, time processing abilities seem related to three different domains of EF: working memory (WM) (Fortin and Breton, 1995; Fortin and Rousseau, 1998; Mangels et al., 1998; Lewis and Miall, 2006), attention (Rose and Summers, 1995; Casini and Ivry, 1999; Enns et al., 1999; Tse et al., 2004; Brown, 2006; Vicario et al., 2007, 2009, 2011a,b; Vicario, 2011), and impulsivity control (Reynolds and Schiffbauer, 2004; Wittmann and Paulus, 2008; Rubia et al., 2009).

The evidence in support of these relationships comes not only from the empirical demonstration that interference with the processing of one of these three EF affects timing performance (e.g., patients with attention or WM deficits are less accurate in time keeping functions; see the works of Casini and Ivry (1999) and Mangels et al. (1998) on patients with prefrontal lesions) but also from theoretical models which explain how the brain keeps track of time. For example, the pacemaker–accumulator model (Buhusi and Meck, 2009) assumes that the human brain has its own internal clock with a pacemaker producing subjective time units (Zakay and Block, 1997). Wittmann and Paulus (2008) argue for a possible influence of impulsivity on subjective time keeping functions. In fact, it has been suggested that impulsivity might influence the pacemaker rate of this internal clock and therefore the number of pulses accumulated per temporal unit (see Wittmann and Paulus, 2008 for a review of the argument).
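The logic of a pacemaker and an accumulator can be made concrete with a toy simulation. The sketch below is illustrative only: it assumes a Poisson pacemaker, and the function names, pulse rates, and the reference rate used to read out the count are hypothetical choices, not parameters taken from the cited models.

```python
import random


def accumulate_pulses(duration_s, rate_hz, rng):
    """Count pulses emitted by a Poisson pacemaker during duration_s seconds."""
    elapsed, pulses = 0.0, 0
    while True:
        # Inter-pulse intervals of a Poisson process are exponentially distributed.
        elapsed += rng.expovariate(rate_hz)
        if elapsed > duration_s:
            return pulses
        pulses += 1


def mean_subjective_duration(duration_s, rate_hz, reference_rate_hz,
                             trials=2000, seed=0):
    """Average judged duration when pulse counts are read out at the reference rate."""
    rng = random.Random(seed)
    total = sum(accumulate_pulses(duration_s, rate_hz, rng) for _ in range(trials))
    return total / trials / reference_rate_hz


# Hypothetical rates: a baseline clock and a clock whose pacemaker runs faster.
baseline = mean_subjective_duration(5.0, rate_hz=10.0, reference_rate_hz=10.0)
faster = mean_subjective_duration(5.0, rate_hz=12.0, reference_rate_hz=10.0)
print(f"baseline clock: {baseline:.2f} s, faster clock: {faster:.2f} s")
```

Under this simplification, a change in pacemaker rate alters the number of pulses accumulated over the same physical interval, and therefore the judged duration. Which direction a given factor such as impulsivity shifts the rate is an empirical question that the sketch does not address.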

In this article I expand upon this idea by providing evidence in support of the suggestion that the ability to perform cognitively controlled timing tasks develops in parallel with these three domains of EF. This hypothesis stems from two arguments: (i) the evidence of a close relationship, in childhood populations, between temporal accuracy and performance in tasks involving WM, attention and impulsivity control; (ii) the evidence of age-related functional differences in the activity of the prefrontal cortex during the execution of timing as well as WM, attention and impulsivity control tasks.

The implications of this hypothesis are intriguing because they may help to clarify, through the study of cognitive development models, the relationship between the development of EF and the increasing sophistication of time keeping skills. Moreover, the study of time keeping functions in childhood populations could provide a means to qualitatively evaluate and/or monitor EF development during the critical phases of brain growth. Finally, one advantage of charting the developmental trajectory of time processing and EF at certain critical moments of development is that this can help to differentiate between experience-dependent and inborn aspects of time and EF.

# **TIME KEEPING AND EXECUTIVE FUNCTIONS IN ADULTHOOD**

The literature on time keeping has suggested a general distinction between "*cognitively controlled*" and "*automatic*" timing processes (Lewis and Miall, 2006). Factors such as the temporal scale (sub-second vs. supra-second), the task typology (motor vs. non-motor) and the type of measurement (continuous vs. intermittent) have been considered the key factors underlying this distinction. It was therefore asserted (Lewis and Miall, 2006) that a typical automatic timing task involves continuous measurement of a series of predictable sub-second intervals defined by movements; on the other hand, a cognitively controlled timing task requires the explicit orientation of attentional resources toward the duration of stimuli lasting more than one second and characterized by some level of discontinuity (e.g., when timing is broken into discrete measurements by the presence of unpredictable irregular intervals). In reality, this distinction may be more flexible, since cognitively controlled timing tasks may also involve non-motor timing tasks of sub-second durations (e.g., time comparison tasks which require the involvement of decision-making processes; see Vicario, 2013a,b for a complete discussion of this argument) as well as supra-second motor timing tasks (e.g., the classical time reproduction).

In the literature on cognitively controlled timing, several works have provided direct support for the relationship between some EF and time keeping performance. In particular, it has been shown that this function can be influenced by WM, attention and impulsivity/inhibition skills.

Behavioral studies conducted on adult participants have reported that WM and time measurement draw upon the same cognitive resources. For example, it has been shown that secondary tasks involving phonological WM disrupt timing skills (Fortin and Breton, 1995).

Attention manipulation also influences performance in cognitively controlled temporal tasks (Vicario et al., 2007, 2009). For example, it was shown that optokinetic stimulation, which is known to influence spatial attention (Mattingley et al., 1994), affects the participants' performance in temporal decision tasks such as the temporal discrimination of visual stimuli (Vicario et al., 2007).

Finally, evidence of a relationship between time keeping abilities and impulsivity control has been provided by the work of Reynolds and Schiffbauer (2004), which showed that impulsivity due to sleep deprivation causes temporal underestimation in the multiple-seconds range. The recent study by Vicario et al. (2010) on children with Tourette syndrome provides further insight into the link between impulsivity control and time keeping mechanisms. In fact, the authors reported an inverse correlation between temporal accuracy and tic severity scores of these patients. These results might be explained by a compensatory process of neuroplasticity, which is probably related to the gain of (inhibitory) control over tics through the development of compensatory self-regulation mechanisms.

All these studies support the central role of WM, attention and impulsivity/inhibition skills in cognitively controlled timing tasks. However, we cannot exclude that future investigations may extend the influence of EF on timing skills to other higher-level constructs classified under the umbrella term of EF.

# **TIME KEEPING AND EXECUTIVE FUNCTIONS IN CHILDHOOD**

Although there is evidence supporting a very early ability of infants to detect the temporal features of environmental stimuli (Brackbill and Fitzgerald, 1972), numerous studies have suggested that timing skills improve throughout childhood (see Allman et al., 2012 and Droit-Volet, 2013 for recent reviews). For example, the Droit-Volet research team has on several occasions documented that temporal sensitivity improves with age. By using a time bisection task, which makes it possible to calculate a precise index of time sensitivity, namely the Weber ratio, the authors found an improvement with age for both sub-second and supra-second temporal intervals (Droit-Volet and Clement, 2005; Droit-Volet et al., 2008; Zelanti and Droit-Volet, 2011).
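As a worked illustration of this index, the sketch below computes a Weber ratio from bisection data under a commonly used convention, assumed here: the bisection point (BP) is the duration judged "long" on 50% of trials, the difference limen (DL) is half the distance between the durations judged "long" on 75% and 25% of trials, and the Weber ratio is DL/BP. The response proportions are invented for illustration, the function names are hypothetical, and linear interpolation stands in for whatever curve-fitting procedure the cited studies used.

```python
def interpolate_duration(durations, p_long, target):
    """Linearly interpolate the duration at which p('long') equals target."""
    points = list(zip(durations, p_long))
    for (d0, p0), (d1, p1) in zip(points, points[1:]):
        if p0 <= target <= p1 and p1 > p0:
            return d0 + (target - p0) * (d1 - d0) / (p1 - p0)
    raise ValueError("target proportion is not bracketed by the data")


def weber_ratio(durations, p_long):
    """Weber ratio = difference limen / bisection point (smaller = more sensitive)."""
    bisection_point = interpolate_duration(durations, p_long, 0.50)
    d25 = interpolate_duration(durations, p_long, 0.25)
    d75 = interpolate_duration(durations, p_long, 0.75)
    difference_limen = (d75 - d25) / 2
    return difference_limen / bisection_point


# Invented example: comparison durations (s) and proportion of "long" responses.
durations_s = [0.4, 0.6, 0.8, 1.0, 1.2, 1.4, 1.6]
p_long = [0.0, 0.1, 0.25, 0.5, 0.75, 0.9, 1.0]
print(f"Weber ratio: {weber_ratio(durations_s, p_long):.2f}")
```

A steeper psychometric function brings the 25% and 75% points closer together and so yields a smaller Weber ratio, which is how an age-related improvement in temporal sensitivity would appear on this index.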

Moreover, Chatham et al. (2009) recently found that 3.5-year-old children fail to use proactive control, which can be interpreted as evidence of a failure to proactively prepare for the predictable future (Shallice and Vallesi, 2007). In fact, proactive control can be considered in relation to time keeping skills, since it mediates the capacity to anticipate and prepare for future events (Chatham et al., 2009). Finally, the recent longitudinal study of Forman et al. (2011) showed that the higher the gain in WM development, the better the timing calibration.

Similar age related progressions have been reported for WM, attention, and impulsivity control skills. For example, Hitch and Halliday (1983) and Hulme and Tordoff (1989) have reported that 3–4 year-old children are already capable of retaining information in their phonological store. This provides evidence in support of an early development of WM skills. However, it has been noticed that children cannot perform sub-vocal rehearsal until 7 or 8 years of age; therefore until this time, the information stored in the phonological loop rapidly decays (Gathercole, 2008). This evidence is supported by Gathercole and Alloway (2008), who have found that in Anglo-Saxon participants the digit span increases with age until 15 years. However, a subsequent study on a Spanish population has found that this age limit extends to 17 years (Sebastián and Hernández-Gil, 2012).

Many studies have also demonstrated age-related improvements in selective attention (Trick and Enns, 1998; Scerif et al., 2004), sustained attention (Aylward et al., 2002) and attentional control (Jacques and Zelazo, 2001). For example, Aylward et al. (2002) used the Gordon Diagnostic System (Gordon, 1983) for testing auditory and visual vigilance and distractibility in a sample of 643 children (mean age 9.76). The authors found an inverse relationship between error score and the age of participants. A similar age-related progression has been documented for impulsivity control, which has been reported to be quite low in children (Bjorklund and Harnishfeger, 1995). For example, Hughes and Russell (1993) used a 'day–night' task (Gerstadt et al., 1994) which required children to inhibit a well-established naming response to picture cards. Once again, the authors showed a progressive improvement in this task in children between the ages of 3 and 7 years.

Neuroimaging studies provide further support for the hypothesis that time keeping abilities and executive functions develop in tandem.

In adulthood, there is compelling evidence showing an important role of prefrontal regions in timing abilities (Koch et al., 2003; Jones et al., 2004; see Wiener et al., 2010 for a review). For instance, Koch et al. (2003) have shown that repetitive Transcranial Magnetic Stimulation of the right dorsolateral prefrontal cortex (DLPFC) causes temporal underestimation of supra-second durations.

On the other hand, this neural structure is involved in WM (Wager and Smith, 2003), attention (Peers et al., 2013) and impulsivity control (Jasinska, 2013) functions. Moreover, there is evidence documenting a co-existence of timing, WM and attentive deficits in patients with prefrontal lesions (for example see Mangels et al., 1998 and Casini and Ivry, 1999).

Prefrontal activity has been documented even in children while performing a timing task. For instance, the recent study of Smith et al. (2011) has shown age-related increases in the activation of several regions of the prefrontal lobe, including the DLPFC, during temporal discrimination of supra-second durations (i.e., cognitively controlled timing). In a similar fashion, studies on childhood populations show that the activity of prefrontal regions is influenced by WM, attention and impulsivity control tasks (Smith et al., 2004; Scherf et al., 2006; Rubia et al., 2006). However, all these works reported a pattern of underactivation during the execution of the above-mentioned tasks. Interestingly, in line with what has been reported in behavioral works, these studies show evidence that the activation of the prefrontal cortex increases with the age of participants. The development of white matter in the prefrontal regions through adolescence (Schmithorst and Yuan, 2010) could be the cause of these changes in the neural activation of this area and in the performance of the tasks described above.

# **CONCLUSION**

In this short overview I discussed the literature in support of the suggestion that the ability of children to perform cognitively controlled timing tasks develops in parallel with WM, attention and inhibitory control functions. Independent behavioral studies support this assumption by showing age-related performance improvements for all these cognitive functions. These functions have also been related to age-related changes in prefrontal cortex activity, which are presumably due to the white matter increment continuing through adolescence and into adulthood (Schmithorst and Yuan, 2010).

Although it is possible that other dimensions of the EF domain may have an influence on the development and support of time keeping abilities, we can only discuss the relationship between time keeping and EF within the limits of the available literature. This implies that cognitively controlled timing skills might develop from the same basic (i.e., neural and cognitive) mechanisms involved in the formation of the three dimensions of EF discussed in this article. However, cognitively controlled timing skills cannot be reduced to these three EF, considering that the representation of time is also built with the active involvement of other processes (e.g., those implied in the representation of space and quantity, see Walsh, 2003 and Vicario et al., 2013b for reviews) and brain regions (e.g., parietal cortex, see Wiener et al., 2010 for a review) that cannot be directly linked to EF.

Future works devoted to exploring the developmental hypothesis discussed in this paper may wish to combine behavioral measures and brain methods in a longitudinal perspective, which may be recognized as important in addressing the link between cognitive and neural development. This approach would help to clarify whether and how these three domains of EF and cognitively controlled timing skills develop in parallel.

# **ACKNOWLEDGMENTS**

I would like to thank Anica Newman for her help in the checking of English spelling.


*Received: 18 September 2013; accepted: 24 September 2013; published online: 10 October 2013.*

*Citation: Vicario CM (2013) Cognitively controlled timing and executive functions develop in parallel? A glimpse on childhood research. Front. Behav. Neurosci. 7:146. doi: 10.3389/fnbeh.2013.00146*

*This article was submitted to the journal Frontiers in Behavioral Neuroscience.*

*Copyright © 2013 Vicario. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# The development and malleability of executive control abilities

# *Nina S. Hsu<sup>1,2,3</sup>\*, Jared M. Novick<sup>1,3,4</sup> and Susanne M. Jaeggi<sup>5</sup>*

<sup>1</sup> Center for Advanced Study of Language, University of Maryland, College Park, MD, USA

<sup>2</sup> Department of Psychology, University of Maryland, College Park, MD, USA

<sup>3</sup> Program in Neuroscience and Cognitive Science, University of Maryland, College Park, MD, USA

<sup>4</sup> Department of Hearing and Speech Sciences, University of Maryland, College Park, MD, USA

<sup>5</sup> School of Education, University of California, Irvine, Irvine, CA, USA

#### *Edited by:*

Lynne Ann Barker, Sheffield Hallam University, UK

#### *Reviewed by:*

Matthias Brand, University Duisburg-Essen, Germany

Franziska Plessow, Harvard Medical School, USA

#### *\*Correspondence:*

Nina S. Hsu, Center for Advanced Study of Language, University of Maryland, 7005 52nd Avenue, College Park, MD 20740, USA

e-mail: ninahsu@umd.edu

Executive control (EC) generally refers to the regulation of mental activity. It plays a crucial role in complex cognition, and EC skills predict high-level abilities including language processing, memory, and problem solving, as well as practically relevant outcomes such as scholastic achievement. EC develops relatively late in ontogeny, and many sub-groups of developmental populations demonstrate an exaggeratedly poor ability to control cognition even alongside the normal protracted growth of EC skills. Given the value of EC to human performance, researchers have sought means to improve it through targeted training; indeed, accumulating evidence suggests that regulatory processes are malleable through experience and practice. Nonetheless, there is a need to understand both whether specific populations might particularly benefit from training, and what cortical mechanisms engage during performance of the tasks used in the training protocols. This contribution has two parts: in Part I, we review EC development and intervention work in select populations. Although promising, the mixed results in this early field make it difficult to draw strong conclusions. To guide future studies, in Part II, we discuss training studies that have included a neuroimaging component – a relatively new enterprise that also has not yet yielded a consistent pattern of results post-training, preventing broad conclusions. We therefore suggest that recent developments in neuroimaging (e.g., multivariate and connectivity approaches) may be useful to advance our understanding of the neural mechanisms underlying the malleability of EC and brain plasticity. In conjunction with behavioral data, these methods may further inform our understanding of the brain–behavior relationship and the extent to which EC is dynamic and malleable, guiding the development of future, targeted interventions to promote executive functioning in both healthy and atypical populations.

**Keywords: executive function, neuroimaging, fMRI, working memory, training, interventions, connectivity analysis**

# **INTRODUCTION: THE IMPORTANCE OF EXECUTIVE CONTROL**

Most of the time, people's rich experiences enable them to navigate the world using a set of habitual (or well-learned) behaviors. Situations sometimes arise, however, that necessitate on-the-fly changes to these routines. For example, an unexpected road closure can require a shift in the usual route one takes to work. In this and similar circumstances, people must deploy executive control (EC) to countermand dominant thoughts and behaviors in favor of irregular actions. In general, EC refers to the guided regulation of thought and action to match internal or task-relevant goals, particularly in novel situations. Importantly though, EC is not a unitary process but refers to a constellation of separable components that collectively work to guide goal-directed behavior (Norman and Shallice, 1986; Botvinick et al., 2001; Miller and Cohen, 2001). Some components – appearing in different theoretical frameworks under a variety of guises – include a control system to manipulate information within short-term memory (Baddeley and Hitch, 1974), and overlapping but separable processes such as self-regulation and self-awareness, task-switching, updating, and response inhibition (Miyake, 2000; Barkley, 2001; Friedman and Miyake, 2004). It is thought that these components can operate over a wide variety of domains including working memory (WM) and language processing (Smith and Jonides, 1999; Novick et al., 2005; Thompson-Schill et al., 2005; Badre and Wagner, 2007). In this review, we refer to EC in a broad sense to include attentional control, cognitive control, and self-regulatory behavior (Jonides et al., 1998; Miller and Cohen, 2001; Thompson-Schill et al., 2005).

Although researchers generally agree that the neurobiological systems underlying EC involve the prefrontal cortex (PFC), the precise manner in which regions within the PFC support cognitive components of EC is still debated (**Figure 1**). Lateral regions of PFC may become engaged under multiple EC demands in a variety of tasks (Thompson-Schill et al., 1997; Duncan and Owen, 2000; Jonides et al., 2008). The anterior cingulate cortex (ACC), a medial frontal region, is thought to be involved in conflict monitoring, detecting situations that may be incompatible with current task goals or demands (e.g., Botvinick et al., 2001), which then signal adjustments in behavior to lateral PFC regions (Kerns et al., 2004). Altogether, these cortical regions receive multiple inputs from and project outputs to virtually all perceptual and motor cortical areas, affective/emotional networks, and subcortical structures. These connections provide an ideal infrastructure for integrating multiple sources of information and guiding subsequent thoughts and actions in a top-down fashion (e.g., Amso and Casey, 2006). We should note here that neural regions that support EC processes are not necessarily restricted to PFC. The capacity to focus attention may be constrained by parietal as well as frontal mechanisms, with increased executive demands modulating activity in posterior parietal (PPC) and dorsolateral frontal cortices (e.g., Wager and Smith, 2003). Additionally, medial temporal lobe (MTL) structures may be recruited in task-specific situations that require the establishment of novel relations (Ranganath and Blumenfeld, 2005).

A traditional assumption has been that while EC and other elements of higher cognition develop rapidly through childhood and adolescence, they remain relatively fixed throughout adulthood. However, recent evidence has challenged this assumption. First, EC processes may follow a non-linear developmental trajectory, in which the maturation of these abilities is subject to changes in brain morphology (Taylor et al., 2013). Second, these processes – driven in part by both heritable and environmental factors – may in fact be subject to experience-dependent plasticity throughout the lifespan (Gray and Thompson, 2004; Bialystok et al., 2006; Neville et al., 2013; Rebok et al., 2014). In particular, interest has grown in lab-based interventions targeting EC components with the aim of also improving performance on other tasks that rely on similar processes (Morrison and Chein, 2010; Jaeggi et al., 2011; Hussey and Novick, 2012).

The importance of intervention work targeting EC is underscored by the accumulating evidence that EC operates across other cognitive domains, including WM (e.g., Smith and Jonides, 1999). In particular, WM abilities predict a wide range of practical outcomes that are important in everyday life, including reading comprehension and mathematical skills (de Jonge and de Jong, 1996; Passolunghi and Siegel, 2001; Gathercole et al., 2006), planning and problem solving (Shah and Miyake, 1999), language processing (Novick et al., 2005), self-regulatory behavior (Hofmann et al., 2012), and scholastic achievement (Duncan et al., 2007; Alloway and Alloway, 2010). Further, deficits in EC and WM abilities are prevalent in a host of clinical syndromes and psychopathologies including attention deficit hyperactivity disorder (ADHD; Shah et al., 2012), depression (Christopher and MacDonald, 2005), and addiction (Khurana et al., 2013); EC is also among the core domains susceptible to age-related decline (Braver and West, 2007). In sum, EC is relevant to common assessments of achievement, is sensitive to developmental changes, and many populations suffer from deficits in EC. Given the importance of these abilities to daily life, there has been growing research interest in training paradigms targeting EC abilities with the aim of boosting the improvement (and forestalling the decline) of other complex cognitive skills that rely on EC. Despite the growing literature, there is still a need to understand for whom and at what stages of development EC interventions work best, and also to understand better the cortical mechanisms that underlie training and transfer effects.

The purpose of this paper is to evaluate the current state of the intervention field focusing on selected developmental populations that demonstrate the potential for greater brain plasticity. Because these populations typically demonstrate exaggeratedly poor abilities to control cognition, they may be candidates who are most amenable to receiving maximal transfer benefits. We will review the current work on the neural correlates of training, and suggest methodological directions that may inform a better understanding of the neural mechanisms underlying EC performance. We will outline the current literature on interventions targeting EC (though we note that occasionally different terminology is used to refer to these interventions, including attentional control and WM), placing a particular emphasis on at-risk populations (e.g., ADHD and low-socioeconomic status; low-SES) that demonstrate performance differences in these domains relative to healthy, adult groups. Such variation suggests that these special groups might be candidates for interventions that seek to improve EC abilities through experience-based plasticity (i.e., an interaction of biological and environmental factors that results in structural and functional changes in brain morphology, as well as concomitant cognitive changes). Much of the current knowledge concerning the malleability of EC comes from the growing body of literature demonstrating that EC can be trained in healthy populations through extensive practice, and that improved performance over the course of training can generalize to novel tasks that were not part of the training regimen. Across a range of different EC tasks, examples of observed transfer benefits include tasks tapping fluid reasoning (Jaeggi et al., 2008), WM updating (Dahlin et al., 2008), task switching (Karbach and Kray, 2009), visual search (Kundu et al., 2013), and language processing (Novick et al., 2014). Although there is little debate as to whether performance on EC tasks can be improved, there is considerable debate around the transferability to novel, untrained tasks (e.g., Redick et al., 2012; Sprenger et al., 2013; Thompson et al., 2013). Transferability is critical to adjudicating between whether EC skills *per se* are affected during training versus whether people simply develop task-specific strategies. One potential explanation for these mixed findings may lie in individual differences in the training-transfer relationship, as well as in the strength of the linking assumptions that tie training and transfer tasks together in terms of shared mechanisms (Jaeggi et al., 2010, 2014). As such, a key assumption of the training literature is that these kinds of transfer benefits can occur when there is cognitive and/or neural overlap between the processes tapped in both training and transfer, irrespective of domain (Dahlin et al., 2008).
This overlap is termed *process-specificity*: if a certain component of EC – e.g., updating– is targeted and improved over the course of training, then transfer tasks that also rely on updating processes should also be affected, even if the task itself appears superficially different (say, in terms of stimuli characteristics).

To inform our discussion, we reviewed the literature on EC interventions in children and adults and then sorted the papers by population (i.e., healthy versus at-risk children), and by mode of outcome measure (behavioral and/or neural). For the selected developmental populations, when necessary, we consulted reviews on each separate topic, integrating relevant information from those reviews into our own.

This paper is organized into two parts. In Part I, we discuss the development and neurobiology of EC, followed by an examination of experiences that can affect the development of EC in both negative (e.g., stress, low-SES) and positive (e.g., schooling, music education, martial arts) ways. We then review the training literature involving "at-risk" groups, focusing particularly on low-SES and ADHD, and the effectiveness of certain kinds of EC interventions, guided by an understanding of the factors that positively influence EC. This field is new, and consequently the results are still inconclusive. Note that for the purpose of this review, we will focus on select developmental groups, and we will not review ongoing EC intervention work that targets older adults or adults with psychopathologies, but rather, we refer the readers to other recent reviews on those topics (Kueider et al., 2012; Vinogradov et al., 2012; Wiers et al., 2013). In Part II, we briefly review the neuroimaging intervention literature in healthy populations to guide future training studies involving populations that are likely to demonstrate poor EC. The early state of this enterprise, however, suggests major inconsistencies: no clear picture emerges in terms of brain activity changes and reorganization post-training. This discrepancy renders it difficult to draw generalizable conclusions, but the field is emerging rapidly, requiring evaluation of the current state of affairs. Moreover, neuroimaging studies of clinical or at-risk groups in the context of intervention research are likely to be even more complicated and problematic, especially when considering atypical behavioral and neural profiles. We therefore suggest some candidate neuroimaging analyses (i.e., connectivity and multivariate approaches) that emphasize the ability to examine spatial and temporal patterns in terms of network dynamics, which can reveal a delicate interplay across brain regions and systems. 
Such methods have the potential to be more informative relative to traditional univariate approaches that test for pre/post activation differences within cortical patches in isolation.
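The contrast between univariate and connectivity-style analyses can be made concrete with a minimal sketch on synthetic data. Everything here is hypothetical and chosen only for illustration: the two "regions", the coupling values (0.2 before training, 1.0 after), and the number of timepoints are assumptions, not values drawn from any study reviewed in this paper. The sketch shows a case in which mean signal within each region barely changes from pre to post, while the correlation between the two regions (a simple proxy for functional connectivity) increases markedly.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def simulate_session(coupling, n_timepoints, rng):
    """Simulate two hypothetical regional time series.

    Both regions mix a shared latent signal (scaled by `coupling`)
    with independent noise, so `coupling` controls how strongly the
    regions co-vary without shifting their mean signal.
    """
    region_a, region_b = [], []
    for _ in range(n_timepoints):
        shared = rng.gauss(0, 1)
        region_a.append(coupling * shared + rng.gauss(0, 1))
        region_b.append(coupling * shared + rng.gauss(0, 1))
    return region_a, region_b


def summarize(region_a, region_b):
    """Return per-region means (univariate view) and their correlation."""
    return {
        "mean_a": sum(region_a) / len(region_a),
        "mean_b": sum(region_b) / len(region_b),
        "coupling_r": pearson(region_a, region_b),
    }


rng = random.Random(0)  # fixed seed so the illustration is reproducible
pre = summarize(*simulate_session(coupling=0.2, n_timepoints=2000, rng=rng))
post = summarize(*simulate_session(coupling=1.0, n_timepoints=2000, rng=rng))

# Univariate view: change in mean signal within each region.
print("mean change, region A:", round(post["mean_a"] - pre["mean_a"], 3))
print("mean change, region B:", round(post["mean_b"] - pre["mean_b"], 3))
# Connectivity view: change in inter-regional correlation.
print("correlation pre vs. post:",
      round(pre["coupling_r"], 3), round(post["coupling_r"], 3))
```

In this toy case, a pre/post test on regional means alone would suggest that little changed, whereas the inter-regional correlation does change; real connectivity and multivariate analyses are considerably more involved than this two-region correlation.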

# **PART I**

#### **THE DEVELOPMENT OF EC AND ITS NEURAL SUBSTRATES**

Executive control has long been associated with the PFC (Shimamura, 2000; Miller and Cohen, 2001), which is among the last cortical regions to fully develop: EC abilities undergo protracted maturation over the course of childhood and adolescence (Thompson-Schill et al., 2009). Moreover, the literature on neuroanatomical development across the whole brain points to dynamic changes that occur postnatally and throughout childhood, with initially undifferentiated regions becoming increasingly functionally specialized (Oliver et al., 2000). This development can occur at different rates: for instance, frontal brain regions undergo change up to age 25, with some frontotemporal tracts not reaching maturity until age 28. Relatively undifferentiated cortical regions co-occur with earlier (rather than later) development, providing a period in which these undifferentiated neural networks may cover larger areas of cortex. As a result, earlier targeted training might lead to more widespread transfer effects by taking advantage of this undifferentiated stage of cortical development (Wass et al., 2012). Although additional factors – including genetic and environmental predispositions as well as dynamic morphological changes – lead to a complicated interplay of developmental components to be taken into consideration (Scerif, 2010), a younger population may generally have greater brain plasticity and, thus, greater learning potential (e.g., Karmiloff-Smith, 1998; Sonuga-Barke and Halperin, 2010). Primarily for this reason, we focus on EC interventions in developmental populations.

Plasticity accompanies cortical maturation beginning at birth. Humans are born with immature brains, and it is well established that the brain continues to mature throughout childhood and adolescence, with PFC developing last (Sowell et al., 2003; Gogtay et al., 2004). Throughout postnatal development, the neocortex matures through an initial rapid growth process of cell proliferation and changes in synaptic density. During this period, the increase in synaptic connections accompanies dendritic and axonal growth (i.e., fibers for communication that extend from neurons) and myelination (i.e., insulation, thus boosting signal transmission) of the subcortical white matter (Huttenlocher and Dabholkar, 1997). Synaptogenesis is then followed by pruning, a synapse-elimination process that lasts well into the third decade of life (Huttenlocher and Dabholkar, 1997; Petanjek et al., 2011). Critically, these processes dynamically occur at differing rates throughout the brain (Huttenlocher and Dabholkar, 1997). Brain regions subserving sensory functions, such as vision and hearing, develop first, followed by development of temporal and parietal cortices – regions responsible for sensory associations. Higher order cognition areas, such as prefrontal and lateral temporal cortices, which serve to integrate information from primary sensorimotor cortices and modulate other cognitive processes, mature last (Casey et al., 2005b; Petanjek et al., 2011).

Non-invasive neuroimaging technologies have enabled researchers to learn a great deal about the anatomical and functional networks of the developing brain. Early work with positron emission tomography (PET) imaging demonstrated that human PFC metabolizes glucose at a slower rate than occipital, temporal, and parietal cortices (Chugani and Phelps, 1986). There also appears to be evidence for a "fine-tuning" of cortical structures as activation shifts from diffuse to focal recruitment as children develop, with cortical gray matter loss (i.e., a sign of cortical maturation) occurring last (Brown et al., 2005). Taken together, structural and functional evidence suggests that prefrontal regions associated with integration and goal-directed behaviors mature after regions responsible for primary sensory functions (Casey et al., 2005a), with both progressive and regressive processes (rather than simple linear patterns of change) underlying changes in cognitive abilities (Brown et al., 2005; Amso and Casey, 2006).

Cognitive training studies in children seek to take advantage of this relatively undifferentiated brain state (cf. Wass et al., 2012) to maximize possible transfer before pruning accompanies neural specialization. These interventions can take many forms, including mindfulness training and core or supplemental curricula (see Diamond and Lee, 2011); here, we focus on cognitive interventions that specifically target EC processes. For example, guided practice can improve children's performance on a dimensional card sorting task, wherein children are given feedback when they perseverate after a dimension switch (Brace et al., 2006). Additional studies have used intervention paradigms in children to improve EC abilities through training (Holmes et al., 2009; Thorell et al., 2009; Jaeggi et al., 2011; Loosli et al., 2012). Still others have sought to improve reading abilities in at-risk youth (Yamada et al., 2011) and the symptoms associated with disorders such as developmental dyscalculia (Kucian et al., 2011) and dyslexia (Temple et al., 2003). Such behavioral interventions seek to demonstrate improvement on EC – that is, those skills that are critical for complex cognitive functioning and scholastic achievement (Diamond and Lee, 2011).

Importantly, however, there are arguments for why the costs associated with late PFC development (and consequently, immature EC) may be overshadowed by learning benefits that accompany this slow maturational progression. Specifically, an underdeveloped frontal cortex (i.e., hypofrontality) might provide ways for the developing child to increase uptake of bottom-up regularities, which are important for creativity and language learning tasks (Gleitman et al., 1984; Thompson-Schill et al., 2009; Chrysikou et al., 2011). The general idea is that attending to top-down rules – while good for EC task *performance* – may interfere with important *learning* and classification procedures. For example, Ramscar and Yarlett (2007) demonstrated that children are easily able to learn how to pluralize irregular nouns (e.g., mouse → mice) rather than adopting a pluralization dictated by the more frequent convention or rule (e.g., mouse → mouses). This result points to the idea that children maximize probabilistic input, which in this case optimizes learning. In contrast, adults tend to monitor for rules and alternative patterns using top-down strategies, which may impair certain aspects of language learning that benefit from bottom-up (data-driven) modes of thinking (Ramscar and Yarlett, 2007). This age-related difference in learning may be promoted by immature EC abilities driven by an underdeveloped PFC. In view of these trade-offs, we believe that future interventions aimed at improving performance in developmental groups must strongly consider whether, on balance, the potential costs to learning will outweigh the potential benefits to performance. We focus on select at-risk groups in this review (rather than typically developing children) because the potential performance benefits of EC interventions may outweigh the learning costs.

In sum, these examples demonstrate that an immature PFC bears negative consequences for cognitive performance in certain tasks, but that it can also provide benefits for learning and creativity. In particular, hypofrontality may confer learning benefits at the expense of performance costs, so interventions geared toward young children should consider this trade-off. Thus, the argument for accelerating maturation of EC networks that are mediated by regions of prefrontal cortices through interventions may need to be tempered with the evolutionary and developmental advantages that immature frontal lobes may confer.

Such critical periods of development suggest that children may be affected by both positive and negative factors during particular time windows that could shape cognition and EC development (Nelson, 2000; Knudsen, 2004). In the following sections, we review some of these factors. We then review the training literature of some "at-risk" groups: as part of this research, the training protocols consider the conditions that favorably affect EC performance, in hopes of offsetting the negative consequences of environmental and biological circumstances that compromise EC.

#### **FACTORS THAT NEGATIVELY AFFECT EC DEVELOPMENT**

#### *Socioeconomic status (SES)*

Environmental factors, including SES, can significantly affect cognitive and brain development (Noble et al., 2007). SES is a composite measure of economic and non-economic factors, including material wealth, social prestige, and education. Educational advocates have long discussed the negative implications of low-SES backgrounds on cognition and, ultimately, on academic achievement (Sirin, 2005; Duncan and Sojourner, 2013). Thus, negative environmental factors such as SES play a role in shaping candidate neural pathways by which (negative) early life experiences might compromise academic achievement or increase the risk of mental illness (Hackman and Farah, 2009; Hackman et al., 2010). Because neurobiological systems may mediate these SES-cognition gradients, we focus here on research demonstrating a link between SES and assessments of EC. For example, Noble et al. (2005a) found that in kindergarteners differing in SES backgrounds, low-SES children performed worse than middle-SES children on measures of language (mediated by the left perisylvian regions) and EC (mediated by the PFC). The groups did not differ on other cognitive measures, and the authors point to the delayed maturation of the brain regions mediating these abilities as being more susceptible to environmental factors such as SES. A subsequent study of individual differences using a population of first-graders demonstrated that SES explained 30% of the variance in language performance (Noble et al., 2007). SES also explained 6% of the variance in EC performance (despite the small values, they were statistically significant in both cases).

While the role of SES has been examined in sociological and epidemiological contexts, research is just now beginning to shed light on its impact on neurobiological mechanisms. For example, in 5-year-old children, SES predicts hemispheric asymmetry of the inferior frontal gyrus – which is well known to support critical EC functions – even after controlling for scores on a standardized set of language and cognition tests, with left lateralization associated with higher SES (Raizada et al., 2008). Consistent with this result, Sheridan et al. (2012) found that right medial frontal gyrus (rMFG) activity was inversely related to accuracy in acquiring a novel stimulus–response association (e.g., through a dimensional change card sorting task). Finally, cortical thickness measures vary differentially based on individual IQ levels. Not only might prolonged cortical thickening reflect extended synaptogenesis in individuals with high IQ, but this measure is also linked to increased levels of environmental input (Brant et al., 2013).

Additionally, Stevens et al. (2009) have demonstrated that SES differences can contribute to development of neural mechanisms of selective attention by measuring event-related potentials (ERPs), which provide an electrophysiological response to stimuli that is temporally precise. In particular, ERPs can provide an index of selective attention by demonstrating an amplified neural response in the N1 early negative component, which occurs in the first 100 ms after stimulus presentation (Hillyard et al., 1973). In one study, ERPs were measured while children were cued to selectively attend to one sound source and ignore the other. Compared to high-SES children, low-SES children showed reduced effects of selective attention on neural processing, as seen in an altered N1 response over bilateral frontal and central electrodes. Specifically, children in the lower SES group had a larger amplitude response to probes in the *unattended* channel relative to children in the higher SES group, suggesting an impaired ability to suppress a response to distracting information (Stevens et al., 2009). This parallels other ERP work suggesting that low-SES populations have difficulty inhibiting distracting information (D'Angiulli et al., 2008); they also show an attenuated response to novel stimuli relative to high-SES children (Kishiyama et al., 2009).

These findings demonstrate negative behavioral and neural consequences of low SES on EC development, suggesting that this population might be a prime candidate for EC remediation. We return to this issue in the section entitled, "What Groups Might Benefit from an EC Intervention?"

#### *Early life stress*

Early life stress (ELS) is the exposure to childhood events that challenge a child's emotional and physical well-being, exceeding their ability to cope with the events (Gunnar and Quevedo, 2007; Pechtel and Pizzagalli, 2010). Some of these stressors can include abuse, neglect, social deprivation, or household dysfunction (Brown et al., 2009). Further, although acute instances of stress can activate the body's stress response resources in a beneficial manner, high and especially chronic levels of stress can perturb typical brain development (Pechtel and Pizzagalli, 2010).

Neurobiological and neuroendocrine studies suggest that ELS might interfere with typical brain development by accelerating synaptic pruning and aberrantly increasing myelination (Teicher et al., 2006; Paus et al., 2008), although magnetic resonance (MR) neuroimaging techniques do not currently have the resolution to test this empirically (Gogtay and Thompson, 2010). However, studies largely support negative impacts of ELS on cognitive function, which are accompanied by decreased intracranial volume, reduced cross-hemisphere integration, and a smaller corpus callosum (Schiffer et al., 1995; Teicher et al., 2004; Noble et al., 2005b). In line with these findings, microstructural integrity of the corpus callosum may be reduced after ELS exposure (Paul et al., 2008).

In addition to affecting memory (Carrion et al., 2001; Karl et al., 2006) and affective function (Dillon et al., 2009), ELS also impacts EC (Colvert et al., 2008; Bos, 2009; Pollak et al., 2010). Mueller et al. (2010) conducted a functional magnetic resonance imaging (fMRI) study in which adolescents exposed to ELS performed a variant of the go/no-go task involving "go" and "change" trials. Relative to controls, ELS adolescents had longer response times on "change" trials that were accompanied by increased activity in the inferior frontal cortex and striatum. One possible explanation for this group difference is that these frontal regions might be more active to compensate for reduced inhibitory capacity, which has also been observed in women with ELS histories (Navalta et al., 2006).

ELS is correlated with low SES, and both negative factors can spur the development of psychopathologies, including anxiety and attention deficit hyperactivity disorder (Heim and Nemeroff, 2001; Lupien et al., 2009) – two clinical syndromes that we discuss in the section entitled, "What Groups Might Benefit from an EC Intervention?" Generally, this work suggests that ELS and low SES can result in altered prefrontal function, with negative consequences for various cognitive domains, including those subserving EC. Current neuroimaging approaches such as interregional connectivity network analyses might yield a better understanding of the effects of SES and ELS on neurobiology by painting a broader picture of whole brain dynamics. We return to this issue in more depth later in Part II.

#### **FACTORS THAT POSITIVELY AFFECT EC DEVELOPMENT**

Thus far, we have reviewed the ontogeny of EC abilities and the impact that negative factors can have on EC development. However, there is evidence from experiments showing that broad training – outside the lab – can positively affect EC development. These naturalistic interventions can take the form of aerobic exercise and games (Davis et al., 2011), music training (Rauscher et al., 1997; Budde et al., 2008), being raised in a bilingual environment (Calvo and Bialystok, 2014), and yoga (Manjunath and Telles, 2001), providing some evidence for EC improvements, particularly when EC demands are the greatest. Social pretend play, in which children must inhibit acting out of character and flexibly adjust as their friends improvise, improved performance on child versions of the dot-probe and flanker tasks (Diamond et al., 2007).

These various forms of training each have EC components that are central to the tasks – for example, children in martial arts training, including taekwondo, begin each session by directing attention toward themselves (Diamond and Lee, 2011). They monitor, evaluate, and adapt their thoughts and actions, and this practice can lead to increased concentration (Konzak and Boudreau, 1984) and cultivation of mental capacity (Seitz et al., 1990). Lakes and Hoyt (2004) observed children after a 3-month taekwondo intervention, finding that they were more able to focus attention and efforts on the task at hand, and they also improved on an intellectually challenging mathematics task. Because these naturalistic forms of training incorporate EC components, their benefits might generalize to improvement on other tasks that also involve shared EC abilities. Moreover, the experience of the classroom itself might boost EC abilities beyond natural development. For example, Burrage et al. (2008) demonstrated that early schooling has a significant impact on EC abilities in a group of pre-kindergarten and kindergarten children when compared to children of the same age who did not attend school, even when assessing these abilities before schooling began and when controlling for factors such as SES and race.

In view of these findings, various kinds of interventions might be especially useful for children with poorer EC abilities, in that they may enjoy more benefits from an intervention (Diamond, 2012). Given the association between EC abilities and numerous cognitive skills, academic outcomes, and clinical psychopathologies, targeted EC interventions during childhood may be particularly useful (keeping in mind the caveats outlined earlier related to top-down/bottom-up trade-offs). To this end, we now review specific at-risk groups that could profit from an EC intervention. Although numerous factors might result in an educational achievement gap for these populations, timely educational interventions may be able to minimize or even close this gap by positively impacting EC development.

#### **WHAT GROUPS MIGHT BENEFIT FROM AN EC INTERVENTION?**

#### *Low SES*

The susceptibility of prefrontal cortices to experiential and environmental factors such as SES and life stress described earlier raises the question of whether these cortical networks are subject to improvement through intervention, as such negative experiences can range widely in scope. In other words, the differences observed in brain and behavioral function in low- relative to high-SES children are experience-dependent, where experience is defined by real-life economic and social circumstances. Here, laboratory-developed interventions might be able to create new experiences for low-SES children to mitigate the gaps outlined above.

In a broad family-based intervention, Neville et al. (2013) developed "Parents and Children Making Connections - Highlighting Attention," or PCMC-A – a program that combines training sessions for parents with concurrent attention training exercises for children. These exercises are designed to improve regulation of attention and emotion states. Over the course of 8 weeks, low-SES preschoolers who were enrolled in Head Start completed either the PCMC-A program, an active control training program that focused on child classroom training, or remained in Head Start alone (i.e., no supplemental training). After the intervention, PCMC-A children demonstrated improvements on measures of non-verbal IQ, receptive language, and pre-literacy skills; furthermore, their parents reported reduced stress levels. Before training, the groups did not show differences in ERP signatures of early attentional modulation to either attended or unattended stimuli, suggesting an inability to shift attention toward either sound source. However, after the intervention, only the ERP signatures of the PCMC-A group demonstrated improved selective attentional processing as a function of the intervention. Along with another study demonstrating that low-SES children may profit from targeted EC interventions (Goldin et al., 2014), these early results hold promise for using interventions to target at-risk children and may serve as precursors to subsequent behavioral effects.

Other types of targeted interventions may be able to counteract the negative consequences of environmental factors such as SES on distinct cognitive processes. In one study by Mackey et al. (2011), low-SES children trained for 8 weeks on either reasoning or processing speed using a battery of commercially available games. After training, reasoning-trained children completed more matrix reasoning problems, and speed-trained children improved significantly on a measure of cognitive speed that requires rapidly translating digits into symbols. Finally, although the reasoning-trained group also showed improved measures of spatial WM span, these gains did not appear to be related to reasoning gains. Taken together, the results suggest that both cognitive processes – reasoning and speed – are separately modifiable by targeted interventions, and that these improvements are seen in a special low-SES population that may need the intervention more than others in order to reduce the achievement gap (Mackey et al., 2011). Although the neural correlates of these interventions have not yet been tested, one possibility is that the interventions may alter the rate of white matter maturation, wherein the degree of white matter development influences processing speed, which might in turn support improved reasoning ability (Ferrer et al., 2013).

In sum, behavioral work is beginning to shed light on how both broad and targeted interventions can positively impact EC abilities in low-SES populations, but more work will be necessary to better understand the neurobiological changes underlying improved EC as well as the time-course for these changes. We will again return to these ideas in Part II.

#### *Attention deficit hyperactivity disorder (ADHD)*

We have described some research showing that children – a population with late-developing EC abilities, partially due to immature neural "hardware" – might benefit from EC interventions that specifically target these abilities, yielding improvement on untrained tasks (at least in the lab). Intervention work may also benefit other groups who demonstrate poor EC skills relative to adults and their non-clinical counterparts, for example, certain clinical populations that are prevalent in a child population (e.g., ADHD). Similar to low-SES groups and in contrast to healthy populations, the impaired EC abilities in this clinical syndrome might make these children prime candidates for EC remediation. For this reason, we turn now to work that has examined this possibility in ADHD.

ADHD affects 3–10% of children in the United States (e.g., Merikangas et al., 2010) and is defined by inattention, impulsivity, and hyperactivity with broad deficits in EC (Barkley, 1997). ADHD, linked to impaired function of the frontal lobes (Castellanos and Proal, 2012), can negatively impact educational achievement, job success, and social well-being (Kessler et al., 2006; Loe and Feldman, 2007). The prevalence and impact of ADHD in children has led researchers to implement cognitive interventions to help this population. For example, an early study used an adaptive (i.e., adjusting for difficulty as the participant's performance improved) and intense (i.e., repeated several times a week for at least 5 weeks) intervention in ADHD children. In addition to improving on the trained WM task, participants significantly improved on an untrained WM task, as well as on Raven's Progressive Matrices (RPM), a non-verbal complex reasoning task (Klingberg et al., 2002). Using Cogmed training, one study demonstrated that across measures of WM, inhibitory control, and complex reasoning, ADHD children who trained on Cogmed outperformed those children completing a control training program (Klingberg et al., 2005). There are also indications that WM training may alter academic performance in these populations (Green et al., 2012), and some of these effects can persist for months after the end of training, suggesting that long-term changes are possible with short, intense training periods.

Despite the initial promising effects of the Cogmed WM training program, the results of more recent studies using Cogmed in children with ADHD have been mixed (Chacko et al., 2013). For example, two studies showed improvements to neuropsychological outcomes and parent-rated ADHD symptoms relative to both wait-list control and placebo treatment conditions (Klingberg et al., 2005; Beck et al., 2010). However, a third study did find improvements to behavioral observation during an academic task but no improvements in parent-rated ADHD symptoms (Green et al., 2012). A fourth study using an active-control group found no group differences between the training and control groups (Gray et al., 2012). Other approaches have used forms of computerized attention training: training on sustained, selective, alternating, and divided attention using visual and auditory stimuli (Shalev et al., 2007). In a separate study, relative to a no-contact control group, the researchers found small to moderate improvements on EC measures of inhibition, planning, comprehension and memory of verbal instructions, and cognitive flexibility. However, these small effects were also accompanied by several transfer measures that did not reveal any significant effects at all, and the evaluators were not blind to each participant's assigned condition (Tamm et al., 2013). Some of these behavioral study limitations and discrepancies may be attributed to the lack of alignment between treatment outcomes and the model of therapeutic benefit, a lack of theory-driven overlap between the training regimen and outcome measures (e.g., in terms of cognitive mechanisms tapped; see Dahlin et al., 2008), questions about the equivalence of the control conditions, and limited examination of individual differences in treatment response (Shah et al., 2012).

In a pair of studies that combined behavioral WM training with neuroimaging in order to uncover neural changes in the ADHD population after an intervention, Hoekzema et al. (2010, 2011) observed functional and structural changes after training ADHD children on tasks tapping WM, cognitive flexibility, attention, planning, and problem solving. Functionally, during inhibition, researchers observed increased activation in orbitofrontal cortex, superior frontal cortex, middle frontal gyrus, and inferior frontal cortex. Performance during attention tasks was associated with increased cerebellar activity (Hoekzema et al., 2010). Structurally, the researchers observed volumetric gray matter increases in bilateral middle frontal cortex and right inferior–posterior cerebellum after training compared to controls. Furthermore, the extent of gray matter volume increase in cerebellum was associated with attentional performance. Interestingly, the regions demonstrating training-related changes are some of the same regions that are typically characterized by volume *reduction* in ADHD patients. If these regions subserve ADHD behavior, then cognitive training might counteract some of the neuroanatomical reductions associated with the disorder and its symptoms (Hoekzema et al., 2011), with cognitive training playing a functionally restorative role that ultimately leads to compensatory increased gray matter volume.

To conclude, although promising, EC intervention work targeting ADHD is still in its early stages and the findings are still too varied to warrant strong conclusions about positive effects. Some of the current behavioral and neurobiological limitations for this line of research might be addressed by broadening the scope and procedures of the training, as well as by embracing an interdisciplinary approach that can better conceptualize and enhance cognitive training in ADHD as a possible therapeutic target (Rutledge et al., 2012). This work may also generally benefit from novel neuroimaging techniques that can more comprehensively assess spatial and temporal brain changes that yield (and inform an interpretation of) behavioral improvements.

# **PART II**

In Part I of this review, we discussed how EC ontogeny and negative factors impacting its development might be mitigated by interventions that target EC abilities, and we described several populations who might be good candidates to focus training efforts because of the educational, economic, and social implications of poor EC. However, we do not yet have a clear understanding of the neurobiological changes that induce behavioral improvements following an EC intervention. Indeed, in some cases, we also lack a clear understanding of the cognitive mechanisms that are trained during a regimen and are common to the outcome measures to effect transfer. The behavioral work that we described previously raises particular questions about the spatial profile and time-course of the neurobiological changes. In other words, it remains untested what brain systems underlie transfer effects in special groups because the neuroimaging component of EC interventions in these populations is still in its relative infancy. Such research is critical in order to (a) validate behavioral effects (e.g., by examining common brain-behavior changes post-intervention) and (b) serve as a precursor to behavioral effects that have not yet emerged. Namely, can structural and/or functional brain activity patterns predict who within a special group is likely to benefit from training? The answer to this question could shed light on some of the mixed findings reviewed earlier.

In theory, meanwhile, studies of EC interventions in healthy populations could inform which neurobiological (and cognitive) mechanisms should be targeted in special groups in hopes of maximizing transfer success. Given some inconclusive findings reviewed in Part I, the time is ripe to consider this issue. It is widely accepted that for routine practice with a training task to confer transfer benefits to an (unpracticed) outcome measure, some underlying cognitive and neural processes must be shared across both tasks (Dahlin et al., 2008). For example, Dahlin et al. (2008) demonstrated that after 5 weeks of memory-updating training, behavioral improvements transferred to a WM 3-back updating task but not to a Stroop task. Critically, both the updating task and the WM task engaged the striatum, whereas the Stroop task—a prefrontal conflict resolution task—did not. These results provide evidence that shared neural substrates underlie process-specificity: the idea that transfer can occur when both training and transfer tasks recruit overlapping processing and neurobiological components.

We discussed earlier that EC comprises multiple components rather than being a unitary construct (Friedman and Miyake, 2004). In view of this, it may be unsurprising that in studies of healthy adults, many EC interventions do not result in widespread transfer (Morrison and Chein, 2010; Hindin and Zelinski, 2012; Redick et al., 2012; Melby-Lervåg and Hulme, 2013; Sprenger et al., 2013; Thompson et al., 2013), perhaps due to a weak link between the components tapped during training and those tapped in the outcome task(s) (Jaeggi et al., 2010). A similar argument could plausibly explain some of the unreliable results described earlier in special groups. A process-specificity framework, however, might afford some traction in the future, particularly in special groups, to better understand the mixed results outlined in Part I. Although not the focus of those studies, the use of neuroimaging methods to evaluate common neurobiological structures and cognitive procedures could inform what range of processes are affected in clinical groups and thus what ought to be the focus of training. Moreover, because the neural mechanisms underlying a complex set of behavioral changes in these select groups are so poorly understood, a process-specific approach could also help to generate testable predictions by providing a candidate set of neural networks on which to focus analyses. This suggestion follows the tradition of lesion-deficit analyses in neuropsychological groups (e.g., aphasics), where mapping specific symptoms associated with a complex syndrome (rather than an entire syndrome itself) onto specific brain structures has been a more fruitful approach (e.g., Dronkers, 1996; Robinson et al., 2005), as attempts to localize multifaceted disorders in the brain have yielded little consistency. Rather, the various symptoms associated with a complex syndrome typically reveal an intricate network of involved regions, and process-specific contributions to a network could provide insight into which parts of a network might change depending on the target of a particular intervention.

Non-invasive brain imaging is a valuable method for examining the neural mechanisms that underlie the observed behavioral changes in EC resulting from intervention. By shedding light on some of the cortical mechanisms involved in the training and transfer tasks (see **Figure 1**), it provides potential explanations for the brain-behavior relations that give rise to transfer benefits. It might also provide information on why transfer does not occur in some circumstances. In what follows, we review some research that has investigated brain activity changes in healthy groups in the context of EC intervention. To preview, although there are intriguing and interpretable trends within any given study, there is a fair amount of inconsistency in the findings across studies, thereby preventing a clear and uniform description of what happens neurobiologically following intervention. This result is somewhat problematic for a broad understanding of process-specificity, as it might apply to EC interventions for the special groups outlined in Part I. We suggest that one issue is that most studies of this sort focus analyses on brain changes in isolation, rather than on dynamic changes in a networked system of brain regions acting in concert. Given this limitation, the current state of the science makes it difficult to forge obvious paths for proceeding with special groups. We therefore conclude with some ideas about other analysis techniques that are designed to evaluate brain-network dynamics and that could ultimately bypass current limitations and offer a better, more comprehensive understanding of the advantages and constraints of EC intervention.

#### **NEUROIMAGING OF INTERVENTION WORK IN HEALTHY POPULATIONS**

Neural changes that accompany behavioral results following training can take a number of forms; we focus here on functional rather than structural changes (Kelly et al., 2006; Buschkuehl et al., 2012). Using fMRI, some researchers have hypothesized that training should increase neural activation magnitude, because this direction of the effect is thought to reflect neural strengthening following practice (i.e., better behavior = more neural recruitment). Some studies have indeed reported increases in the functional activation of brain regions recruited during training (Temple et al., 2003; Shaywitz et al., 2004; Stevens et al., 2008; Hoekzema et al., 2010; Jolles and Crone, 2012). For example, Olesen et al. (2004) found WM-related activity increases in the middle frontal gyrus and parietal cortex after 5 weeks of WM training, replicating some of these results at the single-subject level (Westerberg and Klingberg, 2007). Another study trained young adults on forwards and backwards object span for 6 weeks, which resulted in increased activation in default-mode regions during the forward condition, accompanied by increased activation in the striatum and left ventrolateral PFC in the backward condition (Jolles et al., 2011). Finally, Buschkuehl et al. (2014) found increased perfusion in frontal and occipital regions as a function of a short (one-week) intervention. These activation increases might be attributed to an increase in the size of training-related cortical representations over time, or to a strengthened neural response from brain areas already active pre-training (Pascual-Leone et al., 2005; Kelly et al., 2006).

In contrast, other researchers have hypothesized that training should result in decreases in neural activation magnitude (Qin et al., 2004; Haier et al., 2009; Kucian et al., 2011), which could be attributed to increased neural efficiency that develops over the course of the intervention period (Kelly et al., 2006; Neubauer and Fink, 2009; Nyberg et al., 2009). Using an adaptive WM intervention with the n-back task, Schneiders et al. (2011) found that after training, participants showed decreased activation in the right superior middle frontal gyrus and posterior parietal regions. Another study scanned participants three times over the course of 4 weeks while they practiced the n-back task: although the researchers initially observed increased activity in the intraparietal sulcus and superior parietal lobe midway through their training protocol, these same regions demonstrated decreased activation at the end of their training protocol (Hempel et al., 2004).

Other researchers have suggested that training can result in neural re-distribution, namely, a combination of activation increases and decreases. For example, Dahlin et al. (2008) found increased activity in the striatum after participants trained for 5 weeks on an updating task, which was accompanied by activation decreases in frontal and parietal regions. In the same paper described earlier, Olesen et al. (2004) conducted a second experiment in which they trained young adults on visuospatial WM tasks for 3 weeks. After training, they found increased activation in frontal and parietal regions, as well as the basal ganglia and thalamus; this effect was accompanied by activation decreases in the anterior cingulate, post-central gyrus, and inferior frontal sulcus. In all of these studies, the decreased activation in regions that mediate attentional control might reflect a shift from sustained attention required to perform a novel task toward more automated processing. The increased activation might reflect increased neural strengthening following practice (akin to Hebbian learning principles of neuronal firing).

These mixed results make it difficult to draw definitive conclusions about the impact of training on brain function and call for neuroimaging techniques that can effectively assess dynamic neural changes over time. However, one hypothesis to explain the variation above is that training time (or task proficiency) may drive the neural changes observed after training. Specifically, short periods of training may increase neural activation (reflecting increased effort in learning and adapting to task demands), whereas progressively expert task proficiency in carrying out a particular cognitive function may result in decreased activation (reflecting increased neural efficiency). Emerging techniques in neuroimaging (e.g., network analyses) are attractive methods for testing this hypothesis because changes in network functions may be able to reveal a dynamic interplay among regions that underlie plasticity.
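The training-time hypothesis can be made concrete with a toy model. The sketch below is purely illustrative: the functional form (a transient "effort" term plus a slowly saturating "efficiency" term) and every parameter value are assumptions chosen for demonstration, not estimates from any study reviewed here. It shows only that the sign of a pretest-to-posttest activation change can depend on when the posttest scan is taken.

```python
import math

def activation(week, baseline=1.0):
    """Hypothetical activation level after `week` weeks of training."""
    # Assumed transient effort-related increase that rises early and then fades.
    effort = 1.5 * week * math.exp(-week / 1.5)
    # Assumed efficiency gain that slowly saturates and lowers activation.
    efficiency = 0.4 * (1.0 - math.exp(-week / 4.0))
    return baseline + effort - efficiency

def posttest_change(t2_week):
    """Activation at posttest (T2) minus activation at pretest (T1, week 0)."""
    return activation(t2_week) - activation(0.0)

for week in (1, 3, 8):
    change = posttest_change(week)
    direction = "increase" if change > 0 else "decrease"
    print(f"T2 at week {week}: {direction} ({change:+.2f})")
```

Under these assumed parameters, an early posttest registers an increase, a mid-training posttest registers a smaller increase that is already trending downward, and a late posttest registers a decrease, even though a single underlying trajectory generated all three.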

# **CURRENT AND FUTURE DIRECTIONS**

#### *Connectivity analyses*

Localization studies, like those described in the previous section, are useful in revealing the spatial profile of isolated brain activity of particular regions but may not yield the most comprehensive or consistent picture of neural dynamics. Moreover, it is clear that brain regions form networks for communication rather than acting in isolation. Connectivity analyses assess these network dynamics, and functional connectivity analyses test the correlation of brain activity across regions that are cooperatively recruited by some mental procedure. Specifically, interregional connectivity data reveal the extent to which activity in one brain area covaries with activation in other brain areas. This analysis approach can be particularly informative because it paints a broader picture of neural dynamics (at the network level, and beyond that of activity magnitude changes) that can be altered by an intervention. Another benefit is that regional co-variation can yield insight into both process-specificity (brain areas that co-engage during a particular cognitive function) and domain-generality (brain areas that may co-engage with a specific cognitive procedure during some epoch but with another procedure during another epoch, influenced by task demands; see Fedorenko and Thompson-Schill, 2014). Therefore, network approaches might provide insight into the variable nature of the functional findings described above. For instance, functional connectivity analyses permit researchers to address these questions: What regions of the network change over time, and how do they change? What regions remain activation-stable despite other parts of the network showing activation-variance following intervention? At what point does the initial "ramp up" (reflected by activation increases) that corresponds to the effort associated with the novelty of a training-task procedure become less effortful, more efficient, and automatic (reflected by activation decreases)? Does such "ramp down" occur alongside behavioral improvements on the training task as well as transfer measures? Does it occur together with ramp-up elsewhere in the network, assuming that multiple cognitive procedures are tapped during training? **Figure 2** sketches some hypothetical outcomes of a connectivity approach to training.
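As a minimal sketch of the idea that functional connectivity indexes how activity in one region covaries with activity in another, the following computes pairwise Pearson correlations between regional time series before and after a hypothetical intervention. The region labels and all signal values are invented for illustration, and the preprocessing steps that real analyses require are omitted.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length time series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def connectivity_matrix(roi_series):
    """Correlate every region's time series with every other region's."""
    names = sorted(roi_series)
    return {a: {b: pearson(roi_series[a], roi_series[b]) for b in names}
            for a in names}

# Made-up signals for three regions of interest (arbitrary units).
pre_scan = {
    "PFC": [1, 2, 3, 4, 5, 6],
    "parietal": [2, 1, 4, 3, 6, 5],
    "striatum": [3, 1, 2, 6, 4, 5],
}
post_scan = {
    "PFC": [1, 2, 3, 4, 5, 6],
    "parietal": [1.1, 2.0, 2.9, 4.2, 5.1, 5.9],
    "striatum": [3, 1, 2, 6, 4, 5],
}

pre_fc = connectivity_matrix(pre_scan)
post_fc = connectivity_matrix(post_scan)
change = post_fc["PFC"]["parietal"] - pre_fc["PFC"]["parietal"]
print(f"PFC-parietal connectivity change: {change:+.2f}")
```

Comparing the two matrices entry by entry is one simple way to ask which connections strengthen, weaken, or remain stable across an intervention.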

Further, the connectivity approach allows researchers to consider intrinsic brain activity at rest in addition to task-related activity, and any differences in resting state connectivity after the intervention could suggest generalized effects beyond that of training task performance. Connectivity analyses can additionally converge with task-related analyses to provide meaningful information about neural changes after interventions (Buschkuehl et al., 2014). Such dynamic interplay among regions could reveal important insights into brain plasticity, which localization approaches might inherently miss.

**FIGURE 2 | A hypothetical network trajectory for changes in brain activation as a function of training, where T1 indicates pretest and T2 indicates posttest. (A)** T2 taken during time period 1 (lightest gray) will observe an increase in activation that could reflect neural strengthening; **(B)** T2 taken during time period 2 (dark gray) will also observe an increase in activation that is actually tied to a trending decrease in activation; **(C)** T2 taken during time period 3 (darkest gray) will observe a decrease in activation that could reflect increased neural efficiency. Panels **(A–C)** consider a single cognitive process, but important information about network connectivity and co-variation could emerge by considering multiple cognitive processes in concert with one another. **(D)** Multiple cognitive processes with different time-course trajectories may demonstrate a more complex pattern of network changes that are best detected through approaches that can assess changes in cognitive processes in addition to their relational co-variation and connectivity. Finally, note that these outcomes do not preclude the potential benefit of multiple interim assessment scans throughout training (which, while informative, may not always be feasible for logistical and financial reasons).

A few studies have demonstrated increased functional connectivity following training. After a period of intensive reasoning training (i.e., a Law School Admission Test preparation course), Mackey et al. (2013) found that students who had completed the course showed strengthened fronto-parietal and parietal-striatal connections. Moreover, left rostro-lateral PFC had increased resting-state functional connectivity with parietal regions (both within the left hemisphere and between hemispheres). Increased functional connectivity in terms of efficiency (i.e., the extent to which a region connects with other regions) and degree value (i.e., the number of connections that a region has to other network regions) has also been linked to meditation training (Xue et al., 2011). Specifically, the left ACC – a key regional node in the self-regulatory network whose function may serve cognitive as well as socio-emotional purposes (Kelly et al., 2008) – demonstrated increased connectivity after a period of integrative body-mind (or meditation-type) training. Finally, after training participants on an adaptive verbal processing speed task, Takeuchi and colleagues (Takeuchi et al., 2011) observed increased functional connectivity between the left perisylvian area and regions extending to the lingual and calcarine cortex. This increased connectivity might reflect increased verbal information transfer between the regions, and was correlated with behavioral improvements on the processing speed task.
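The degree and efficiency measures mentioned above can be illustrated on a small toy network. The sketch below thresholds a set of pairwise connectivity values into an undirected graph, counts each node's connections (degree), and computes nodal efficiency as the mean inverse shortest-path length to all other nodes. This is one common graph-theoretic definition and is assumed here for illustration; it is not necessarily the exact computation used by Xue et al. (2011). The threshold, the non-ACC region labels, and all connectivity values are invented.

```python
from collections import deque

def threshold_graph(connectivity, threshold):
    """Keep an undirected edge wherever |r| meets the (assumed) threshold."""
    nodes = sorted({node for pair in connectivity for node in pair})
    graph = {node: set() for node in nodes}
    for (a, b), r in connectivity.items():
        if abs(r) >= threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph

def degree(graph, node):
    """Number of connections the node has to other network regions."""
    return len(graph[node])

def nodal_efficiency(graph, node):
    """Mean inverse shortest-path length from `node` to every other node."""
    dist = {node: 0}
    queue = deque([node])
    while queue:  # breadth-first search over the unweighted graph
        current = queue.popleft()
        for neighbour in graph[current]:
            if neighbour not in dist:
                dist[neighbour] = dist[current] + 1
                queue.append(neighbour)
    others = [n for n in graph if n != node]
    # Unreachable nodes contribute zero to the sum.
    return sum(1.0 / dist[n] for n in others if n in dist) / len(others)

# Invented connectivity values before and after a hypothetical training period.
pre = {("ACC", "PFC"): 0.6, ("ACC", "insula"): 0.2, ("ACC", "striatum"): 0.1,
       ("PFC", "insula"): 0.5, ("PFC", "striatum"): 0.1,
       ("insula", "striatum"): 0.4}
post = dict(pre)
post[("ACC", "insula")] = 0.5
post[("ACC", "striatum")] = 0.45

for label, values in (("pre", pre), ("post", post)):
    g = threshold_graph(values, threshold=0.3)
    print(label, "ACC degree:", degree(g, "ACC"),
          "ACC efficiency:", round(nodal_efficiency(g, "ACC"), 2))
```

In this toy case the ACC gains two suprathreshold connections after training, so both its degree and its nodal efficiency rise.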

Other studies have seen a combination of increases and decreases in connectivity after an intervention (cf. **Figure 2**). For example, in the same study by Jolles et al. (2011) that found increased activation in the striatum and PFC after 6 weeks of verbal WM training, the researchers also observed increased functional connectivity between the rMFG and other regions of a fronto-parietal network, including bilateral superior frontal gyrus, paracingulate gyrus, and ACC. Further, the degree of increased functional connectivity positively correlated with behavioral performance increases. These connectivity increases were accompanied by decreased functional connectivity between the medial PFC and the right posterior middle temporal gyrus. One potential explanation for these effects could be that the connectivity between these regions reflects both reactive task engagement as well as expectation about co-activation in the future (Körding and Wolpert, 2006; Bar, 2007; Raichle, 2010). In a second study, Takeuchi and colleagues (Takeuchi et al., 2013) administered 4 weeks of an adaptive WM training task, finding that WM-trained participants showed increased functional connectivity between mPFC and the precuneus (both regions that are part of the default mode network, or DMN), as well as decreased connectivity between mPFC and right posterior parietal cortex and right lateral PFC (nodes of the executive attention system, or EAS). The authors argue that these results reflect a shift from the EAS network (activated during task engagement) toward the more automated DMN, which tends to be activated in a task-independent manner (Chein and Schneider, 2005).

In sum, studies that examine connectivity changes – and more generally, analyses that consider network activation and covariation – may be able to clarify some of the field's mixed results by providing a broader picture of the neural dynamics that accompany behavioral training and transfer. They may be able to do this because they can assess connectivity measures that account for changes in spatial activation, time-course differences, and network and regional communication and co-activation. In particular, they could shed light on the idea that with practice, EC processes will become more automatic over time, requiring fewer cognitive and neural resources (Chein and Schneider, 2005). For example, one could test this prediction by examining the extent to which connectivity changes in regions that subserve EC components (even if activity decreases) after an intervention, and whether the degree of these changes can be predictive of behavioral performance after training as well. Within a process-specificity framework, one might further predict connectivity changes that are dependent on both training duration and the extent of overlap across cognitive processes (i.e., the shift in distance between distinct cognitive processes – see **Figure 2D**).

#### *Multi-voxel pattern analyses*

Activation increases and decreases, such as those described previously, are usually based on general linear modeling (GLM) analyses of neuroimaging data. Each volumetric unit of the imaged brain (usually termed a "voxel") carries a time-series of information, with approximately 40,000 data-points (i.e., one for each voxel in the brain) collected every few seconds over the course of an experiment. In fMRI, the GLM approach, which is the standard in the field, involves analyzing the information from these voxels to separate stimulus-induced signals from noise. However, this modeling approach comes with a set of assumptions that, when considering neurobiological mechanisms more naturalistically, may become limitations (for a comprehensive review, see Monti, 2011). Specifically, GLM approaches assume that the activity in each voxel of the brain occurs independently from every other voxel in the brain. While the assumption is more likely to be true for voxels that are located far apart from one another, its validity is somewhat more limited when considering two voxels that sit near or adjacent to one another. In these cases, the GLM approach may not fully capture spatially distributed information that goes beyond meaningful signal in individual voxels.
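To illustrate what fitting each voxel independently means in practice, the sketch below fits a separate least-squares line to each of three invented voxel time series, using a simple on/off task regressor. It is a deliberately stripped-down assumption of how a mass-univariate GLM proceeds: hemodynamic response modeling, nuisance regressors, and statistical thresholding are all omitted, and the voxel labels and signal values are made up.

```python
def fit_voxel(regressor, signal):
    """Ordinary least squares with one regressor plus an intercept."""
    n = len(signal)
    mean_x = sum(regressor) / n
    mean_y = sum(signal) / n
    sxx = sum((x - mean_x) ** 2 for x in regressor)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(regressor, signal))
    beta = sxy / sxx               # estimated task effect for this voxel
    intercept = mean_y - beta * mean_x
    return beta, intercept

# 0 = rest, 1 = task, one entry per scan volume (invented design).
task = [0, 0, 1, 1, 0, 0, 1, 1]

# Invented signals; v1 and v2 stand in for adjacent voxels, v3 for a distant one.
voxels = {
    "v1": [10.0, 10.2, 12.1, 11.9, 10.1, 9.9, 12.0, 12.2],
    "v2": [10.1, 10.0, 11.8, 12.0, 9.8, 10.2, 12.1, 11.9],
    "v3": [5.0, 5.2, 4.9, 5.1, 5.0, 4.8, 5.1, 5.0],
}

# Each voxel is modeled on its own; no fit uses any other voxel's data.
betas = {name: fit_voxel(task, series)[0] for name, series in voxels.items()}
for name, beta in betas.items():
    print(f"{name}: task effect = {beta:.3f}")
```

Because each estimate is computed in isolation, the close resemblance between the two neighbouring voxels never enters the model, which is the kind of spatially distributed information the text notes this approach may not fully capture.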

Thus, analysis approaches that consider meaningful patterns of activation, rather than a set of independently activated voxels, might yield informative results with important implications. This kind of "pattern-analysis" approach, often termed multi-voxel pattern analysis (MVPA), considers patterns of activation rather than individual voxels, and therefore carries the additional benefit of not having to rely solely on voxel-by-voxel activation. Moreover, the high spatial frequency information detected by MVPA is conducive to performing within-subject analyses. That is, traditional GLM analyses necessarily average activity across subjects' brains (which vary widely in terms of size, morphology, and location of particular regions), and patterns in one individual may not generalize to others. An MVPA approach, in contrast, affords greater sensitivity toward detecting neural patterns, and thus has the potential to identify information about brain activation patterns within individual subjects and cater to individual differences in these activation patterns. MVPA has primarily been applied in the long-term memory and visual domains. For example, it has been used to predict recall of object categories (Polyn et al., 2005), demonstrate distributed neural representations of objects (Haxby et al., 2001), decode brain states during near-threshold fear detection (Pessoa and Padmala, 2006), distinguish between lexical and syntactic neural information (Fedorenko et al., 2012), and link behavioral and neural measures of conceptual similarity (Weber et al., 2009). To our knowledge, MVPA has not yet been applied to the neuroimaging of EC training studies, but might reveal subject-specific brain states that are predictive of behavioral changes following an intervention.

For example, the degree of similarity between brain activation patterns before and after training may predict the extent of observed behavioral improvement on the training tasks; additionally, similarity between these training patterns and brain patterns associated with outcome measures might be indicative of transfer success (or at least, a precursor to it). Both of these measures could ground MVPA findings within a process-specificity framework by providing a concrete measure of similarity between training and transfer brain activation patterns. Further, by being able to cater to single subjects and account for individual differences that may be masked through a group-level analysis, it may uncover neural mechanisms underlying some of the individual differences in behavioral training and transfer. Thus, its analytic appeal might make it an attractive analysis candidate for future intervention work.
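One simple way to quantify similarity between activation patterns is to correlate the voxel-wise patterns evoked by two tasks within a single subject. The sketch below is a hypothetical illustration of that proposal rather than an established pipeline for EC training data: the five-voxel patterns, the task labels, and the choice of Pearson correlation as the similarity metric are all assumptions.

```python
import math

def pattern_similarity(p, q):
    """Pearson correlation between two voxel-wise activation patterns."""
    n = len(p)
    mean_p, mean_q = sum(p) / n, sum(q) / n
    cov = sum((a - mean_p) * (b - mean_q) for a, b in zip(p, q))
    var_p = sum((a - mean_p) ** 2 for a in p)
    var_q = sum((b - mean_q) ** 2 for b in q)
    return cov / math.sqrt(var_p * var_q)

# Invented single-subject patterns over the same five voxels.
training_pattern = [0.9, 0.1, 0.8, 0.2, 0.7]
transfer_patterns = {
    "task_sharing_processes": [0.8, 0.2, 0.9, 0.1, 0.6],
    "task_not_sharing_processes": [0.2, 0.9, 0.1, 0.8, 0.3],
}

similarity = {name: pattern_similarity(training_pattern, pattern)
              for name, pattern in transfer_patterns.items()}
for name, r in similarity.items():
    print(f"{name}: r = {r:.2f}")
```

Within a process-specificity framework, a higher training-to-outcome similarity would be the candidate marker of likely transfer; whether such a relation holds empirically remains to be tested.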

#### **CONCLUSION AND CLOSING REMARKS**

In this review, we have discussed several examples of populations for which training EC might serve as a useful intervention strategy, as well as how emerging neuroimaging techniques might inform the mixed results from these groups.

Though much work has been devoted to the *behavioral* transfer effects of training, some information on the *neural* transfer effects of training in healthy adults is beginning to emerge. The neuroimaging intervention work in healthy populations, as a result, may be able to inform future work on post-intervention neural changes in select developmental and at-risk populations, although this field is relatively young and thus faces challenges. Further, we have suggested some possible neuroimaging analysis techniques – namely, connectivity of neural networks and multivariate pattern analysis – that might provide additional guidance by examining brain states as well as intra- and inter-regional connectivity patterns before and after training.

We have discussed a few selected groups whose relatively poor EC skills make them prime candidates for EC intervention, but the current results from that work are mixed. A process-specific account – not usually the focus of intervention work in these populations – could be informative in both better understanding when transfer does and does not occur, and helping to guide future neuroimaging work in this field. Emerging neuroimaging approaches – namely, connectivity and MVPA analyses – may also be able to paint a more comprehensive picture of the undoubtedly complex neural profiles in these groups (and their potential plasticity). Finally, additional research on the basic science mechanisms underlying EC training could have important social, educational, and economic implications as it works to guide and inform future training paradigms targeted toward specific populations.

#### **AUTHOR CONTRIBUTIONS**

Nina S. Hsu, Jared M. Novick, and Susanne M. Jaeggi wrote the paper.

# **ACKNOWLEDGMENT**

The authors thank Kevin Byron for helpful comments on an earlier version of this manuscript.



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 21 February 2014; accepted: 01 June 2014; published online: 24 June 2014. Citation: Hsu NS, Novick JM and Jaeggi SM (2014) The development and malleability of executive control abilities. Front. Behav. Neurosci. 8:221. doi: 10.3389/fnbeh.2014.00221*

*This article was submitted to the journal Frontiers in Behavioral Neuroscience.*

*Copyright © 2014 Hsu, Novick and Jaeggi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# "Executive functions" cannot be distinguished from general intelligence: two variations on a single theme within a symphony of latent variance

# *Donald R. Royall<sup>1,2,3,4</sup>\* and Raymond F. Palmer<sup>3</sup>*

*<sup>1</sup> Department of Psychiatry, The University of Texas Health Science Center, San Antonio, San Antonio, TX, USA*

*<sup>2</sup> Department of Medicine, The University of Texas Health Science Center, San Antonio, San Antonio, TX, USA*

*<sup>3</sup> Department of Family and Community Medicine, The University of Texas Health Science Center, San Antonio, San Antonio, TX, USA*

*<sup>4</sup> The South Texas Veterans' Health System Audie L. Murphy Division, Geriatric Research Education and Clinical Center, San Antonio, TX, USA*

#### *Edited by:*

*Lynne Ann Barker, Sheffield Hallam University, UK*

#### *Reviewed by:*

*Paul Richardson, Sheffield Hallam University, UK; David Kevin Johnson, University of Kansas, USA*

#### *\*Correspondence:*

*Donald R. Royall, Division of Aging and Geriatric Psychiatry, The University of Texas Health Science Center at San Antonio, 7703 Floyd Curl Dr., San Antonio, TX 78229-3900, USA e-mail: royall@uthscsa.edu*

The empirical foundation of executive control function (ECF) remains controversial. We have employed structural equation models (SEM) to explicitly distinguish domain-specific variance in executive function (EF) performance from memory (MEM) and shared cognitive performance variance, i.e., Spearman's "g." EF does not survive adjustment for both MEM and g in a well-fitting model of data obtained from non-demented older persons (*N* = 193). Instead, the variance in putative EF measures is attributable only to g, and related to functional status only through a fraction of that construct (i.e., "d"). d is a homolog of the latent variable δ, which we have previously associated specifically with the Default Mode Network (DMN). These findings undermine the validity of EF and its putative association with the frontal lobe. ECF may have no existence independent of general intelligence, and no functionally salient association with the frontal lobe outside of that structure's contribution to the DMN.

**Keywords: aging, cognition, dementia, executive function, functional status, g, intelligence**

# **INTRODUCTION**

Executive Control Function (ECF) is widely thought to be vital to human autonomy, and a major determinant of problem behavior and disability in neuropsychiatric disorders (Royall et al., 2002a). Nevertheless, we lack a "gold standard" ECF measure, and the construct as a whole seems to lack a coherent empirical foundation.

"Executive functions" (EF) broadly encompass cognitive skills that are responsible for the planning, initiation, sequencing, and monitoring of complex goal-directed behavior. This may explain the relatively robust associations between EF measures and Instrumental Activities of Daily Living (IADL) (Royall et al., 2007).

However, the relationship between EF and functional status is more complex. Individual EF measures empirically load on more than one "executive" factor (Miyake et al., 2000; Royall et al., 2003; Androver-Roig et al., 2012; Testa et al., 2012). Neither the EF factors nor their indicators are necessarily associated with IADL. Executive measures are therefore commonly "validated" against structural or functional frontal lobe pathology. However, these associations are statistically weak to moderate, and qualitatively non-specific. Many executive tasks and measures can be associated with non-frontal structures and lesions (Collette and Van der Linden, 2002; Alvarez and Emory, 2006).

Recently, my colleagues and I have examined the "cognitive correlates of functional status" as a latent variable (i.e., "δ" for "dementia") in a Structural Equation Model (SEM) framework (Royall and Palmer, 2012, 2013, 2014; Royall et al., 2012a,b, 2013). δ and its homologs are strongly associated with IADL, more strongly so than are any of their indicators, including EF measures.

δ's design explicitly parses a battery's shared variance (i.e., Spearman's g) into orthogonal fractions (g′ and δ) of which only δ is related to functional status (i.e., δ's "target indicator") (Royall and Palmer, 2012). δ "homologs" can be constructed from any battery that contains both cognitive measures and one or more measures of IADL.

By definition, dementia requires disabling cognitive impairment. Therefore, only δ's variance is both necessary and sufficient to dementia case finding. Thus, δ scores can be interpreted as a dementia phenotype. δ homologs have achieved Areas Under the Receiver Operating Characteristic Curve (AUC/ROC) of 0.92–0.99 for the discrimination of well-characterized Alzheimer's Disease (AD) cases vs. controls in four datasets to date, although each δ homolog accounts for a minority of the variance in observed cognitive performance. The latent variable g′ (δ's residual in Spearman's g) and measurement "error" (including domain specific variance) account for the majority of cognitive variance, yet g′ has an AUC of only 0.52–0.66 (Royall et al., 2012a,b, 2013; Royall and Palmer, 2013, 2014). δ has been independently validated by a second group using the National Alzheimer's Coordinating Center's (NACC) Uniform Data Set (UDS) (Gavett et al., 2014). In that dataset (*N* ≈ 26,000), δ had an AUC of 0.96 for the discrimination between demented and non-demented participants, vs. 0.52 for g′. It is important to note that the NACC dataset is not limited to AD, but includes cases with a variety of dementing illnesses. This supports δ's association with dementia in the abstract, regardless of its etiology.
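For readers less familiar with the AUC statistic quoted above, the following minimal sketch computes it as the probability that a randomly chosen case outscores a randomly chosen control, with ties counted as one half. The function name `auc`, the toy scores, and the convention that higher scores indicate greater impairment are assumptions of this example; they are not the scoring used in the cited studies.

```python
import numpy as np

def auc(scores_cases, scores_controls):
    """Area under the ROC curve via pairwise comparison.

    Returns the fraction of (case, control) pairs in which the case has
    the higher score; tied pairs contribute one half.
    """
    cases = np.asarray(scores_cases, dtype=float)[:, None]
    controls = np.asarray(scores_controls, dtype=float)[None, :]
    wins = (cases > controls).sum() + 0.5 * (cases == controls).sum()
    return float(wins / (cases.size * controls.size))

# Toy factor scores (hypothetical): higher means more impaired.
toy_cases = [3.0, 4.0, 5.0]
toy_controls = [1.0, 2.0, 3.0]
toy_auc = auc(toy_cases, toy_controls)
```

An AUC near 0.5, as reported for g′, means the scores separate cases from controls no better than chance, whereas values near 1.0, as reported for δ, indicate almost complete separation.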

δ and its homologs are derived from Spearman's general intelligence factor, "g" (Spearman, 1904), i.e., a latent variable representing the shared variance in the dominant factor extracted from any cognitive battery. The latent variable g, in turn, has been associated with frontal lobe lesions (Duncan and Owen, 2000; Duncan et al., 2000), executive measures (Duncan et al., 1997), and frontal lobe imaging (Choi et al., 2008; Gläscher et al., 2010). Since "g" can also be associated with functional outcomes (Gottfredson, 1997), we decided to explore whether an EF specific factor can be distinguished from other domain-specific variance (i.e., memory) and/or g-δ. If not, then EF may merely represent g or δ's influence on cognitive task performance, and δ may represent the emergent "ECF" responsible for uniquely human "executive" capacities.

#### **METHODS**

#### **AIR FORCE VILLAGES' FREEDOM HOUSE STUDY**

We have studied 547 well elderly retirees as part of the Air Force Villages' (AFV) Freedom House Study (FHS). The AFV is a 1500-bed Comprehensive Care Retirement Community in San Antonio, TX that is open to Air Force officers and their dependents. At baseline, the FHS subjects represented a random sample of AFV residents over the age of 70 years living at noninstitutionalized levels of care. Informed consent was obtained prior to their evaluations.

A subset of FHS participants (*n* = 193) were administered a formal neuropsychological test battery that included standardized tests of memory, language, and ECF. This subgroup was slightly older at baseline than the larger FHS cohort (mean age of 79.0 years vs. 77.7 years, respectively), but did not differ significantly with regard to gender, education, baseline level of care, or Mini-Mental State Examination (MMSE) scores (Folstein et al., 1975). Select demographic and clinical features are presented in **Table 1**.

# **COGNITIVE BATTERY**

#### *Memory measures*

The California Verbal Learning Test (CVLT) (Delis et al., 1987) assesses learning and memory processes. Patients are asked to learn and recall two 16-item shopping lists. Each list comprises four words from each of four semantic categories. Learning takes place over five trial presentations. We modeled the summed number of correct words recalled across learning trials 1–5.

The Mattis Dementia Rating Scale: memory subscale (DRS:MEM) (Mattis, 1988) provides a brief assessment of verbal and nonverbal short-term memory. The memory subtest consists of sentence (five word) recall, design and word recognition, and orientation items.

#### *"Executive" measures*

CLOX: An Executive Clock Drawing Task (Royall et al., 1998b) is a brief ECF measure based on a clock-drawing task (CDT). It is divided into two parts. CLOX1 is an unprompted task that is sensitive to executive control. CLOX2 is a copied version that is less dependent on executive skills. Each CLOX subtest is scored on a 15-point scale. Lower CLOX scores are impaired.

The Executive Interview (EXIT25) (Royall et al., 1992) provides a standardized clinical EF assessment. It contains 25 items designed to elicit signs of frontal system pathology (e.g., imitation, intrusions, disinhibition, environmental dependency, perseveration, and frontal release). EXIT25 scores range from 0 to 50. High scores indicate impairment.

**Table 1 | Subject characteristics.**

*ADL's, activities of daily living; AODM, adult onset diabetes mellitus; CAD, coronary artery disease; CVA, cerebrovascular disease; IADL's, Instrumental Activities of Daily Living; HTN, hypertension; MAX, maximum; MD, physician.*

The Controlled Oral Word Association (COWA) (Benton and Hamsher, 1989) is a test of oral word production (verbal fluency). The patient is asked to say as many words as they can, beginning with a certain letter of the alphabet.

The WAIS-R Digit Symbol Coding (DSS) (Wechsler, 1991) is a test of psychomotor speed and attentional control. The subject is asked to copy, as quickly as possible, nonsense symbols corresponding to specific numbers presented in a "key" at the top of the page.

The Trail Making Test, Parts A and B (Reitan, 1958) provide a measure of conceptualization, psychomotor speed, and attention. Trails B requires the subject to connect consecutively numbered and lettered circles, alternating between the two sequences.

The abbreviated Wisconsin Card Test (Haaland et al., 1987) is an adaptation of the original two deck (128 cards) Wisconsin Card Sorting Test (WCST) (Heaton et al., 1993). The Abbreviated WCST utilizes one deck of 64 cards. The number of "categories correct" (WCAT) was used as an outcome measure.

Although the above are all widely considered to be validated "executive" measures, they empirically load on at least three factors (Royall et al., 2003).

#### **FUNCTIONAL STATUS**

Disability and comorbid medical conditions were assessed using the Older Adults Resources Scale (OARS) (Fillenbaum, 1978). The OARS is a structured clinical interview that provides self-reported information on activities of daily living (ADL), IADL, physical and mental health history, healthcare utilization, and current medications.

### **STATISTICAL APPROACH**

This analysis was performed using Analysis of Moment Structures (AMOS) software (Arbuckle, 2006). All analyses were conducted in an SEM framework.

#### *Analysis sequence*

First we examined the associations between individual cognitive performance measures and IADL in a multivariate regression model, adjusted for age, education, and gender. The covariates were entered first, and their effect on IADL established. Then the entire set of cognitive performance measures was added as predictors. IADL was used as the dependent variable. Model fit was examined.
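The two-step entry described above (covariates first, then the cognitive battery) can be sketched with ordinary least squares. This is only an illustration of the logic: the analysis reported here was run in AMOS within an SEM framework, and the simulated data, the number of cognitive measures, and the helper `r_squared` are assumptions of this example.

```python
import numpy as np

def r_squared(X, y):
    """Proportion of variance in y explained by an OLS fit on X (with intercept)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    ss_res = resid @ resid
    ss_tot = ((y - y.mean()) ** 2).sum()
    return float(1.0 - ss_res / ss_tot)

# Simulated stand-ins (hypothetical): 193 subjects, 3 covariates
# (age, education, gender) and an arbitrary set of 10 cognitive scores.
rng = np.random.default_rng(1)
n = 193
covariates = rng.normal(size=(n, 3))
cognitive = rng.normal(size=(n, 10))
iadl = 0.3 * covariates[:, 0] + 0.4 * cognitive[:, 0] + rng.normal(size=n)

# Step 1: covariates only. Step 2: covariates plus cognitive measures.
r2_step1 = r_squared(covariates, iadl)
r2_step2 = r_squared(np.column_stack([covariates, cognitive]), iadl)

# Variance in IADL explained by the cognitive measures over and above
# the covariates.
delta_r2 = r2_step2 - r2_step1
```

The increment `delta_r2` isolates what the cognitive battery adds once demographic effects are accounted for, which is the purpose of entering the covariates first.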

Next, we reorganized the observed variables as a confirmatory bifactor measurement model, testing our a priori assumptions about which measures can be associated with domain specific "memory" and "executive" factors (i.e., "MEM" and "ECF," respectively). All indicators were adjusted for age, education, and gender. The relative correlations between both latent constructs and IADL were determined. Model fit was again examined.

Next, we introduced a third latent construct representing Spearman's general intelligence factor "g." The entire battery of psychometric measures was used as g's indicators. We examined the effect of g's introduction on the latent domain specific factors and their indicator weights. As before, all indicators were also adjusted for age, education, and gender. The relative correlations between g, the domain specific latent constructs and IADL were determined. Model fit was again examined.

Next, we reorganized g into IADL-related and independent fractions (i.e., "d" and "g′," respectively), as previously described (e.g., Royall and Palmer, 2013). By definition, g′ had no association with IADL. The relative correlations between d, the domain specific latent constructs and IADL were determined. Model fit was again examined.
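One way to write down the structure of this final model, as we read it from the description in this section and in the Results, is given below. The notation is ours and is an assumption of this sketch: $y_j$ is the $j$-th cognitive measure, $g'$ denotes the IADL-independent residual of Spearman's g, the $\lambda$ terms are factor loadings, and the $\varepsilon$ terms are residuals. Covariate adjustment for age, education, and gender is omitted for brevity.

$$
\begin{aligned}
y_j &= \lambda_{j,d}\, d + \lambda_{j,g'}\, g' + \lambda_{j,\mathrm{MEM}}\, \mathrm{MEM} + \varepsilon_j, \qquad \lambda_{j,\mathrm{MEM}} = 0 \ \text{for non-memory measures},\\
\mathrm{IADL} &= \lambda_{\mathrm{IADL},d}\, d + \varepsilon_{\mathrm{IADL}},\\
\operatorname{Cov}(d, g') &= \operatorname{Cov}(d, \mathrm{MEM}) = \operatorname{Cov}(g', \mathrm{MEM}) = 0 .
\end{aligned}
$$

Because IADL loads only on $d$ and the three factors are orthogonal, any association between a cognitive measure and IADL must pass either through $d$ or through a correlation between residuals.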

Next, we searched for additional measure specific associations between individual cognitive measures and IADL, independent of the latent constructs. Finally, we systematically explored the possibility of significant intercorrelations amongst the indicator variables' residuals, which might suggest the existence of additional latent constructs other than g′, d, MEM, and EF. Only intercorrelations between two indicators' residuals that were statistically significant, improved model fit, and did not result in negative variance or other model misspecifications were retained.

#### *Missing data*

These models were all constructed in an SEM framework, using raw data. Modern Missing Data Methods were automatically applied by the AMOS software. AMOS uses Full Information Maximum Likelihood (FIML) methods to address missing data. FIML uses the entire observed data matrix to estimate parameters with missing data. In contrast to listwise or pairwise deletion, FIML yields unbiased parameter estimates, preserves the overall power of the analysis, and is arguably superior to alternative methods, e.g., multiple imputation (Schafer and Graham, 2002; Graham, 2009).
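The core idea of FIML can be sketched as a casewise log-likelihood in which each subject contributes only through the variables actually observed for that subject. The sketch below assumes multivariate normality and takes the mean vector and covariance matrix as given; in an SEM these would be the model-implied moments, and an optimizer would maximize this quantity. It is not a description of AMOS internals, and the name `fiml_loglik` and the toy data are hypothetical.

```python
import numpy as np

def fiml_loglik(data, mu, sigma):
    """Sum of casewise multivariate-normal log-likelihoods.

    Missing values are coded as NaN. For each row, only the observed
    variables and the matching sub-block of sigma are used.
    """
    data = np.asarray(data, dtype=float)
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    total = 0.0
    for row in data:
        obs = ~np.isnan(row)
        k = int(obs.sum())
        if k == 0:
            continue  # a fully missing case carries no information
        diff = row[obs] - mu[obs]
        sub = sigma[np.ix_(obs, obs)]
        _, logdet = np.linalg.slogdet(sub)
        maha = diff @ np.linalg.solve(sub, diff)
        total += -0.5 * (k * np.log(2.0 * np.pi) + logdet + maha)
    return float(total)

# Toy example (hypothetical): three cases, two variables, one missing score.
toy = np.array([[0.5, 1.0],
                [0.0, np.nan],
                [-0.3, 0.2]])
toy_ll = fiml_loglik(toy, mu=np.zeros(2), sigma=np.eye(2))
```

Unlike listwise deletion, the second case is not discarded: it still contributes through its one observed score.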

#### *Fit indices*

Model fit was assessed using four common test statistics: chi-square, the ratio of the chi-square to the degrees of freedom in the model (CMIN/DF), the comparative fit index (CFI), and the root mean square error of approximation (RMSEA). Where two nested models were compared, the Browne–Cudeck Criterion (BCC) was added (Browne and Cudeck, 1989).

A non-significant chi-square signifies that the data are consistent with the model (Bollen and Long, 1993). However, in large samples, this metric is limited by its tendency to achieve statistical significance when all other fit indices (which are not sensitive to sample size) show that the model fits the data very well. A CMIN/DF ratio <5.0 suggests an adequate fit to the data (Wheaton et al., 1977). The CFI statistic compares the specified model with a null model (Bentler, 1990). CFI values range from 0 to 1.0. Values below 0.95 suggest model misspecification. Values approaching 1.0 indicate adequate to excellent fit. An RMSEA of 0.05 or less indicates a close ("good") fit to the data, and values up to 0.08 an "acceptable" fit (Browne and Cudeck, 1993). A lower BCC statistic indicates better fit (Browne and Cudeck, 1989). All fit statistics should be simultaneously considered when assessing the adequacy of the models to the data.
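To make these indices concrete, the sketch below computes CMIN/DF, CFI, and RMSEA from a model chi-square and a null-model chi-square using their common textbook definitions. The input values are invented for illustration and are not taken from Table 3. The RMSEA here uses N − 1 in the denominator; some programs use a slightly different sample-size term, so small discrepancies from software output are possible.

```python
import math

def fit_indices(chi2, df, chi2_null, df_null, n):
    """Common SEM fit indices from model and null-model chi-square values."""
    cmin_df = chi2 / df
    # Non-centrality estimates, floored at zero.
    d_model = max(chi2 - df, 0.0)
    d_null = max(chi2_null - df_null, 0.0)
    denom = max(d_model, d_null)
    cfi = 1.0 if denom == 0.0 else 1.0 - d_model / denom
    rmsea = math.sqrt(d_model / (df * (n - 1)))
    return {"CMIN/DF": cmin_df, "CFI": cfi, "RMSEA": rmsea}

# Hypothetical values for a sample of 193 subjects.
example = fit_indices(chi2=60.0, df=40, chi2_null=1000.0, df_null=55, n=193)

# Apply the cut-offs quoted in the text.
adequate_ratio = example["CMIN/DF"] < 5.0
cfi_ok = example["CFI"] >= 0.95
rmsea_acceptable = example["RMSEA"] <= 0.08
```

Checking all three flags together mirrors the recommendation that fit statistics be considered simultaneously rather than one at a time.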

# **RESULTS**

Sample demographics are presented in **Table 1**. Clinical assessment means are presented in **Table 2**. Model 1's fit was poor (**Table 3**). Together, the cognitive performance measures and covariates explained 24.1% of variance in IADL. Age, gender, DSS (*r* = 0.224, *p* = 0.001), DRS:MEM (*r* = 0.158, *p* = 0.02), and EXIT25 (partial *r* = −0.145, *p* < 0.001) contributed significantly to IADL, similar to previous analyses in this cohort (Royall et al., 2000, 2004, 2005a,b) (**Figure 1**).

Model 2 posits two domain specific factors, MEM and EF (**Figure 2**). The fit of this model is significantly improved relative to Model 1 (**Table 3**). CVLT:Short, CVLT:Long, CVLT 1–5, and DRS MEM all load significantly on MEM (all *p* < 0.001). The strengths of their loadings ranged from *r* = 0.52 (DRS:MEM) to *r* = 0.90 (CVLT:Long). CLOX1, DSS, EXIT25, Trails B, and WCAT all load significantly on EF (all *p* = 0.002). The strengths of their loadings ranged from *r* = −0.25 (Trails B) to 0.66 (DSS). MEM and EF were uncorrelated. As expected, EF was significantly associated with IADL (*r* = 0.34, *p* < 0.001). MEM was weakly but significantly correlated with IADL independent of EF (*r* = 0.17, *p* = 0.02).

*COWA, Controlled Oral Word Association Test; CLOX1, clock drawing to command; CVLT, California Verbal Learning Test, List A delayed recall; DRS: MEM, Mattis Dementia Rating Scale Memory subscale; DSS, WAIS-R Digit Symbol Substitution; EXIT25, Executive Interview; Trails B, Trail-making Test Part B; WCAT, Wisconsin Card Sorting Test categories achieved.*

**Table 3 | Model fit.**

*BCC, Browne–Cudeck Criterion; CFI, Comparative Fit Index; RMSEA, Root Mean Square Error of Approximation.*

Model 3 posited the addition of a third factor, Spearman's g. Our first attempt at a three factor model failed (due to unsuccessful minimization and negative variance). Minimization could be achieved by correlating EF and MEM but (1) the correlation between MEM and EF was not significant (*r* = −0.04, *p* = 0.923), (2) negative variance persisted on COWA's residual, (3) EF lost its association with IADL (*r* = −0.05, *p* = 0.924), (4) EF had no significant indicators (all *p* > 0.92).

Two alternative two factor models were then tested. Model 3a omitted the factor MEM (**Table 3**). Model 3b omitted the factor EF. In each case, these models containing g fit the data better than Models 1 or 2. In each case, the latent variable g had a stronger correlation with IADL than did the second domain specific factor. In Model 3a, g fully mediated EF's previously significant association with IADL in Model 2. However, Model 3b fit the data significantly better than did Model 3a. On the basis of these findings, the latent factor EF was deleted from subsequent models.

In the adopted Model 3b (**Figure 3**), g was indicated significantly by all the cognitive measures (all *p* ≤ 0.002) ranging from Trails B (*r* = −0.23, *p* = 0.002) to DSS (*r* = 0.66, *p* < 0.001). MEM's factor loadings were slightly attenuated by g's creation, ranging from *r* = 0.23 (DRS:MEM) to *r* = 0.70 (CVLT:Long). g was significantly correlated with IADL (*r* = 0.40, *p* < 0.001). MEM had no significant association with that variable (*r* = 0.09, *p* = 0.261). Thus, g both mediates MEM's unadjusted association with IADL and better fits the variance in our putative ECF measures than would an EF domain-specific factor, whether adjusted for g or not.

Model 4 parses Spearman's g into two fractions (**Figure 4**). d is indicated by IADL and the cognitive performance measures. g′ (i.e., d's residual in Spearman's g) and MEM are indicated only by cognitive performance measures. This arrangement had excellent fit, and fit the data significantly better than any previous model (**Table 3**). d was significantly indicated by all the cognitive measures except WCAT (*r* = 0.10, *p* = 0.30) and Trails B (*r* = 0.05, *p* = 0.63). WCAT and Trails B loaded significantly on g′ (both *p* ≤ 0.002) as did all the other cognitive measures, ranging from CLOX1 (*r* = −0.27) to COWA (*r* = −0.62, both *p* < 0.001).

However, by definition, g′ had no association with IADL. In contrast, d was associated strongly with IADL (*r* = 0.65, *p* < 0.001). Independently of their associations with d, no cognitive performance measure was significantly associated with IADL, i.e., through their residuals. Thus, WCAT and Trails B had no significant associations with IADL at all.


There were no significant intercorrelations amongst the residuals of the final three latent constructs' indicators in Model 4. Specifically, none of the ECF measures' residuals were significantly correlated. This finding closes the door to the possibility of one or more unmodeled factors, including EF or processing speed. Since the modeled factors explain a minority of the variance in most ECF measures (**Figure 4**), their uncorrelated residuals may reflect measure specific "measurement error." By definition, the three latent variables d, g′, and MEM were orthogonal to each other and could not be intercorrelated.

# **DISCUSSION**

In this analysis, we have confirmed the relatively strong association between the EXIT25 and IADL in a multivariate regression model. The EXIT25 contributed significantly to IADL independent of memory measures and a battery of other EF measures. This is consistent with several previous studies in a wide range of samples (Chan et al., 2006; Lewis and Miller, 2007; Pereira et al., 2008), including this one (Royall et al., 1998a, 2000, 2004, 2005a,b).

Together, cognitive measures and covariates explained a respectable fraction of IADL variance. However, the model did not fit the data well. The SEM approach forces our attention to the quality of a model's fit, not merely the significance of its parameters and the total variance explained in its dependent variable. In every case, the introduction of latent variables fit the data better than did our initial multivariate regression approach.

Model 2 has confirmed our a priori assumptions about the domain specific face validity of our cognitive battery. All the memory measures loaded significantly on the latent construct "MEM." All the executive measures loaded significantly on the latent construct "EF." These factors were not significantly associated with each other. As we expected, EF was more strongly associated with IADL than MEM, which was weakly associated with that construct.

However, subsequent models with better fit have forced us to abandon the EF construct. The introduction of g and d provides a much better fit, and the absence of significant intercorrelations among their indicators' residuals closes the door to the possibility of unmodeled alternative factors (e.g., processing speed).

Models 3b and 4 suggest that EF measures have no association with IADL independent of general intelligence and specifically its subfraction δ. Model 4 demonstrates that EF measures have no special or unique association with IADL, even through d. d is also indicated by memory tasks, and they load more strongly on d than any executive measure.

Independent of d, EF measures cannot be associated with IADL, either individually (through their residuals), or via g′ (by definition). WCAT and Trails B load only on g′ and thus have no association with IADL at all. This is consistent with their failure to contribute significantly to IADL independently of the EXIT25 and other EF measures in Model 1.

Both findings also replicate our earlier factor analysis in this dataset (Royall et al., 2003). In that analysis, the variance in a battery of EF measures was empirically distributed across three factors. The first (28% of variance) was indicated by CLOX, COWA, DSS, and the EXIT25. The second (24.2% of variance) was uniquely indicated by the WCST and its subtasks. The third (12.4% of variance) was indicated uniquely by Trails B. Only the first factor was associated with IADL. The fact that d and g′ explain so little of the variance in our battery of otherwise non-correlated measures suggests that each EF measure may have considerable "measurement error" associated with it.

Duncan and others have previously associated g with frontal structure and function (Duncan and Owen, 2000; Duncan et al., 2000; Choi et al., 2008; Gläscher et al., 2010). Similarly, several of our EF measures have been associated with frontal structure and/or function (Royall et al., 2007; Royall, 2011). However, Model 4 demonstrates that the variance in our EF indicators is distributed across two orthogonal latent factors, d and g′. Neither is specifically associated with EF, as both are also significantly indicated by memory tests. It is an empirical question which, if either, latent construct can mediate g's observed association with frontal structure and/or function. Our dataset cannot address that question.

Because g (Model 3) and g′ and d (Model 4) have been adjusted for memory-specific task performance (i.e., MEM), it could be argued that the loadings of memory tasks on the first three latent constructs reflect the "executive" fraction of those measures' variance (e.g., "Working Memory"). Working Memory has been related to "updating" and can be associated with measures of intelligence (Friedman et al., 2008).

However, only d is associated with IADL. Working Memory has previously been associated with IADL (Lewis and Miller, 2007) and d is more strongly indicated by memory tasks than by executive ones. Moreover, d and g′ are orthogonal to each other. Thus, they cannot both be "executive," and if g′ were to be identified as the true executive factor (after all, it is most strongly loaded by COWA and the only factor associated with Trails B and WCAT) then EF can again have no impact on IADL.

**FIGURE 4 | Model 4∗.** COWA, Controlled Oral Word Association Test; CLOX1, clock drawing to command; CVLT, California Verbal Learning Test; 1–5, summed learning trials 1–5; SHORT, immediate recall; LNG, delayed recall; DRS: MEM, Mattis Dementia Rating Scale Memory subscale; DSS, WAIS-R Digit Symbol Substitution; EDU, Education; EXIT25, Executive Interview; IADL, Instrumental Activities of Daily Living; TrailsB, Trail-making Test Part B; WCAT, Wisconsin Card Sorting Test categories achieved. ∗All observed indicators are adjusted for age, education, and gender (paths not shown).

d uniquely accounts for a sizable fraction of IADL's variance, and explains more variance in IADL than did the ECF factor in Model 2, or indeed the entire battery in Model 1. d is a homolog of δ, our latent dementia proxy. δ and its homologs are strongly and specifically associated with clinical dementia status, as measured by the Clinical Dementia Rating Scale (Hughes et al., 1982; Royall et al., 2012a,b; Royall and Palmer, 2013, 2014). Even in this nondemented cohort, the interindividual variance in δ scores predicts longitudinal change in ECF measures, specifically the EXIT25 and Trails B (but neither WCAT nor the CVLT) (Royall and Palmer, 2012). The fact that δ predicts longitudinal change in Trails B in this very cohort suggests that Trails B's failure to load on d in this analysis may be an artifact of its baseline distribution, which is skewed and subject to floor effects. In longitudinal analyses, each subject is its own control.

In contrast to Spearman's g, δ has been associated with atrophy in the Default Mode Network (DMN) (Royall et al., 2012b, 2013). The DMN is associated with a subregion of the frontal lobe (i.e., a small portion of the dorsolateral prefrontal cortex), but also with subregions of the temporal lobe, the parietal lobe, the cingulate gyrus and the hippocampus (Buckner et al., 2008). The latter may explain the relatively strong loadings of memory measures on d. Thus, it seems unlikely that d would localize to the frontal cortex, as might be expected of an "executive" construct (although specific frontal localizations have in fact not been shown for many executive measures).

In short, the associations between the EF factor (or its indicators) and IADL are mediated uniquely through d, i.e., a fraction of Spearman's g. This result is similar to an analysis by Salthouse et al. (1996) of age's influence on cognitive task performance. They found moderately strong age-related declines on a battery of tests that included the WCST, Trails-B, and DSS, among others. However, correlation-based analyses revealed that the age-related effects on different measures were not independent. Instead, the effect of age was observed specifically in the fraction of variance (averaging 58%) shared across all the observed measures (i.e., "g"). Thus, g and δ may also mediate age-specific effects on ECF measures. This would explain age's broad effects on cognitive performance, relatively strong effects on "ECF" measures, and the disabling character of those effects (if mediated through δ).

On the other hand, aging is also characterized by a "dedifferentiation" of cognitive test performance (McArdle et al., 2002). This may favor the demonstration of global constructs such as g, g′, and δ. It remains to be seen if a δ homolog would mediate the association(s) between one or more EF factors and IADL in healthy younger adults. One potential obstacle to such a study would be selection of a valid IADL measure. The informant-rated IADL measure we used here may have floor effects in highly functioning populations. Nevertheless, δ is not very sensitive to its IADL target, and has similar psychometric properties regardless of the target IADL indicators used to date (Royall and Palmer, 2012, 2013, 2014; Royall et al., 2012a,b; Gavett et al., 2014).

Our dataset is further limited by other issues. It does not contain measures of supposedly fundamental executive tasks (i.e., inhibition, categorization, and updating) (Miyake et al., 2000). Such measures are arguably less prone to measurement error than the "complex" ECF measures we have employed. They, and other executive tasks (e.g., set-shifting and delayed matching to sample) have been associated with frontal lobe lesions and structures. However, such low level cognitive abilities (which can be demonstrated in chimpanzees for example, at an estimated three year old human intelligence equivalent) (Moriguchi et al., 2011) may not be representative of the emergent ECF that characterizes adult human action.

It is arguable that δ cannot be demonstrated in any animal that is incapable of IADL (by definition). This may have a biological explanation. The human brain, uniquely among primates, exhibits frontal networks that extend beyond the frontal lobe (including the DMN). The frontal networks of other primates are localized to that structure (Wey et al., 2013). Frontal tasks not related to IADL, and/or demonstrable in animals incapable of IADL, are arguably not "executive" but merely "frontal." They may be associated with δ in humans, but then so might any cognitive performance measure, whether executive or non-executive, and whether localizable to the frontal lobe or not. Regardless, their demonstration and functional localization to frontal structures in animals incapable of IADL will not be associated with δ, by definition.

Friedman et al. (2008) have demonstrated the existence of a latent "Common" EF factor that is indicated by all basic EF measures, as are g, g′, and δ/d. Friedman et al. distinguished their Common factor from both intelligence and processing speed. However, they did not try to associate their Common factor with non-executive indicators, and so its specificity to EF is undemonstrated, as is its association with IADL, and therefore δ.

Ironically, the Common factor's independence from intelligence suggests that it may indeed be more likely to correspond to d in this analysis than to g′, as g would be expected to correlate more strongly with observed performance on intelligence measures. Friedman et al. also observed that a theorized "Inhibition" factor collapsed after the Common factor's introduction. That is consistent with EF's collapse in our analysis after the introduction of g.

Second, our battery is limited in its ability to assess "processing speed." Trails B is our only timed test, although some authors associate performance on the DSS with this construct. This limits our ability to speak to processing speed as a determinant of IADL. However, such a factor is unlikely to attenuate δ's association with IADL because processing speed is an intermediate "domain-specific" factor (like MEM and EF in this analysis) and thus taps a compartment of variance in cognitive performance that is orthogonal to g (and therefore to both g′ and δ). Had our battery been better designed to assess processing speed, we expect it would have robbed MEM of its relatively weak association with IADL rather than d.

Finally, this analysis is limited to cross-sectional data. At baseline, the FHS cohort was cognitively normal for its age, relatively highly functioning and non-institutionalized. Few subjects can be expected to have been clinically demented, although a sizeable fraction might have had "mild" neurocognitive disorders. Thus, restricted range and floor effects on some measures may have affected our analysis.

Clinical dementia status was never formally adjudicated in this cohort. Nevertheless, we have demonstrated that there is significant variability with regard to the cohort's longitudinal rates of change in cognitive performance over time (Royall et al., 2005a). These changes are clearly related to concurrent declines in functional status (Royall et al., 2004), suggesting aging-related declines in δ-specific variance. In fact, we have shown those associations to be mediated through δ (Royall and Palmer, 2012). Gavett et al. (2014) report that the six-year prospective longitudinal change in δ scores (δ) correlates strongly (*r* = −0.94, *p* < 0.001) with change in dementia severity, as rated by the Clinical Dementia Rating scale (CDR) (Hughes et al., 1982). Similarly, in the Texas Alzheimer's Research and Care Consortium (TARCC), δ's intercept and slope explain 79% of the variance in four-year prospective dementia severity, independently of baseline dementia severity, g and g′ [Palmer and Royall (ICAAD abstract), 2013]. If ECF (as distinct from EF) is synonymous with δ then it likely is the major cognitive determinant of dementia status in humans and dementia, in turn, may be limited to structural and functional pathologies of the DMN (Royall et al., 2002b, 2012b).

In summary, we have used a latent variable approach in an SEM framework to construct a well fitting model that suggests that the variance in a battery of well validated "executive" measures cannot be related to a domain specific "executive" factor independent of Spearman's general intelligence factor, g. Moreover, no cognitive performance measure in our battery can be associated with IADL independently of a certain fraction of that latent construct, i.e., d. d, as a δ homolog, is likely to be associated specifically with the structure and function of the DMN. That network extends well beyond the frontal lobe, and can be related only to certain subregions in that structure. This underscores the importance of disentangling "EF" from "frontal function" (Royall et al., 2002a).

Although we again confirm that memory-specific variance has no association with IADL (and by extension with dementia), memory performance measures do contribute significantly to g (as should all cognitive performance measures) and its subparts: g′ and d. Only their contributions to d would be salient to functional outcomes and dementia. However, memory tasks are more strongly associated with that construct than were most "ECF" measures. It is the distribution of memory task performance across three latent constructs, two of which are irrelevant to IADL and dementia case finding, that weakens their performance as predictors of IADL. In contrast, a larger share, if not the majority, of variance in most putative "ECF" measures (but neither Trails B nor WCAT) is invested in δ. This explains the relatively strong associations between putative "ECF" measures, IADL and dementia status in past studies. Regardless, δ homologs should have even greater potential for dementia case-finding, although they are neither indicated solely by ECF measures, nor likely to localize specifically to the frontal lobe.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 23 January 2014; accepted: 06 October 2014; published online: 24 October 2014.*

*Citation: Royall DR and Palmer RF (2014) "Executive functions" cannot be distinguished from general intelligence: two variations on a single theme within a symphony of latent variance. Front. Behav. Neurosci. 8:369. doi: 10.3389/fnbeh.2014.00369 This article was submitted to the journal Frontiers in Behavioral Neuroscience.*

*Copyright © 2014 Royall and Palmer. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# The relationship between executive functions and fluid intelligence in schizophrenia

# *María Roca<sup>1,2,3</sup>\*, Facundo Manes<sup>1,3</sup>, Marcelo Cetkovich<sup>1,3</sup>, Diana Bruno<sup>1</sup>, Agustín Ibáñez<sup>1,2,4</sup>, Teresa Torralva<sup>1,3</sup> and John Duncan<sup>5,6</sup>*

*<sup>1</sup> Neuropsychology Research Department, Institute of Cognitive Neurology (INECO), Buenos Aires, Argentina*

*<sup>2</sup> Laboratory of Cognitive and Social Neuroscience (LaNCyS), UDP-INECO, Foundation Core on Neuroscience (UIFCoN), Diego Portales University, Santiago, Chile*

*<sup>3</sup> Neuropsychology Department, Institute of Neurosciences Favaloro University, Buenos Aires, Argentina*

*<sup>4</sup> Universidad Autónoma del Caribe, Barranquilla, Colombia*

*<sup>5</sup> MRC Cognition and Brain Sciences Unit, Cambridge, UK*

*<sup>6</sup> Department of Experimental Psychology, University of Oxford, Oxford, UK*

#### *Edited by:*

*Lynne A. Barker, Sheffield Hallam University, UK*

#### *Reviewed by:*

*Francoise Schenk, Université de Lausanne, Switzerland Alicia Izquierdo, University of California, Los Angeles, USA*

#### *\*Correspondence:*

*María Roca, Neuropsychology Research Department, Institute of Cognitive Neurology (INECO), Pacheco de Melo 1854, Buenos Aires C1126AAB, Argentina e-mail: mroca@ineco.org.ar*

An enduring question is unity vs. separability of executive deficits resulting from impaired frontal lobe function. In previous studies, we have asked how executive deficits link to a conventional measure of fluid intelligence, obtained either by standard tests of novel problem-solving, or by averaging performance in a battery of novel tasks. For some classical executive tasks, such as the Wisconsin Card Sorting Test (WCST), Verbal Fluency, and Trail Making Test B (TMTB), frontal deficits are entirely explained by fluid intelligence. However, on a second set of executive tasks, including tests of multitasking and decision making, deficits exceed those predicted by fluid intelligence loss. In this paper we discuss how these results shed light on the diverse clinical phenomenology observed in frontal dysfunction, and present new data on a group of 15 schizophrenic patients and 14 controls. Subjects were assessed with a range of executive tests and with a general cognitive battery used to derive a measure of fluid intelligence. Group performance was compared and fluid intelligence was introduced as a covariate. In line with our previous results, significant patient-control differences in classical executive tests were removed when fluid intelligence was introduced as a covariate. However, for tests of multitasking and decision making, deficits remained. We relate our findings to those of previous factor analytic studies describing a single principal component, which accounts for much of the variance of schizophrenic patients' cognitive performance. We propose that this general factor reflects low fluid intelligence capacity, which accounts for much but not all cognitive impairment in this patient group. Partialling out the general effects of fluid intelligence, we propose, may clarify the role of additional, more specific cognitive impairments in conditions such as schizophrenia.

#### **Keywords: executive function, fluid intelligence, schizophrenia, frontal lobe, multitasking, decision making**

# **INTRODUCTION**

Although the efforts of multiple disciplines have brought substantial advances in the understanding of frontal lobe functioning, much remains unclear about how this brain region participates in the organization of effective behavior. In general, prefrontal cortex (PFC) is supposed to participate in distributed brain circuits underlying "executive" functions, broadly conceived as processes that organize and control cognitive activity. Often, however, theoretical frameworks of frontal and executive functions give little detailed explanation for the wide variety of deficits produced by frontal lobe lesions. Furthermore, advances in cognitive neuroscience have rarely translated into improved clinical analysis and management of pathologies affecting frontal lobe functions.

In this paper we discuss previous results coming from basic and clinical neuroscience in order to shed light on the diverse clinical phenomenology observed in frontal dysfunction. Additionally, we present new data on a group of schizophrenic patients, aiming to demonstrate how basic cognitive neuroscience and clinical neuropsychology can converge to explain the broad cognitive deficit observed in this population. In particular, our results cast light on the balance between global cognitive deficit and specific functional impairments.

Commonly, different regions of the PFC are supposed to participate in different kinds of executive function. One important framework, for example, proposes functions such as planning, monitoring, energizing, switching, and inhibition (Stuss et al., 2002; Stuss, 2007). Though some kind of functional specialization in different regions of PFC seems certain, its detailed nature remains elusive. In recent studies addressing a variety of frontal lobe pathologies, we have proposed a framework that combines elements of broad cognitive impairment—important in many different tasks and contexts—with additional, more specific deficits (Roca et al., 2010, 2012, 2013).

One motivation for our proposal comes from functional imaging. In some frontal regions, similar activity is seen while performing very diverse cognitive tasks (Cabeza and Nyberg, 1997, 2000; Duncan and Owen, 2000). This finding of activity in many different tasks suggests what we have called a multiple-demand or MD system, important in organization of many kinds of behavior (Duncan, 2005, 2010). MD activity is commonly seen in circumscribed regions of lateral frontal cortex, dorsomedial frontal cortex, and anterior insula, with accompanying similar activity around the intraparietal sulcus (Fedorenko et al., 2012). A hint of why such extensive brain regions could be activated in so many different kinds of tasks comes from single cell electrophysiology studies in monkeys. Neurons of lateral PFC adapt their properties to task context, coding the specific information required in current behavior (Rao et al., 1997; Freedman et al., 2001). In this way they may represent a general neural resource, adapting to contribute to many different kinds of task. Elsewhere we have suggested that the core function of the MD system is to organize complex cognition into a structured series of attentional episodes, assembling the selected contents of each episode, and managing transitions from one episode to the next (Duncan, 2013).

Closely related to these findings is the role of frontal functions in general intelligence or Spearman's *g*. Based on the universal positive correlations typically found between different cognitive tests, Spearman (1904, 1927) proposed that a general or *g* factor contributes to all cognitive activities. On this theory, one way to measure *g* is simply to average performance across a diverse set of tests (Spearman, 1927), the approach taken in traditional IQ tests such as the WAIS (Wechsler, 1997). Another is to find single tests that, on their own, correlate strongly with the average across multiple tests. The best single tests, often called tests of "fluid intelligence," call for novel problem-solving using visual, verbal or other materials (e.g., Raven et al., 1988). Common activity in MD regions for many kinds of behavior suggests a plausible basis for *g*, and in line with this, frontal lesions impair performance on classic fluid intelligence tests (Duncan et al., 1995). This may be especially so when lesions affect MD regions (Woolgar et al., 2010). Strong MD activity is also seen while performing fluid intelligence tests (Prabhakaran et al., 1997; Esposito et al., 1999; Duncan et al., 2000; Bishop et al., 2008). In line with our proposals concerning MD function, a salient requirement of fluid intelligence tests is that complex problems must be divided into simpler parts, calling for a novel structure of attentional episodes (Duncan, 2013).

For some years, our group has been investigating the clinical relevance of these findings in different frontal pathologies. To this end, we have investigated fluid intelligence tests and executive deficits in several clinical conditions, including frontal lobe lesions (Roca et al., 2010), Parkinson's Disease (Roca et al., 2012), and Frontotemporal Dementia (Roca et al., 2013). In particular, we have asked how much fluid intelligence loss contributes to deficits in classical executive tasks. Consistently, our results (Roca et al., 2010, 2012, 2013) have shown the same picture. For some classical executive tasks, such as the Wisconsin Card Sorting Test (WCST; Nelson, 1976), Verbal Fluency (Benton and Hamsher, 1976), and Part B of the Trail Making Test (TMTB; Partington and Leiter, 1949), executive deficits are entirely explained by fluid intelligence. Once a measure of fluid intelligence is partialled out, no differences between patients and normal controls remain (for an exception in the case of Verbal Fluency see Robinson et al., 2012). These results suggest that frontal deficits in such tasks are associated with general cognitive processes rather than the specific content of individual tests. Other executive tests, however, show a different picture, with clinical deficits not explained by fluid intelligence. Most consistently falling in this group are multitasking tests, which assess the ability to select and maintain higher order internal goals while other sub-goals are being performed, and tests of social cognition. In these tasks, our data suggest that deficits are related to damage outside the classical MD regions, in particular the most anterior part of the frontal cortex, known as the frontal pole or APFC (Roca et al., 2010, 2011). Intriguingly, we have found mixed results with a further test, the Iowa Gambling Task (IGT) of probabilistic decision-making. Previous data suggest a multicomponent test, with prominent contributions from ventromedial frontal cortex but also other frontal areas (Bechara et al., 2000; Manes et al., 2002), and we found deficits beyond those explained by fluid intelligence only for pathology with strong ventromedial involvement (Roca et al., 2013).

Based on these findings, we propose a new way to dissociate executive impairments in frontal lobe pathology. While damage to MD regions affects fluid intelligence, impairing performance on many classical executive tests, other regions of damage produce deficits that go beyond those explained by fluid intelligence and that cannot be captured by classical executive tests. Our approach bears on one of the oldest and most central problems in clinical neurology. Some patients with frontal lobe dysfunction present obvious cognitive and behavioral deficits, even if they perform remarkably well in many neuropsychological examinations. This so-called "frontal lobe mystery" has been at the center of clinical neuropsychology debates and has been particularly described in patients with damage involving APFC regions (Burgess, 2000).

Here, we apply this framework to the question of global cognitive deficit vs. specific executive impairments in schizophrenia. As in various psychiatric conditions, cognitive deficits in schizophrenia have been commonly explained in terms of frontal dysfunction (e.g., Lewis and González-Burgos, 2008; Lewis, 2012), and in anatomical terms, a range of frontal abnormalities have been described (Ellison-Wright et al., 2008; Haukvik et al., 2013). Importantly, some recent factor analytic studies have described a single principal component that accounts for much of the variance of patients' cognitive performance (e.g., Dickinson et al., 2006, 2008; Keefe et al., 2006; Harvey et al., 2011, 2013). For example, Dickinson et al. (2008) used structural equation modeling to show that approximately 63% of the schizophrenia diagnosis-related variance in cognitive performance was mediated through a single general factor. Using a very large sample size (*n* = 1493), Keefe et al. (2006) compared the efficacy of various competing models in their ability to account for the structure of cognitive deficits in schizophrenia. Their exploratory principal component analysis showed that a unifactorial structure, which accounted for 45% of the variance of patients' cognitive performance, was the best fitting model to describe patients' cognitive functioning. Deficits in fluid intelligence tests have also been reported in this population (e.g., Caspi et al., 2003; Zanello et al., 2006; Johnson et al., 2013). Accordingly, a general "*g* like" deficit seems to explain much of the cognitive deficit in this patient group.

In parallel to this factor analytic work, neuropsychological studies of schizophrenia show deficits in a range of executive tests, including the WCST, multitasking and tests of social cognition (e.g., Liddle and Morris, 1991; Frith et al., 1992; Thoma and Daum, 2005; Thoma et al., 2007; Kim et al., 2009; Egan et al., 2011; Banno et al., 2012; Cochrane et al., 2012; Erol et al., 2012; Baez et al., 2013; Fond et al., 2013; Sánchez-Torres et al., 2013). Again there is evidence for some link to fluid intelligence. For example, as part of a broader study aimed at describing the pattern of relationships between measures of cognitive performance and symptom subtypes in schizotypy and schizophrenia, Cochrane et al. (2012) examined both fluid intelligence (measured using the Matrices test of the Wechsler Intelligence Scale; Wechsler, 1997) and Verbal Fluency. Low fluency scores were associated with the negative factor of a clinical interview schedule (SANS), and this association was reduced when fluid intelligence was introduced as a covariate. Also, in structural equation models, impairments in executive functions and in fluid intelligence were associated with the same latent factor in schizophrenia patients, as well as in their first-degree relatives and in patients with other psychiatric disorders (Ibáñez et al., 2013).

Here we test the prediction that a broad, "*g* like" cognitive decline accounts for some but not all of the executive impairment in schizophrenia. As in our previous work, we ask how strongly executive deficits remain once fluid intelligence is introduced as a covariate. We predict that, for some classical executive tests, impairments in schizophrenia will be fully explained by reduced fluid intelligence. For other tests—here including multitasking and probabilistic decision-making—we predict that deficits will remain even once the effects of fluid intelligence are removed.

# **MATERIALS AND METHODS**

#### **PARTICIPANTS**

Fifteen patients with a diagnosis of schizophrenia according to DSM-IV criteria were recruited for a broader ongoing study of schizophrenia at the Institute of Cognitive Neurology (INECO). The Positive and Negative Syndrome Scale (PANSS) was used for measuring symptom severity. All patients gave informed consent and underwent a detailed examination of their psychiatric and neuropsychological profile, supported by EEG, SPECT, and MRI as needed for diagnosis. All patients were receiving antipsychotic treatment and showed stable psychotic symptoms for a period of at least 8 weeks, over which no change in medication dose or type was indicated. Mean (*SD*) age for the patient population was 36.7 (8.6). Mean PANSS total score was 68.0 (16.9).

Fourteen healthy controls were randomly recruited from a larger pool of volunteers who had neither a history of abuse of recreational drugs nor a family history of neurodegenerative or psychiatric disorders. Mean (*SD*) age for the control group was 42.6 (14.7) years. All participants in the study were examined to assure they had no comorbidity with other psychiatric or neurological disorders.

# **NEUROPSYCHOLOGICAL ASSESSMENT**

#### *Wisconsin Card Sorting Test (Nelson, 1976)*

For the WCST we used Nelson's modified version of the standard procedure. Cards varying on three basic features—color, shape, and number of items—must be sorted according to each feature in turn. The participant's first sorting choice becomes the correct feature, and once a criterion of six consecutive correct sorts is achieved, the subject is told that the rules have changed, and cards must be sorted according to a new feature. After all three features have been used as sorting criteria, subjects must cycle through them once again in the same order as they did before. Each time the feature is changed, the next must be discovered by trial and error. Score was total number of categories achieved before completing a maximum of 48 cards.
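The scoring rule described above can be sketched as follows. This is a minimal illustration, not the authors' scoring software: the function name, the input format (one boolean per sorted card), and the cap of six categories (three features, cycled twice) are assumptions made for the example.

```python
def wcst_categories(correct_flags, criterion=6, max_cards=48, max_categories=6):
    """Count categories achieved under a Nelson-style rule.

    correct_flags: one boolean per card, True if the sort was correct
    (hypothetical input format). A category is achieved each time
    `criterion` consecutive correct sorts are reached; the run then resets
    because the sorting rule changes. The cap of six categories is an
    assumption derived from cycling through three features twice.
    """
    categories = 0
    run = 0
    for flag in correct_flags[:max_cards]:
        run = run + 1 if flag else 0
        if run == criterion:
            categories += 1
            run = 0
            if categories == max_categories:
                break
    return categories


# Hypothetical session: five correct sorts, one error, then six correct sorts.
example_session = [True] * 5 + [False] + [True] * 6
print(wcst_categories(example_session))
```

In this hypothetical session the error resets the run of correct sorts, so only the final six consecutive correct sorts count toward a category.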

# *Verbal Fluency (Benton and Hamsher, 1976)*

In Verbal Fluency tasks, the subject generates as many items as possible from a given category. We used the standard Argentinean phonemic version (Butman et al., 2000), asking subjects to generate words beginning with the letter P in a one-minute block. Score was the total number of correct words generated.

# *Trail Making Test B (Partington and Leiter, 1949)*

The Trail Making Test consists of two parts. In the present study part B was administered (TMTB). In this test the subject is required to draw lines sequentially connecting 13 numbers and 12 letters distributed on a sheet of paper. Letters and numbers are encircled and must be connected alternately in increasing/alphabetical order (i.e., 1, A, 2, B, 3, C, etc.). Score was the total time (s) required to complete the task, given a negative sign so that high scores meant better performance.

# *Hotel Task (Manly et al., 2002; Torralva et al., 2009)*

The task comprised five primary activities related to running a hotel (compiling bills, sorting coins for a charity collection, looking up telephone numbers, sorting conference labels, proofreading). The materials needed to perform these activities were arranged on a desk, along with a clock that could be consulted by removing and then replacing a cover. Subjects were told to try at least some of all five activities during a 15 min period, so that, at the end of this period, they would be able to give an estimate of how long each task would take to complete. It was explained that time was not available to actually complete the tasks; the goal instead was to ensure that every task was sampled. Subjects were also asked to remember to open and close the hotel garage doors at specified times (open at 6 min, close at 12 min), using a coloured button. Of the several scores possible for this task, we used time allocation: for each primary task we assumed an optimal allocation of 3 min, and measured the summed total deviation (in seconds) from this optimum. Total deviation was given a negative sign so that high scores meant better performance.
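A minimal sketch of the time-allocation score may help make the sign convention concrete. It assumes that "deviation" means the absolute difference, in seconds, between the time spent on each activity and the 3 min optimum; the function name and the example times are hypothetical.

```python
OPTIMAL_SECONDS = 3 * 60  # optimal allocation per primary activity (3 min)


def hotel_time_allocation_score(seconds_per_task):
    """Return the negated summed deviation from the 3 min optimum.

    seconds_per_task: time spent on each of the five primary activities.
    Using the absolute deviation is an assumption of this sketch.
    """
    if len(seconds_per_task) != 5:
        raise ValueError("expected times for exactly five primary activities")
    total_deviation = sum(abs(t - OPTIMAL_SECONDS) for t in seconds_per_task)
    return -total_deviation  # negative sign: higher score = better allocation


# Hypothetical participant: seconds spent on each of the five activities.
example_times = [180, 240, 120, 200, 160]
hotel_score = hotel_time_allocation_score(example_times)
print(hotel_score)
```

Under this convention a perfectly even allocation scores 0, and any departure from it yields a negative score.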

# *Iowa Gambling Task (Bechara et al., 2000)*

In the IGT, subjects are required to pick cards from four decks and receive rewards and punishments (winning and losing abstract money) depending on the deck chosen. Two "risky" decks yield greater immediate wins but very significant occasional losses. The other two "conservative" decks yield smaller wins but negligible losses that result in net profit over time. Subjects make a series of selections from these four available options, from a starting point of complete uncertainty. Reward and punishment information acquired on a trial by trial basis must be used to guide behavior toward a financially successful strategy. Normal subjects increasingly choose conservative decks over the 100 trials of the task. Our score was the total number of conservative minus risky choices.
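The net score can be sketched in a few lines. The deck labels (A and B as risky, C and D as conservative) follow the common IGT convention and are an assumption here, as are the function name and the example choice sequence.

```python
RISKY_DECKS = {"A", "B"}          # assumed labels for the two risky decks
CONSERVATIVE_DECKS = {"C", "D"}   # assumed labels for the two conservative decks


def igt_net_score(choices):
    """Return the number of conservative choices minus risky choices."""
    conservative = sum(1 for c in choices if c in CONSERVATIVE_DECKS)
    risky = sum(1 for c in choices if c in RISKY_DECKS)
    if conservative + risky != len(choices):
        raise ValueError("unknown deck label in choices")
    return conservative - risky


# Hypothetical (shortened) choice sequence; the real task has 100 trials.
example_choices = ["A", "B", "C", "D", "C", "D", "C", "A", "D", "C"]
igt_score = igt_net_score(example_choices)
print(igt_score)
```

A positive score indicates a preference for the conservative decks, and a negative score a preference for the risky ones.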

### *General Test Battery (GTB)*

All participants were also assessed with a general test battery used to derive a measure of fluid intelligence. As noted above, fluid intelligence can be measured either using a standard psychometric test such as Raven's Matrices (Raven et al., 1988) or simply by averaging performance on a diverse battery of novel tasks; in practice, these two approaches give largely similar results, and we used the latter method here. The general test battery included Forward Digit Span (Wechsler, 1997), the Rey Auditory Verbal Learning Test (Rey, 1941), the Rey Complex Figure Test (Rey, 1941), and Trail Making Test A (Partington and Leiter, 1949). For this set of tests, principal component analysis produced a first component accounting for 55.61% of the total variance. Loadings on this component were moderate to high for all tests (range = 0.45– 0.88). The general or *g* score for each participant (*g*GTB) was defined as the score on this first principal component.
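To illustrate how a first-principal-component score of this kind can be derived, the sketch below standardizes each test, builds the correlation matrix, and extracts the first component by power iteration. The data are hypothetical, and the use of the correlation matrix and of power iteration are assumptions of this example; the source does not state which software or settings were used.

```python
import math

# Hypothetical scores for six participants (rows) on four tests (columns):
# digit span, verbal learning, figure recall, Trails A time (negated so
# that higher is better). Illustrative numbers only, not study data.
scores = [
    [4, 35, 18, -45],
    [5, 42, 24, -38],
    [6, 48, 27, -30],
    [5, 39, 20, -41],
    [7, 52, 31, -27],
    [6, 45, 25, -35],
]


def standardize(rows):
    """Convert each column to z-scores (sample SD, n - 1 denominator)."""
    n, k = len(rows), len(rows[0])
    z = [[0.0] * k for _ in range(n)]
    for j in range(k):
        col = [row[j] for row in rows]
        mean = sum(col) / n
        sd = math.sqrt(sum((x - mean) ** 2 for x in col) / (n - 1))
        for i in range(n):
            z[i][j] = (rows[i][j] - mean) / sd
    return z


def correlation_matrix(z):
    """Correlation matrix of already standardized columns."""
    n, k = len(z), len(z[0])
    return [[sum(z[i][a] * z[i][b] for i in range(n)) / (n - 1)
             for b in range(k)] for a in range(k)]


def first_principal_component(corr, iterations=1000):
    """Power iteration: return (unit eigenvector, largest eigenvalue)."""
    k = len(corr)
    v = [1.0 / math.sqrt(k)] * k
    for _ in range(iterations):
        w = [sum(corr[a][b] * v[b] for b in range(k)) for a in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[a] * sum(corr[a][b] * v[b] for b in range(k))
                     for a in range(k))
    return v, eigenvalue


z = standardize(scores)
corr = correlation_matrix(z)
weights, eigenvalue = first_principal_component(corr)
explained = eigenvalue / len(corr)  # proportion of total variance explained
loadings = [w * math.sqrt(eigenvalue) for w in weights]
g_scores = [sum(w * x for w, x in zip(weights, row)) for row in z]
print(round(explained, 3), [round(g, 2) for g in g_scores])
```

Here `explained` plays the role of the reported percentage of variance, `loadings` the role of the reported test loadings, and `g_scores` the role of the per-participant *g*GTB values.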

# **RESULTS**

Groups were matched for age and premorbid IQ as measured with the Word Accentuation Test (Burin et al., 2000). For all tests, the mean scores of each group are shown in **Table 1**. For all cognitive tasks, two-tailed *t*-tests were used to compare patients and controls. As expected, the schizophrenic group was significantly impaired on all tests, including the classical executive tests [WCST: *t*(27) = −3.31, *p* < 0.01; Verbal Fluency: *t*(27) = −2.86, *p* < 0.01; TMTB: *t*(27) = 2.48, *p* = 0.02] and the tests of multitasking and decision making [Hotel: *t*(27) = 3.34, *p* < 0.01; IGT: *t*(27) = −3.47, *p* < 0.01]. Significant group differences were also found for *g*GTB [*t*(27) = −4.33, *p* < 0.01].
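The 27 degrees of freedom reported for each comparison are what a pooled-variance two-sample *t*-test gives for 15 patients and 14 controls (15 + 14 - 2 = 27). The sketch below shows that computation on hypothetical scores; that the authors used the pooled-variance form is an inference from the degrees of freedom, not something the text states.

```python
import math


def pooled_t_test(group_a, group_b):
    """Return (t statistic, degrees of freedom) for a pooled-variance t-test."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a = sum(group_a) / n_a
    mean_b = sum(group_b) / n_b
    ss_a = sum((x - mean_a) ** 2 for x in group_a)
    ss_b = sum((x - mean_b) ** 2 for x in group_b)
    df = n_a + n_b - 2
    pooled_variance = (ss_a + ss_b) / df
    standard_error = math.sqrt(pooled_variance * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / standard_error, df


# Hypothetical test scores (not study data): 15 patients and 14 controls.
patients = [3, 4, 2, 5, 3, 4, 3, 2, 4, 5, 3, 4, 2, 3, 4]
controls = [5, 6, 4, 5, 6, 5, 4, 6, 5, 5, 6, 4, 5, 6]

t_stat, df = pooled_t_test(patients, controls)
print(round(t_stat, 2), df)
```

With patients entered first, a negative *t* indicates lower patient scores.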

Scatterplots relating *g*GTB to the three classical executive tests are shown in **Figure 1**. For all classical executive tests, scores were heavily dependent on *g*GTB, and once this influence was removed by ANCOVA, group differences were no longer significant (**Table 1**). Regression lines in **Figure 1** come from the standard ANCOVA model, reflecting the average within-group association of the two variables and constrained to have the same slope across groups. As calculated from the corresponding variance terms of the ANCOVA, average within-group correlations with *g*GTB were 0.44 for WCST, 0.51 for Verbal Fluency, and 0.68 for TMTB.
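The common-slope ANCOVA described above can be sketched as follows: sums of squares and cross-products are computed within each group, pooled across groups, and then used to obtain the shared slope, the average within-group correlation, and covariate-adjusted group means. The function names and the toy data are hypothetical. The data are built so that the outcome depends only on the covariate, which makes the raw group difference vanish after adjustment.

```python
import math


def within_group_sums(x, y):
    """Return means and within-group sums of squares and cross-products."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return mean_x, mean_y, sxx, syy, sxy


def ancova_common_slope(groups):
    """groups: dict mapping group name -> (covariate values, outcome values).

    Returns the pooled within-group slope, the pooled within-group
    correlation, and each group's outcome mean adjusted to the grand
    covariate mean.
    """
    stats = {name: within_group_sums(x, y) for name, (x, y) in groups.items()}
    sxx = sum(s[2] for s in stats.values())
    syy = sum(s[3] for s in stats.values())
    sxy = sum(s[4] for s in stats.values())
    slope = sxy / sxx
    r_within = sxy / math.sqrt(sxx * syy)
    all_x = [a for x, _ in groups.values() for a in x]
    grand_mean_x = sum(all_x) / len(all_x)
    adjusted = {name: s[1] - slope * (s[0] - grand_mean_x)
                for name, s in stats.items()}
    return slope, r_within, adjusted


# Toy data (not study data): outcome = 2 * covariate + 1 in both groups,
# so the groups differ on the outcome only because they differ on the covariate.
toy_groups = {
    "patients": ([-2, -1, 0], [-3, -1, 1]),
    "controls": ([0, 1, 2], [1, 3, 5]),
}
slope, r_within, adjusted_means = ancova_common_slope(toy_groups)
print(slope, r_within, adjusted_means)
```

In the toy data the raw outcome means differ (-1 vs. 3) while the adjusted means coincide, which is the pattern reported for the classical executive tests once *g*GTB is removed.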

Scatterplots relating *g*GTB to the other executive tests are shown in **Figure 2**. For these tests results were very different. Scores were barely related to *g*GTB, with average within-group correlations of 0.05 for the Hotel Task and 0.11 for the IGT. In both cases using ANCOVA to remove the influence of *g*GTB left significant group differences (**Table 1**).

# **DISCUSSION**

Though it seems certain that the frontal lobes contribute to multiple cognitive functions, it remains unclear how these functions should be separated and defined. Recently, we have proposed a novel parcellation, based on the role of fluid intelligence. For one set of executive tests, deficits in a variety of neurological and neuropsychiatric conditions are entirely explained by loss of fluid intelligence. For these tests, once the effects of fluid intelligence are partialled out, no clinical deficit remains. These deficits, we propose, are explained by damage to the distributed frontoparietal MD system, important in constructing any cognitive activity from a structured series of attentional episodes. For other tests, in contrast, deficits remain even after effects of fluid intelligence are removed. For one group of tests, including tests of multitasking and social cognition, deficits may relate to impaired function in the APFC. For the IGT, deficits not explained by fluid intelligence may reflect the value-based decision-making functions of ventromedial frontal cortex (Bechara et al., 2000). More broadly, we propose that removing the effects of fluid intelligence may clarify relations between other more specific executive impairments and frontal regions outside the MD system.

Here, we apply this novel parcellation of frontal lobe functions to the case of schizophrenia. Consistent with our results in multiple neuropsychiatric conditions with frontal involvement (Roca et al., 2010, 2012, 2013), fluid intelligence proves to be a substantial contributor to many executive deficits in this disease. For some widely used executive tasks, including the WCST, Verbal Fluency, and TMTB, patients' deficits are entirely explained by individual fluid intelligence scores. Once fluid intelligence is partialled out, no differences between patients and normal controls remain. For a second set of executive tasks, however, deficits persist even after fluid intelligence is statistically controlled. In the present data, such results were seen for tests of multitasking (Hotel Task) and decision-making (IGT).

*gGTB, g score for each participant derived from the general cognitive battery; WAT, Word Accentuation Test; WCST, Wisconsin Card Sorting Test; TMTB, Trail Making Test part B; IGT, Iowa Gambling Task. Statistically significant differences (p < 0.05) are marked in bold.*

Our results are consistent with investigations suggesting that a single factor accounts for much of the cognitive impairment in schizophrenia (Dickinson et al., 2004; Keefe et al., 2006; Dickinson and Harvey, 2009; Dickson et al., 2011). Following our previous findings, we propose that this general factor links closely to standard measures of fluid intelligence, suggesting impaired function in the distributed, frontoparietal MD system. Importantly, our results show that this is only a part of the picture. Some other deficits observed in this population exceed this general deficit, arguing in favor of specific functional impairments that are not so closely related to *g*. In this investigation, we show that the multitasking and decision making deficits evidenced in schizophrenic patients represent specific cognitive deficits, rather than general ones.

Our multitasking results have important implications for the overall functional architecture of cognitive control in schizophrenia. Neuropsychological (Roca et al., 2011) and neuroimaging studies (e.g., Burgess et al., 2007; Badre and D'Esposito, 2009) have linked multitasking to the intact functioning of the APFC. In this regard, our results are consistent with recent studies proposing that, in schizophrenic patients, cognitive control follows a rostro-caudal organization within the PFC (Barbalat et al., 2011), with the APFC at the top of a frontal processing hierarchy (Koechlin et al., 2003; Badre and D'Esposito, 2007).

Decision making deficits have also been described in schizophrenia (e.g., Kim et al., 2009; Fond et al., 2013). Classically, impairments on the IGT have been related to ventromedial frontal damage (Bechara et al., 2000), but deficits also follow lesions in other frontal regions, suggesting a multicomponent test (Manes et al., 2002). In patients with focal frontal lesions, rarely extending to ventromedial cortex, we found IGT deficits to be fully explained by fluid intelligence (Roca et al., 2010). In Frontotemporal Dementia, a disease with a strong ventromedial component, the IGT fell into the alternative group of tests, with deficits that remained even once effects of fluid intelligence were removed (Roca et al., 2013). The present results suggest that, in schizophrenia also, decision making deficits represent a specific rather than a general impairment, possibly reflecting ventromedial PFC dysfunction.

The finding that both general and specific deficits occur in schizophrenia, with several classical executive deficits falling in the first group and multitasking and decision-making deficits falling in the second, clarifies the nature of cognitive deficits in this pathology. Consistent with the widespread pathology of this disease, the associated cognitive deficits seem to represent a multicomponent cognitive dysfunction, but with a strong *g*-like element.

We believe that our results raise serious methodological issues related to testing procedures used in patients with schizophrenia. Both in clinical practice and research with this patient group, many different tests are used as measures of frontal dysfunction. Deficits in the WCST (e.g., Egan et al., 2011; Banno et al., 2012; Sánchez-Torres et al., 2013), Verbal Fluency (e.g., Liddle and Morris, 1991; Frith et al., 1992; Cochrane et al., 2012), and TMTB (e.g., Chan et al., 2006; Erol et al., 2012; Sánchez-Torres et al., 2013) have been consistently reported. Impairments in such tasks are often interpreted with close reference to specific test content. In the WCST, for example, deficits are ascribed to impaired switching of cognitive set (Milner, 1963), while in verbal fluency, the frontal impairment (Benton, 1968) is commonly interpreted as a failure in spontaneous generation of new search strategies. However, in the present study we show that deficits in these tests are entirely explained by a loss of fluid intelligence. For these tests, it seems likely that specific test content is unimportant for interpreting the schizophrenic deficit. Instead, deficits reflect a broad cognitive disorganization affecting many different kinds of complex cognition. Equally significant are the implications for identifying deficits that go beyond those explained by fluid intelligence, including deficits in multitasking and probabilistic decision-making. Given the importance of fluid intelligence in tasks of many kinds, removing its effects statistically may be important in allowing other, more specific deficits to be uncovered.

Our results have important clinical implications, both for the use of appropriate assessment tools and for the implementation of adequate rehabilitation strategies in schizophrenia. In our view, an optimal neuropsychological assessment in this population should include a fluid intelligence test in order to capture the general factor affected in the disease. Though the present results are limited by the particular group of executive tests employed, the role of fluid intelligence in predicting executive deficits is likely to be widespread. A recent study, for example, shows that a conventional intelligence test explains most or all frontal deficits in the Delis Kaplan Executive Function System (D-KEFS; Barbey et al., 2013). A fluid intelligence test must be supplemented, however, by tasks that are able to detect additional deficits, including multitasking and decision making tasks. From a rehabilitation perspective, it can be inferred that strategies targeting fluid intelligence deficits (e.g., Jaeggi et al., 2008; Jaušovec and Jaušovec, 2012) should be separated from strategies targeting specific multitasking and decision-making deficits (e.g., Manly et al., 2002). Further investigations should explore the differential impact of fluid intelligence and other frontal deficits in patients' daily living and quality of life.

In the present paper we propose an approach that can explain, at least in part, some of the variety of deficits associated with frontal functions. The broad approach addresses clinical findings in diverse pathologies with frontal dysfunction and, in particular, we demonstrated how this model could explain previous results coming from factor analytic studies in schizophrenia. We believe that our approach represents a step forward toward the required trans-disciplinary unification of theory and practice in the investigation of frontal lobe function. Here, we show how results coming from basic and clinical neuropsychology can converge to address neurological and neuropsychiatric conditions related to frontal functioning, particularly in schizophrenia.

# **AUTHOR CONTRIBUTIONS**

All authors participated in the writing of the manuscript. María Roca designed the study, completed the research, and led the writing of the manuscript. Diana Bruno, Teresa Torralva, Marcelo Cetkovich, and Agustín Ibáñez supervised data collection and helped with the writing. Facundo Manes and John Duncan supervised the design and writing of the manuscript.

# **ACKNOWLEDGMENTS**

This work was supported by the Medical Research Council (UK) intramural program MC-A060-5PQ10 and by grants CONICYT/FONDECYT Regular (1130920), FONCyT-PICT 2012-0412, FONCyT-PICT 2012-1309, CONICET, and the INECO Foundation.

### **REFERENCES**


Spearman, C. (1927). *The Abilities of Man*. New York, NY: Macmillan.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

#### *Received: 28 October 2013; paper pending published: 29 November 2013; accepted: 30 January 2014; published online: 24 February 2014.*

*Citation: Roca M, Manes F, Cetkovich M, Bruno D, Ibáñez A, Torralva T and Duncan J (2014) The relationship between executive functions and fluid intelligence in schizophrenia. Front. Behav. Neurosci. 8:46. doi: 10.3389/fnbeh.2014.00046 This article was submitted to the journal Frontiers in Behavioral Neuroscience.*

*Copyright © 2014 Roca, Manes, Cetkovich, Bruno, Ibáñez, Torralva and Duncan. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Investigating the role of executive attentional control to self-harm in a non-clinical cohort with borderline personality features

# **Jennifer Drabble \*, David P. Bowles and Lynne Ann Barker**

Brain, Behavior and Cognition Group, Faculty of Development and Society, Sheffield Hallam University, Sheffield, UK

#### **Edited by:**

Carmen Sandi, École Polytechnique Fédérale de Lausanne, Switzerland

#### **Reviewed by:**

Frauke Nees, Central Institute of Mental Health, Germany Nader Perroud, Hôpitaux Universitaires de Genève, Switzerland

#### **\*Correspondence:**

Jennifer Drabble, Brain, Behavior and Cognition Group, Faculty of Development and Society, Sheffield Hallam University, Collegiate Crescent Campus, Sheffield S10 2BP, UK e-mail: j.drabble@shu.ac.uk

Self-injurious behavior (or self-harm) is a frequently reported maladaptive behavior in the general population and a key feature of borderline personality disorder (BPD). Poor affect regulation is strongly linked to a propensity to self-harm, is a core component of BPD, and is linked with reduced attentional control abilities. The idea that attentional control difficulties may provide a link between BPD, negative affect and self-harm has yet to be established, however. The present study explored the putative relationship between levels of BPD features, three aspects of attentional/executive control, affect, and self-harm history in a sample of 340 non-clinical participants recruited online from self-harm forums and social networking sites. Analyses showed that self-reported levels of BPD features and attentional focusing predicted self-harm incidence, and high attentional focusing increased the likelihood of a prior self-harm history in those with high BPD features. Ability to shift attention was associated with a reduced likelihood of self-harm, suggesting that good attentional switching ability may provide a protective buffer against self-harm behavior for some individuals. These attentional control differences mediated the association between negative affect and self-harm, but the relationship between BPD and self-harm appears independent.

**Keywords: executive control, attention, self-harm, borderline personality disorder**

#### **INTRODUCTION**

Self-harm, the intentional injuring of one's own body tissue, is a core feature of Borderline Personality Disorder (BPD) and may also be seen in a diverse range of psychiatric disorders (Briere and Gil, 1998). Self-harm is thought to have a general population prevalence of around 4%, rising to 21% in clinical populations (Briere and Gil, 1998), and 89% in individuals diagnosed with BPD (Zanarini et al., 2008). Estimates show that there are 140,000–170,000 admissions to UK hospitals for self-inflicted injury per year (Hawton et al., 2007), and self-harm constitutes one of the commonest reasons for hospital admission (Weston, 2003). While the exact role of self-harm in the maintenance or attempted management of psychiatric symptoms remains to be established, it may represent a maladaptive form of affect regulation (see Klonsky, 2009, for a review).

Self-harm is one of several key diagnostic criteria for BPD, together with frantic efforts to avoid real or imagined abandonment, unstable interpersonal relationships, impulsivity, suicidality, identity disturbance and marked inappropriate anger (Lieb et al., 2004; American Psychiatric Association, 2013). BPD affects between 1.2 and 6% of the general population (Crowell et al., 2009), and around 10–20% of psychiatric populations, a relatively large proportion of the total number of individuals referred to psychiatric services (Lieb et al., 2004). The elevated risk of hospitalization for self-harm among individuals with BPD is considerable: Sansone et al. (2005) found that BPD patients reported more than twice the number of self-harm behaviors than patients diagnosed with another psychiatric disorder. However, despite the prevalence of mutilative acts and the high risk of suicidality in BPD patients, self-harming behaviors need not be present to merit a diagnosis of BPD. Propensity for self-harm in BPD is likely to be a poor prognostic indicator, as BPD patients who self-harm tend to be significantly more symptomatic, more prone to suicidal ideation, and to have more recent suicide attempts than BPD patients who do not (Dulit et al., 1994; Soloff et al., 1994).

The prevalence and frequency of self-injurious behavior in normal and psychopathological groups suggest that some individuals may engage in self-harm to serve some adaptive function, at least in the short term. This behavior may be "adaptive" insofar as it operates as an anti-dissociation mechanism that re-affirms an individual's desire to feel (Klonsky, 2007; American Psychiatric Association, 2013). Additionally, self-harm may serve as a means to elicit a response from others and avoid abandonment. However, the most frequently reported reason for engaging in self-harm in chronic BPD patients (Brown et al., 2002) and non-clinical BPD samples (Gratz and Roemer, 2008; Klonsky, 2009) is relief of negative emotion. Hence, for some individuals self-harm appears to be a way of self-soothing and coping with stress and negative affect (Gallop, 2002). Of course, chronic self-harm is a dangerous method of emotion regulation (Mikolajczak et al., 2009), and there is increased likelihood of suicide in self-harmers compared to non-self-harmers in the general population (Hawton et al., 2003; Hawton and Harriss, 2007). Additionally, research shows that self-harmers have significantly worse physical and social functioning and reduced quality of life compared to non-self-harmers in the general population (Sinclair et al., 2010).

Despite the prevalence of self-harm in individuals with BPD and in the general population, and the subsequent burden on health care services, potential pathways to self-harm behavior are surprisingly poorly understood (Glassman et al., 2007). One possibility is that reduced executive function ability may underlie self-harm by diminishing the capacity to self-regulate (LeGris and Van Reekum, 2006). Executive function(s) refers to a range of metacognitive capacities (higher-order attentional and control processes) that co-ordinate/maintain, initiate or inhibit other cognitive and emotional processes (Miyake et al., 2000; Barker et al., 2010; Morton and Barker, 2010) and govern self-ordered, context-appropriate and goal-directed activity (Baddeley and Wilson, 1988; Burgess and Shallice, 1996; Burgess, 2003; Strauss et al., 2006). Several theories posit a central role of attention in executive function (Stuss and Alexander, 2000; Anderson, 2003; Giesbrecht et al., 2004; Jurado and Roselli, 2007; Muscara et al., 2008; Daches et al., 2010; Spada et al., 2010). Derryberry and Reed (2002) defined attentional control as comprising three factors: (a) ability to focus and sustain attention; (b) ability to shift attention from one task to another, which requires inhibiting response contingencies to the first task in order to engage with the second; and (c) flexible thought generation.

The notion that impaired executive/attentional control processes might mediate self-harm in BPD individuals deserves further investigation because key symptom clusters characterizing the disorder indicate poor behavioral regulation (Coolidge et al., 2004), an important marker of executive dysfunction in other patient groups (Morton and Barker, 2010). Affective instability, indicated by inappropriate anger, impulsivity and risk-taking behavior, is a core feature of BPD, and is also seen, to a lesser or greater degree, in neuropathological groups with executive dysfunction (Barker et al., 2010, 2011). Diminished inhibitory capacity increases the likelihood that individuals act on dominant and potentially maladaptive tendencies; in the case of individuals with BPD this may be self-harm. That said, the precise executive processes that are diminished in individuals with BPD remain to be established, although evidence suggests that they generally involve attentional control.

LeGris and Van Reekum (2006) conducted a meta-analysis and found that 86% of studies reviewed confirmed some degree of executive function impairment in BPD individuals; the deficits most often reported fell within the category of attentional impairment. Ayduk et al. (2008) investigated the relationship between attentional control, rejection sensitivity and BPD features in a non-clinical sample using the Attentional Control Scale (Derryberry and Reed, 2002). Results showed that the association between BPD features and level of rejection sensitivity was attenuated in individuals with good attentional control. This finding suggests that good attentional control may provide some emotional buffer to override prepotent maladaptive thought patterns and inhibit dominant and maladaptive behavioral patterns in the face of perceived rejection/abandonment.

In other work, Posner and Petersen (1990) defined attentional control as comprising three different but interrelated functions: alerting (achieving and maintaining an alert state), orientating, and executive control (conflict resolution/inhibition). Although the Posner and Petersen (1990) model is conceptually somewhat distinct from the Derryberry and Reed (2002) model of attentional control, the two share some definitional overlap and correspond well with the Miyake et al. (2000) categorization of executive functions. Importantly, executive control, including orientating to, switching, focussing, and/or inhibiting attention and other cognitive processes, is integral to each theory (Posner and Petersen, 1990; Derryberry and Reed, 2002; Miyake et al., 2000).

To summarize, there is evidence to suggest that individuals with BPD have diminished executive functions; specifically they seem to exhibit deficits in attentional control and inhibiting maladaptive thoughts and behaviors. It has been suggested that they may self-harm in order to compensate for diminished affective/executive control, thus providing an outlet for emotional distress that cannot be regulated by normal cognitive and affective regulatory processes. However, less is known about what functions might contribute to self-harm in non-clinical groups with and without BPD features.

The present study investigated whether components of attentional control (shifting, focusing and flexibility) as measured by the Attentional Control Scale (ACS; Derryberry and Reed, 2002) along with BPD features would be associated with self-harm likelihood in a non-clinical sample. We predicted that deficits in specific components of attentional control (focusing, shifting, and flexibility) would be related to BPD features and self-harm. We also anticipated that attentional control would moderate the association between BPD features and self-harm.

# **MATERIALS AND METHODS**

#### **PARTICIPANTS**

A self-referring non-clinical sample (*N* = 340) of participants was recruited via advertisements placed on general social networking sites such as Facebook and Twitter, and in topic-relevant forums such as the "self-harm awareness group".<sup>1</sup> Participants were aged 16–62 (*M* = 26.94, *SD* = 10.14), and 279 (82%) were women; 117 (34.41%) participants reported previous self-harm. The self-harm and no self-harm groups did not differ significantly by gender (χ<sup>2</sup>(1, *N* = 340) = 0.35, *p* > 0.05), but there were significant age differences (*U* = 9251.00, *Z* = −4.41, *p* < 0.001); participants who reported prior self-harm were significantly younger. This corresponds to the pattern of diminished BPD symptoms with advancing age shown in the literature and clinical populations (Zanarini et al., 2007).

#### **MATERIALS AND PROCEDURE**

All measures were completed online via SurveyMonkey. The current research project was approved by the University's Research Ethics Committee. Informed consent was obtained via an information screen containing details of the study, issues of confidentiality and the right to withdraw. Potential participants recruited to the study progressed beyond the initial consent screen to provide gender and age details before completing the self-report measures described below.

<sup>1</sup>http://www.facebook.com/SHAwareness

# **Attentional control measure**

*The Attentional Control Scale (ACS; Derryberry and Reed, 2002).* The ACS is a 20-item, self-report measure of attentional control. Participants respond on a four-point scale (almost never, sometimes, often, always). High scores on the ACS represent good capacity to voluntarily control attention, whereas low scores are associated with attentional rigidity. A psychometric analysis of the scale by Fajkowska and Derryberry (2010) suggests that the ACS has three subscales. The attention focusing subscale has nine items and refers to the ability to focus and maintain attention (example item: "It's very hard for me to concentrate on a difficult task when there are noises around"). The attention shifting subscale has six items and refers to the ability to shift attention between focal points (example item: "I can quickly shift from one task to another"). The flexibility/divided attention subscale has five items (example item: "I have trouble carrying on two conversations at once"). In the current study, the focusing subscale demonstrated good internal consistency (α = 0.75), and the shifting and flexibility subscales demonstrated internal consistency acceptable for short scales (α = 0.58 and 0.56, respectively).
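For readers unfamiliar with the internal consistency coefficient reported here, the following minimal sketch shows how Cronbach's α is computed from item-level responses, using the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of total scores). The responses are hypothetical, the function name is our own, and negatively worded items are assumed to have been reverse-scored beforehand.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of per-participant item-score lists."""
    k = len(responses[0])  # number of items in the scale
    item_vars = [variance([row[i] for row in responses]) for i in range(k)]
    total_var = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical answers from four participants to a three-item scale (1-4).
answers = [
    [1, 2, 1],
    [2, 2, 3],
    [3, 4, 3],
    [4, 4, 4],
]
alpha = cronbach_alpha(answers)
```

Because α depends on the number of items as well as on their intercorrelations, short subscales such as the six-item shifting and five-item flexibility subscales tend to yield lower values than longer scales with comparable item quality.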

# **Measures of Borderline Personality features**

*Structured Clinical Interview for DSM-IV Axis II Personality Disorders Screening Questionnaire (SCID-II-SQ; First et al., 1997).* The SCID-II-SQ is a self-report screening measure used to assess broad personality disorder features. The current study used the 15-item BPD subscale (example item: "Have you often become frantic when you thought that someone you really cared about was going to leave you?"), with the original "yes/no" response option modified to measure symptoms dimensionally on a four-point response scale (0 = never or not at all, 1 = sometimes or a little, 2 = often or moderately, 3 = very often or extreme), based on previous work with non-clinical samples (e.g., Dreessen et al., 1999; Meyer et al., 2005; Bowles and Meyer, 2008). Two self-harm-related items were removed ("Have you tried to hurt or kill yourself or threatened to do so?" and "Have you ever cut, burned, or scratched yourself on purpose?") to avoid collinearity with the measure of self-harm, leaving 13 items in the scale. Internal consistency for this version of the BPD subscale has been reported as good (Cronbach's α = 0.83, Meyer et al., 2005), and the 13-item version used in the present study was at least as reliable (α = 0.90).

*The Short Coolidge Axis Two Inventory (SCATI; Coolidge, 2001).* The SCATI is also a self-report measure of personality disorder features. The five-item BPD scale was used (example item: "I am very afraid of being abandoned by someone"), and participants responded on a four-point scale (strongly false, more false than true, more true than false, strongly true). There is one self-harm-related item on the scale ("I have repeatedly made suicidal threats or gestures, or I have repeatedly hurt myself on purpose"), which was again removed prior to analyses to avoid collinearity with the self-harm measure, leaving four items that demonstrated acceptable internal consistency in the current study (α = 0.70).

*The Personality Assessment Inventory (PAI; Morey, 1991).* This measure is a self-administered scale used for clinical assessment of adults. The borderline features scale (PAI-BOR) includes four subscales: affective instability, identity problems, negative relationships and self-harm. The self-harm subscale was removed from analyses, and internal consistency for the remaining 18 items was good (α = 0.84).

Total raw scores on the PAI can be converted, on the basis of normative data, to T-scores that have a mean of 50 and a standard deviation of 10. Individuals with scores <60 T are considered to have fairly healthy personality dimensions. Scores of 60–69 T represent a moderate elevation and may indicate a tendency to anger and dissatisfaction. Scores of 70 T and above are indicative of problematic symptoms in interpersonal relationships and impulsivity. Scores greater than 90 T are generally seen only in clinical samples and indicate markedly elevated symptoms, possibly an individual in crisis.
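As an illustration of this scoring logic, the sketch below applies the generic linear T-score transformation (mean 50, standard deviation 10) and maps the result onto the interpretive bands given above. The normative mean and standard deviation are hypothetical placeholders, not the published PAI norms, and the function names are our own.

```python
def t_score(raw, norm_mean, norm_sd):
    """Convert a raw score to a T-score (mean 50, SD 10)."""
    return 50 + 10 * (raw - norm_mean) / norm_sd


def interpret(t):
    """Map a T-score onto the interpretive bands described in the text."""
    if t > 90:
        return "markedly elevated"
    if t >= 70:
        return "elevated"
    if t >= 60:
        return "moderate elevation"
    return "within the healthy range"


# Hypothetical normative values, chosen only for illustration.
NORM_MEAN, NORM_SD = 18.0, 10.0
t = t_score(38.0, NORM_MEAN, NORM_SD)
label = interpret(t)
```

Under these placeholder norms, a raw score two standard deviations above the normative mean falls exactly at the 70 T cut-off used to indicate significant BPD features.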

PAI-BOR T-scores for the no self-harm group ranged from 37 to 90 (*M* = 60.47, *SD* = 10.27), representing moderate elevation of personality traits, which is consistent with other non-clinical samples (e.g., Trull, 1995; Gardner and Qualter, 2009). In the no self-harm group, 42 participants (18.83%) had T-scores of 70 or above, which is considered to be the cut-off point that indicates presence of significant BPD features (Trull, 1995). T-scores for the prior self-harm group ranged from 45 to 100 (*M* = 73.42, *SD* = 12.50), which is consistent with T-scores observed in clinical BPD samples (e.g., Jacobo et al., 2007). In the prior self-harm group, 72 participants had T-scores of 70 or above, likely reflecting problematic elevation of BPD features and indicating that individuals in non-clinical samples may show relatively high levels of borderline PD traits. T-scores differed significantly between the prior self-harm group and the no self-harm group (*U* = 5602, *Z* = −8.65, *p* < 0.001).

# **Measure of affect**

*The Positive and Negative Affect Schedule (PANAS; Watson et al., 1988).* The PANAS consists of two 10-item scales measuring positive (e.g., "enthusiastic", "proud") and negative (e.g., "irritable", "nervous") affect. Participants rate to what extent they generally experience each item on a five-point response scale ranging from "not at all" to "extremely". Data from the current study showed high internal consistency for negative and positive scales (both αs = 0.92).

#### **Self-harm measure**

*Deliberate Self-Harm Inventory (DSHI; Gratz, 2001).* The DSHI is a 17-item self-report questionnaire developed to measure frequency, severity and type of self-harming behavior. Participants rate how often they have intentionally engaged in each of the 17 behaviors (e.g., "Have you ever intentionally, on purpose, cut your wrist, arms, or other areas of your body without intending to kill yourself? If yes, how many times have you done this?"). Following completion of the measures, participants were encouraged to comment on their participation in the study (e.g., "Do you have anything you would like to add that was not asked about in this questionnaire?"). A number of participants reported that they had difficulty estimating the number of times they had engaged in each of the behaviors; using the total number of self-harm injuries as a variable therefore proved problematic. Consequently, we used the DSHI to distinguish between participants who self-harmed and those who did not.

# **RESULTS**

Items relating to self-harm behaviors were removed from BPD scales to avoid collinearity with the outcome measure. Given that the three BPD scales were measuring the same underlying construct, and the correlations between the measures were moderate to large in magnitude (*r*s = 0.56−0.84), we created a composite variable representing BPD features, in order to provide a more reliable measure (e.g., Cheavens et al., 2005; Sprague and Verona, 2010). Individual scores were standardized (*Z*-transformed) and then summed in order to create an overall index of BPD features. This standardized BPD scale with the self-harm-related items removed demonstrated good internal consistency (35 items, α = 0.93).
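The construction of such a composite can be sketched as follows. This is a minimal illustration that assumes each participant's total on each scale is standardized across the sample and that the resulting z-scores are summed; the scores are hypothetical, the variable and function names are our own, and the use of the sample standard deviation is an assumption.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardize a list of scores to mean 0 and (sample) SD 1."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def composite(*scales):
    """Sum each participant's z-scores across scales of equal length."""
    standardized = [z_scores(scale) for scale in scales]
    return [sum(zs) for zs in zip(*standardized)]


# Hypothetical scale totals for four participants (same order in each list).
scid = [5, 10, 15, 30]
scati = [4, 6, 9, 13]
pai = [20, 25, 31, 44]
bpd_index = composite(scid, scati, pai)
```

Standardizing before summing gives each scale equal weight in the composite regardless of its number of items or response range.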

**Table 1** shows descriptive data for measures used in the current study by self-harm group (prior self-harm vs. no self-harm). Individuals who reported previous self-harm had significantly higher scores on BPD features and negative affect, and significantly lower scores on positive affect, shifting, and flexibility compared to the non-self-harm group.

**Table 2** shows correlations between the scales used in the current study. Results of Pearson's correlational analyses showed that ACS subscales were generally weakly to moderately correlated, indicating that ACS subscales indexed some shared processes. Negative affect scores correlated with BPD features, and negatively correlated with all three of the ACS subscale scores in the prior self-harm group, whereas it was positively correlated with focusing and shifting in the no self-harm group. The flexibility subscale scores of the ACS correlated negatively with BPD feature scores, suggesting low flexibility ability in the presence of BPD features. BPD features scores also correlated with negative affect, and inversely with positive affect.

A hierarchical logistic regression model was used to examine the possible contribution of affect, BPD features, and attentional control to the probability of reporting previous episodes of self-harm (see **Table 3**). Self-harm was, therefore, the criterion variable. We decided that a binary variable simply indicating whether individuals had ever engaged in self-harm was most appropriate (see Section Materials and Methods). The variable was coded 1 to indicate prior self-harm and 0 to indicate no prior self-harm. Affect (as measured by the PANAS) was entered in the first block due to the important role negative affect plays in self-harm behavior. The BPD variable was entered in the second step to examine whether BPD features predicted self-harm likelihood separately from affect. The attentional control variables (focusing, shifting, and flexibility) as measured by the ACS were entered in the final step of the regression to examine whether deficits in specific components of attentional control would partially explain the association between BPD and self-harm.

The full model containing all predictors was significant (χ<sup>2</sup>(5) = 140.79, *p* < 0.001) compared to the constant-only model, indicating that the full model distinguished between participants who reported instances of self-harm and those who did not. The model as a whole explained between 34% (Cox and Snell *R*<sup>2</sup>) and 47% (Nagelkerke *R*<sup>2</sup>) of the variation in self-harm and correctly classified 80.60% of cases. There were six independent variables at the final step, four of which made a unique and significant contribution to the probability of reporting self-harm (see **Table 3**).
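The two pseudo-*R*<sup>2</sup> statistics reported above are both derived from the log-likelihoods of the constant-only and fitted models, which is why they bracket a range rather than agree. The following is a minimal sketch of the standard formulas, not the authors' analysis code; the function names and the sample size and log-likelihood values are hypothetical and chosen only for illustration.

```python
import math

def model_chi_square(ll_null, ll_model):
    """Likelihood-ratio chi-square of a fitted model against the constant-only model."""
    return 2.0 * (ll_model - ll_null)

def cox_snell_r2(ll_null, ll_model, n):
    """Cox and Snell pseudo-R^2; its maximum is below 1 for binary outcomes."""
    return 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)

def nagelkerke_r2(ll_null, ll_model, n):
    """Nagelkerke pseudo-R^2: Cox and Snell rescaled by its maximum attainable value."""
    max_cox_snell = 1.0 - math.exp(2.0 * ll_null / n)
    return cox_snell_r2(ll_null, ll_model, n) / max_cox_snell

# Hypothetical values for illustration only (not taken from the study).
n_cases = 200
ll_null = -138.0   # log-likelihood of the constant-only model
ll_model = -100.0  # log-likelihood of the full model

chi2 = model_chi_square(ll_null, ll_model)
r2_cs = cox_snell_r2(ll_null, ll_model, n_cases)
r2_n = nagelkerke_r2(ll_null, ll_model, n_cases)
print(f"chi-square = {chi2:.2f}, Cox-Snell = {r2_cs:.3f}, Nagelkerke = {r2_n:.3f}")
```

Because the Nagelkerke statistic divides the Cox and Snell value by a quantity smaller than 1, it is always the larger of the two, consistent with the lower and upper figures quoted in the text.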

In the final step of the regression, odds ratios indicated that BPD features most strongly predicted likelihood of self-harm, and no mediating effects of the added attentional control variables were indicated. Focusing and shifting variables were associated with prior self-harm likelihood. Higher shifting scores were associated with lower rates of self-harm, and focusing appeared to have a positive association with self-harm. These associations were independent of BPD and raise the possibility that they may interact with BPD features in their association with self-harm.


\* p < 0.05; \*\* p < 0.001.


**Table 2 | Correlations between measurement scale scores**.

\* p < 0.05; \*\* p < 0.001. Key: ACS = Attentional control scale; BPD features = Combined borderline scales.

**Table 3 | Hierarchical Logistic Regression testing main effects of affect; attentional control; and BPD features on prior incidence of self-harm**.


Note: R<sup>2</sup> <sup>a</sup> = Cox and Snell, R<sup>2</sup> <sup>b</sup> = Nagelkerke, \* p < 0.05; \*\* p < 0.001.

**Table 4 | Hierarchical Logistic Regression testing interaction effects of BPD features and focusing on prior incidence of self-harm**.


Note: R<sup>2</sup> <sup>a</sup> = Cox and Snell, R<sup>2</sup> <sup>b</sup> = Nagelkerke, \* p < 0.05; \*\* p < 0.001.

The positive association between self-harm and focusing is in the opposite direction to that shown in simple *t*-tests (**Table 1**), and may suggest a suppressor effect of either affect or BPD that is only apparent when the variables are analyzed together in a regression. Alternatively, there may be an interactive effect of BPD and focusing, and this, along with a similar interaction between shifting and BPD, was explored.

Note: R<sup>2</sup> <sup>a</sup> = Cox and Snell, R<sup>2</sup> <sup>b</sup> = Nagelkerke, \* p < 0.05; \*\* p < 0.001.

To do this, interaction terms were created as the products of standardized (*Z*-transformed) versions of the BPD variable and the focusing and shifting variables. The interactive effects of BPD and focusing ability and of BPD and shifting ability were tested in two separate hierarchical logistic regressions. In each regression the two predictor variables were entered in the first step, and the interaction term was entered in the second. In both cases the interaction terms were uniquely significant (see **Tables 4**, **5**).
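The construction of the interaction terms can be sketched as follows. This is an illustrative sketch only: the helper names and the scores are hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption about how the *Z*-transformation was computed.

```python
from statistics import mean, stdev

def z_transform(values):
    """Standardize scores to mean 0 and (sample) standard deviation 1."""
    m = mean(values)
    sd = stdev(values)  # assumption: sample SD, as in most statistics packages
    return [(v - m) / sd for v in values]

def interaction_term(predictor, moderator):
    """Product of the two Z-transformed variables, one value per participant."""
    z_pred = z_transform(predictor)
    z_mod = z_transform(moderator)
    return [a * b for a, b in zip(z_pred, z_mod)]

# Hypothetical scores for five participants (not data from the study).
bpd_scores = [10.0, 20.0, 30.0, 40.0, 50.0]
focusing_scores = [30.0, 22.0, 25.0, 18.0, 20.0]

bpd_x_focusing = interaction_term(bpd_scores, focusing_scores)
print([round(v, 3) for v in bpd_x_focusing])
```

Standardizing before multiplying keeps the product term from being strongly collinear with its two component variables, which makes the step-two coefficient easier to interpret.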

Plots were created to help interpret the interactions (see **Figures 1**, **2**). The plots indicate that those two attentional control factors differentially moderated the association between BPD and rates of self-harm. For individuals low in BPD, high focusing ability appears to reduce the risk of self-harm, yet increase the risk for those high in BPD features. One possibility is that focusing is a protective factor for some, and a rumination-like risk factor for others.
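A crossover pattern of this kind follows directly from a logistic model with a product term. The sketch below computes predicted probabilities at one standard deviation below and above the mean of each standardized predictor; the coefficients are hypothetical values chosen to reproduce the qualitative pattern described, and are not the estimates reported in **Table 4**.

```python
import math

def predicted_probability(z_bpd, z_focus, b0, b_bpd, b_focus, b_inter):
    """Probability of prior self-harm from a logistic model with an interaction term."""
    logit = b0 + b_bpd * z_bpd + b_focus * z_focus + b_inter * z_bpd * z_focus
    return 1.0 / (1.0 + math.exp(-logit))

# Hypothetical coefficients for illustration only (not the study's estimates).
coefs = {"b0": -1.0, "b_bpd": 1.2, "b_focus": 0.0, "b_inter": 0.5}

# Evaluate the model at -1 SD ("low") and +1 SD ("high") on each predictor.
levels = {"low": -1.0, "high": 1.0}
grid = {
    (bpd_label, focus_label): predicted_probability(bpd_z, focus_z, **coefs)
    for bpd_label, bpd_z in levels.items()
    for focus_label, focus_z in levels.items()
}

for (bpd_label, focus_label), p in grid.items():
    print(f"BPD {bpd_label:>4}, focusing {focus_label:>4}: p = {p:.3f}")
```

With a positive interaction coefficient, the slope of focusing is negative when BPD is below the mean and positive when it is above, which is the shape the plots describe.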

The picture with shifting ability is somewhat different. The plot suggests that for those with pronounced BPD features shifting ability has little bearing on self-harm risk. However, among those individuals with few BPD features, reduced shifting ability may pose a slightly elevated self-harm risk.

### **DISCUSSION**

The current study investigated the relationship between BPD features, three aspects of attentional/executive control (shifting, focusing and flexibility), affect, and self-harm in a large non-clinical sample. The hierarchical logistic regression showed that BPD ratings and attentional focusing predicted self-harm incidence, although the pattern of data was not entirely as anticipated: high attentional focusing scores increased the likelihood of a prior self-harm history in those reporting high levels of BPD features. The ability to shift attention was associated with a reduced likelihood of self-harm.

As hypothesized, high BPD scores were associated with greater likelihood of an individual reporting previous self-harm. Our findings demonstrate the importance that BPD features play in propensity to self-harm in a non-clinical sample. There is evidence that individuals drawn from non-clinical populations with high levels of BPD features show social and occupational problems along with impaired executive function ability compared to those with few or no BPD features (Trull et al., 1990; Fossati et al., 2004; Ayduk et al., 2008). Most research with BPD groups has centered on those with a clinical diagnosis, meaning that less is known about how BPD features might drive maladaptive behavior in non-clinical groups. Our findings reproduce the strong association shown between BPD features and self-harm likelihood in clinical cases, indicating that, despite possible differences between clinical and non-clinical BPD, there are also some shared processes that potentially transcend a BPD diagnosis in relation to self-harm. Most psychiatric disorders can be considered on a continuum from complete absence of symptoms, for example in remission, to clinically severe (Tyrer, 2009). Our findings support the dimensional approach to psychiatric disorders and illustrate the importance of investigating functions in a range of participants who may present along the BPD spectrum.

Our results showed that high focusing ability reduced self-harm likelihood for individuals low in BPD features but increased the risk for those rating themselves highly on BPD features. Thus, when high BPD features are present, a good capacity to focus attention is likely directed in some maladaptive way. BPD features also correlated with negative affect: these findings raise the possibility that high focusing might manifest as ruminative perseverative thought patterns that influence behavior and affect. What is not clear is whether high focusing is targeted at potential self-harming behavior or instead functions to precipitate self-harm. The former is more plausible because self-harmers tend to report immediacy and urgency when self-harming that is then followed by catharsis. Arguably, it might be the case that high focusing ability functions to maintain some BPD features. Key features of BPD measured by our composite scale include fluid sense of self, emotional instability, feelings of and expression of rage, fear of abandonment, unstable but intense relationships and impulsivity. Thus, intensity of relationships, for example, might be a consequence of over-focusing on the other, and also over-focusing on the possibility of abandonment. Likewise, exaggerated anger responses might arise due to over-focusing on perceived slights or suspected indications of future abandonment. In addition, the finding that low flexibility in attentional control is associated with high BPD features supports the notion that high focusing might drive and/or maintain perseverative and anxiety-inducing cognitions that ultimately lead to self-harm because the individual cannot switch attention "off topic". High levels of focusing in people with low BPD feature ratings may protect against self-harm risk by enabling the individual to override prepotent and maladaptive thought patterns.

Present findings indicate that attentional shifting ability had little bearing on self-harm risk in those who rated themselves high on BPD features. This finding corresponds well to the notion that those high in BPD features may be highly focused upon thoughts that precipitate negative affect and self-harm. Thus, we see a pattern of relationships emerging whereby the "maintaining" function of high focusing makes most demand on capacity-constrained attentional resources in those with high BPD features, at the expense of attentional flexibility and attentional switching. Our findings also show an association between low attentional shifting ability and slightly elevated self-harm risk for those individuals with few or no BPD features. Attentional shifting is not a unitary process: the ability to reallocate capacity-constrained attentional resources to a different intrinsic or extrinsic stimulus depends upon inhibition of earlier focus. Thus, inhibitory capacity will affect attentional shifting ability: when reduced, it should make attentional switching difficult due to resource competition. In addition, emotional stimuli have been shown to be more resistant to inhibition than non-emotional stimuli (Schulz et al., 2007), and this may be particularly salient for those high in BPD features. There is also some suggestion that low inhibitory ability and high urgency may mediate rash behavior across a range of groups and disorders (Gay et al., 2008). Consequently, good attentional switching ability may provide a protective buffer against self-harm behavior for some individuals by reducing the likelihood of pathological focusing and perseverative thought patterns (Judah et al., 2014).

Individuals may self-harm for a variety of reasons including reducing negative affect and arousal, as an anti-dissociation mechanism (also referred to as "feeling generation"), as a way of avoiding suicide, reinforcing personal boundaries, as self-punishment, or as a method of sensation seeking (Klonsky, 2007). Within this framework anti-dissociation refers to the capacity of self-harm to ameliorate a sense of depersonalization in BPD (Klonsky, 2007; American Psychiatric Association, 2013), and is generally considered to be distinct from the graver and psychotic disconnect from reality defined as "dissociation" in other disorders such as schizophrenia and Bipolar Disorder. Although the current study did not include a specific measure of social functioning, the literature suggests self-harmers have significantly worse physical and social functioning and reduced quality of life compared to non-self-harmers in the general population (Sinclair et al., 2010). This includes a significant and persistent risk of suicide 15 years after presenting at hospital with a self-harm injury (Hawton et al., 2003). However, it is important to note that in the current study, the sample of participants likely consisted of relatively higher-functioning individuals, as participants were not recruited from mental health services or hospitals, which are typical treatment sites for lower-functioning individuals with a BPD diagnosis (Sansone et al., 1998). Despite this, participants did endorse a high number of BPD features, particularly in the self-harm group. Research suggests that high BPD features (e.g., in individuals who score above the clinically significant cut-off point of 70 T on the PAI-BOR) are associated with poorer outcomes such as academic difficulties, meeting criteria for a mood diagnosis, and interpersonal dysfunction, even within a non-clinical population (Trull et al., 1997).

The development of adaptive, flexible attentional control might be a useful therapeutic goal for those high in BPD features. Mindfulness refers to the practice of non-reactive attention to the present moment, focusing on thoughts, emotions and bodily sensations as well as environmental stimuli (sounds and smells), even if they are unwanted or unpleasant, whilst accepting their impermanence (Linehan, 1993). Increased mindfulness skills appear to improve psychological functioning by cultivating an adaptive form of self-focused attention that reduces rumination and emotional avoidance, and improves behavioral self-regulation (Lynch et al., 2006; Baer, 2009; Selby et al., 2009). This may be a fruitful area for future work in non-clinical self-harming groups.

Borderline Personality Disorder is also known to share some affect regulation and impulse control features with attention-deficit/hyperactivity disorder (ADHD), and ADHD may be co-morbid with BPD (Philipsen, 2006). Additionally, ADHD may be a risk factor for the development of BPD in adulthood (Philipsen et al., 2008). However, it is possible that attentional control problems may underlie both conditions, constituting the shared processes of each condition, and that the emergence of one disorder rather than the other, or one main disorder with ADHD co-morbidity, is driven by the selective constellation of personality, developmental and familial factors combined with attentional control problems. Future work might explore the potential shared contribution of executive/attentional control problems to personality disorders and co-morbid conditions.

A limitation of the current study was the use of self-report measures of attentional control, although other work indicates that ACS scores are associated with behavioral and neurophysiological indicators of executive control (e.g., Derryberry and Reed, 2002). It is possible that subjective reports of attentional control differ from objective indices of attentional control (Verwoerd et al., 2008). Consequently, our ongoing work is developing new experimental paradigms and using a comprehensive raft of standardized cognitive tests to investigate these assumptions and further tease apart the putative relationship between executive control and self-harm likelihood.

To summarize, present findings support the notion of a multicomponential executive system by demonstrating different patterns of relationship among attentional variables on likelihood of self-harm in those with BPD features. Of note, those high in BPD features showed high focusing scores indicating no impairment in this capacity as we anticipated, although flexibility and shifting scores were significantly lower in those with a self-harm history compared to non-self-harmers. This finding seems to indicate that it is the content of attentional focusing rather than the process that may be pathological in those high in BPD features.

The high incidence of self-harm cases reported each year beyond psychiatric groups suggests a need for improved pathways to diagnosis and treatment for those who self-harm. Our data indicate that BPD features might play a role in mediating these behaviors and also that attentional control factors, as measured by our variables, contribute to self-harm likelihood. Overall, our findings indicate that personality and attentional control factors interact to determine self-harm likelihood, whereby high attentional focusing and shifting abilities are protective when BPD features are low, but high focusing may be a possible maintaining factor when BPD features are high.

# **REFERENCES**


Convergence with measures on a non-emotional analog. *Arch. Clin. Neuropsychol.* 22, 151–160. doi: 10.1016/j.acn.2006.12.001


**Conflict of Interest Statement**: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 08 April 2014; accepted: 24 July 2014; published online: 20 August 2014*. *Citation: Drabble J, Bowles DP and Barker LA (2014) Investigating the role of executive attentional control to self-harm in a non-clinical cohort with borderline personality features. Front. Behav. Neurosci. 8:274. doi: 10.3389/fnbeh.2014.00274*

*This article was submitted to the journal Frontiers in Behavioral Neuroscience*. *Copyright © 2014 Drabble, Bowles and Barker. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms*.

# Basal Forebrain Cholinergic System and Orexin Neurons: Effects on Attention

Ines Villano<sup>1†</sup>, Antonietta Messina<sup>1†</sup>, Anna Valenzano<sup>2</sup>, Fiorenzo Moscatelli<sup>2,3</sup>, Teresa Esposito<sup>1</sup>, Vincenzo Monda<sup>1</sup>, Maria Esposito<sup>4</sup>, Francesco Precenzano<sup>4</sup>, Marco Carotenuto<sup>4,5</sup>, Andrea Viggiano<sup>6</sup>, Sergio Chieffi<sup>2</sup>, Giuseppe Cibelli<sup>2</sup>, Marcellino Monda<sup>1</sup> and Giovanni Messina<sup>1,2</sup>\*

<sup>1</sup>Department of Experimental Medicine, Second University of Naples, Naples, Italy, <sup>2</sup>Department of Clinical and Experimental Medicine, University of Foggia, Foggia, Italy, <sup>3</sup>Department of Motor, Human and Health Science, University of Rome, "Foro Italico", Rome, Italy, <sup>4</sup>Department of Mental Health, Physical and Preventive Medicine, Second University of Naples, Naples, Italy, <sup>5</sup>Neapolitan Brain Group (NBG), Clinic of Child and Adolescent Neuropsychiatry, Department of Mental, Physical Health and Preventive Medicine, Second University of Naples, Naples, Italy, <sup>6</sup>Department of Medicine, Surgery and Dentistry "Scuola Medica Salernitana", University of Salerno, Salerno, Italy

The basal forebrain (BF) cholinergic system has an important role in attentional functions. The cholinergic system can be activated by different inputs, and in particular by orexin neurons, whose cell bodies are located within the postero-lateral hypothalamus. Recently, the orexin-producing neurons have been shown to promote arousal and attention through their projections to the BF. The aim of this review article is to summarize the evidence showing that the orexin system contributes to attentional processing by increasing cortical acetylcholine release and cortical neuron activity.

#### Edited by:

Lynne A. Barker, Sheffield Hallam University, UK

#### Reviewed by:

Cliff H. Summers, University of South Dakota, USA Birendra N. Mallick, Jawaharlal Nehru University, India

\*Correspondence: Giovanni Messina gianni.messina@unina2.it

†These authors have contributed equally to this work.

Received: 16 July 2016 Accepted: 12 January 2017 Published: 31 January 2017

#### Citation:

Villano I, Messina A, Valenzano A, Moscatelli F, Esposito T, Monda V, Esposito M, Precenzano F, Carotenuto M, Viggiano A, Chieffi S, Cibelli G, Monda M and Messina G (2017) Basal Forebrain Cholinergic System and Orexin Neurons: Effects on Attention. Front. Behav. Neurosci. 11:10. doi: 10.3389/fnbeh.2017.00010

Keywords: attention, orexin, basal forebrain, lateral hypothalamus, acetylcholine

# INTRODUCTION

Attention may be defined as the behavioral and cognitive process that allows us to select the information present in our environment on the basis of its relevance, along with the ability to ignore irrelevant stimuli (Sarter et al., 2001). It consists of several components such as sustained, selective and divided attention, which are responsible for the control of the flow of information in the cognitive system (Rieger et al., 2003). Attention involves both top-down processes (knowledge-driven mechanisms) and bottom-up processes (mechanisms driven mainly by the characteristics of the target stimulus and its sensory context; Sarter et al., 2001; Chieffi et al., 2004, 2012). These two processes drive the control of attentional focus (Gazzaniga et al., 2002). Attentional processing also involves generalized states of arousal, which refers to the state of physiological reactivity ranging from sleep to excitement or panic (Coull, 1998; Fadel and Burk, 2010).

Changes in arousal are typically deduced from brain activity data (EEG), whereas the study of attention is based on behavioral studies. Moruzzi and Magoun (1949) first demonstrated that cerebral activation is related to changes in EEG waves and has a brainstem origin, and went on to identify and localize the brainstem reticular arousal system (RAS). The most evident arousal effect on EEG activity is the "desynchronization" phenomenon: the rapid shift from high-amplitude, low-frequency EEG activity, typical of sleep, to low-amplitude, high-frequency activity, typical of wakefulness.

EEG was the earliest measure used to systematically examine human cortical activity. After a long period of declining clinical interest, EEG is now attracting increasing scientific and clinical attention. This resurgence is due to ongoing advances in signal processing and visualization that increase the spatial resolution of EEG imaging and exploit its ability to image fast, transient cortical events and more precise regional changes in cortical activity. For the last few decades, scalp-channel EEG data have been analyzed principally either in the time domain, via event-related potential (ERP) trial averaging, or in the frequency domain, using the fast Fourier transform (FFT) to estimate spectral power within a given frequency band. Although phenomena and definitions may vary, EEG spectral power variations are typically dominated by distinct changes in power in a few frequency bands. The standard terminology for these bands is: delta (<4 Hz), theta (4–7 Hz), alpha (8–12 Hz), beta (13–25 Hz; often split into beta-1/sensorimotor rhythm (SMR), 13–16 Hz, and beta-2, 17–25 Hz), and gamma (25–50 Hz, or even higher-frequency broadband activity extending to 200 Hz or greater).

Cerebral activation is also detected during rapid eye movement (REM) sleep. Experimental evidence suggests that arousal systems work differently during the wake state and REM sleep (Krueger et al., 2016). Arousal effects arise from the stimulation of the mesopontine cholinergic nuclei (Montplaisir, 1975; Jones and Webster, 1988) and the locus coeruleus (LC; Steriade and McCarley, 1990), which consists principally of noradrenergic neurons. Conversely, during REM sleep the monoaminergic (noradrenergic and serotonergic) neurons are silent (Hobson et al., 1975; McGinty and Harper, 1976).

It is possible to distinguish brain mechanisms involved in attention from those involved in arousal, thanks to changes in task performance following manipulations known to affect attention (De Gangi and Porges, 1990; Schiff and Plum, 2000; Fadel and Burk, 2010). In this case, the interpretation of data resulting from these manipulations is primarily based on behavioral performance data (e.g., detection rates, false alarm rates, etc.) (Sarter and Bruno, 1999; Sarter et al., 2001). The relationship between arousal and attention is not simple. Attentional performance improves with a moderate increase in arousal but drops dramatically during states of high excitement (Easterbrook, 1959). On the other hand, sustained attention reduces arousal and induces drowsiness (Babkoff et al., 1991).
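The frequency-domain analysis of EEG described earlier in this section can be illustrated with a short sketch that estimates spectral power per band for a synthetic signal. Everything here is an assumption made for illustration: the signal is simulated rather than recorded, a plain discrete Fourier transform stands in for an optimized FFT routine, and the band edges are closed up into half-open intervals (for example, theta as 4 to 8 Hz) so that no frequency falls between bands.

```python
import cmath
import math

# Band edges in Hz as half-open intervals [low, high); an illustrative convention.
BANDS = {
    "delta": (0.5, 4.0),
    "theta": (4.0, 8.0),
    "alpha": (8.0, 13.0),
    "beta": (13.0, 25.0),
    "gamma": (25.0, 50.0),
}

def power_spectrum(signal, fs):
    """Return (frequencies, power) for the non-negative frequencies of a real signal."""
    n = len(signal)
    freqs, power = [], []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal))
        freqs.append(k * fs / n)
        power.append(abs(coeff) ** 2 / n)
    return freqs, power

def band_power(freqs, power, bands=BANDS):
    """Sum spectral power over the frequency bins that fall inside each band."""
    totals = {name: 0.0 for name in bands}
    for f, p in zip(freqs, power):
        for name, (low, high) in bands.items():
            if low <= f < high:
                totals[name] += p
    return totals

# Synthetic 2 s "recording" at 128 Hz: a strong 10 Hz rhythm plus a weaker 20 Hz rhythm.
fs = 128
samples = [
    2.0 * math.sin(2 * math.pi * 10 * t / fs) + 0.5 * math.sin(2 * math.pi * 20 * t / fs)
    for t in range(2 * fs)
]

freqs, power = power_spectrum(samples, fs)
powers = band_power(freqs, power)
dominant = max(powers, key=powers.get)
print(dominant, {name: round(value, 2) for name, value in powers.items()})
```

In practice an FFT library routine and a windowed estimate (such as Welch's method) would usually replace the direct transform used here, which is kept only to make the example self-contained.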

Furthermore, physiological studies and data collected on patients suffering from injuries or neurological diseases provide a wealth of information on the neural mechanisms of attention processes. Thus, it is important to determine the brain networks mediating attention, both to understand the neural mechanisms underlying these cognitive functions and to expand knowledge on neurodevelopmental disorders characterized by impairments in attentional functions (Sarter et al., 2001; Esposito and Carotenuto, 2010, 2014; Carotenuto et al., 2016). Among these networks, the basal forebrain (BF) cholinergic system is considered a major component of top-down processes in the mediation of attention; it is known to play a role in several aspects of attentional function (Fadel and Burk, 2010; Viggiano et al., 2014) and to be necessary for normal attentional performance (Sarter et al., 2001; Boschen et al., 2009). This system can be activated by different afferent inputs and can influence how attentional resources are allocated (Chieffi et al., 2009; Fadel and Burk, 2010). Among the various afferent inputs to the BF cholinergic projection system, the hypothalamus represents an important source of projections. The available data demonstrate that orexin neurons, whose cell bodies are present in the lateral hypothalamus, contribute substantially to these projections (Cullinan and Záborszky, 1991). Orexin neurons have widespread projections to a number of brain regions, including cholinergic BF structures. In the last decade, several studies have focused on specific neuronal pathways through which the orexin-producing neurons may promote not only arousal, but also attention. Their results suggest that the basal forebrain may be a key site through which these neurons act.

In this article, we review the effects of orexin-producing neurons and their projections to the BF to support the hypothesis that the orexin system may contribute to attentional processing through increased cortical acetylcholine (ACh) release.

# THE CHOLINERGIC BASAL FOREBRAIN SYSTEM

In the BF, cholinergic neurons are codistributed with several other cell populations, including GABAergic neurons and various neurons containing calcium-binding proteins, for example calbindin, calretinin or parvalbumin (Fadel and Burk, 2010). These neurons project to all areas and layers of the cortex (Sarter and Bruno, 1997). The cholinergic projections modulate the response of pyramidal cells to other cortical glutamatergic inputs (McCormick, 1993), facilitating the bottom-up sensory information processing within the cortex (**Figure 1**; Muir et al., 1994; Sarter et al., 2001). Furthermore, the long radiating dendrites of the cholinergic BF neurons receive inputs from all the brainstem and hypothalamic arousal systems, for example cholinergic ponto-mesencephalic neurons, noradrenergic LC neurons, dopaminergic ventral mesencephalic neurons, histaminergic tubero-mammillary neurons and orexinergic perifornical neurons (Jones and Cuello, 1989; Panula et al., 1989; Zaborszky and Cullinan, 1996; Peyron et al., 1998; Semba et al., 1998).

The cholinergic basal forebrain neurons have been implicated in mechanisms of synaptic plasticity, learning, memory, arousal and attention (McCormick, 1993; Leanza et al., 1996); all these functions are related to cortical activation (Jones, 2003). For instance, pharmacological manipulations of cholinergic receptors in extra-striate occipital and superior-medial parietal cortices affect attentional performance (Bentley et al., 2004), and lesions of the BF in monkeys also interfere with attention (Voytko et al., 1994).

The cholinergic basal forebrain neurons are hyperpolarized by ACh released by brainstem or forebrain neurons; both muscarinic and nicotinic receptors are involved in this effect, and could modulate cortical and forebrain activity during particular states across the sleep–waking cycle (McCormick, 1993; Khateb et al., 1997). In the cerebral cortex and in the hippocampus, ACh release is maximal during wakefulness and REM sleep (Jasper and Tessier, 1971; Marrosu et al., 1995), while it decreases during non-REM sleep (Arrigoni et al., 2010).

Inglis et al. (1994) suggested that the BF neurons may be involved, together with dopaminergic neurons, in the regulation of attention and in rewarding activities including food intake, because a large amount of ACh is released during eating. Many other neurotransmitters can excite the BF neurons, for example glutamate (Khateb et al., 1995a), noradrenaline (NA; Fort et al., 1995), histamine (Khateb et al., 1995a,b) and orexin (Eggermann et al., 2001), whereas others, for example serotonin, can inhibit them (Khateb et al., 1993).

# OVERVIEW OF THE OREXIN NEURONS

The orexins/hypocretins are neuropeptides synthesized by a cluster of neurons within the postero-lateral hypothalamus that produce excitatory effects on target neurons. Two independent research groups discovered these neuropeptides simultaneously in the late 1990s. One group named these peptides orexins, from the Greek word "orexis", meaning "appetite", because they seemed to be involved in the control of feeding and metabolism (Sakurai et al., 1998; Sakurai, 2007). The other group named these peptides hypocretins, because they share significant sequence homology with the members of the glucagon/vasoactive intestinal polypeptide/secretin (incretin) family (de Lecea et al., 1998). With the name hypocretin, de Lecea et al. (1998) thus intended to indicate a hypothalamic member of the incretin family. The two terms are used interchangeably in the literature. Orexin-A (orexin-A/hypocretin-1, Orx-A) and orexin-B (orexin-B/hypocretin-2, Orx-B) are cleaved from a single gene product, prepro-orexin (Sakurai et al., 1998). Orexins act on two different G-protein coupled receptors: orexin 1 receptor (Orx1R), which selectively binds Orx-A, and orexin 2 receptor (Orx2R), which binds both Orx-A and Orx-B with equal affinity (Sakurai et al., 1998; Sakurai, 2007). Orexin neurons also release other neurotransmitters, such as glutamate onto histaminergic tubero-mammillary neurons (critical for the maintenance of arousal), the inhibitory neuropeptide dynorphin (which also modulates appetite) and pentraxin (a regulator of AMPA receptor clustering; Chou et al., 2001; Reti et al., 2002; Schone et al., 2012). The orexin neurons may integrate a variety of interoceptive and homeostatic signals related to environmental, physiological and emotional stimuli to promote wakefulness and behavioral arousal in response to emotions, stress, hunger and circadian rhythms (Yoshida et al., 2006; Viggiano et al., 2009).
Furthermore, several brain regions involved in the central regulation of autonomic and endocrine processes or attention are targets of extensive orexin projections (Horvath et al., 1999; Chieffi et al., 2014a,b). Neurons containing the neuropeptide orexin send axons to numerous regions throughout the central nervous system; their projections are widely distributed in the brain (Chemelli et al., 1999). These neurons innervate all brain regions known to promote wakefulness and arousal (Saper et al., 2005), including the cerebral cortex, BF, tuberomammillary nucleus (TMN), LC, and dorsal raphe (DR; Peyron et al., 1998; Yoshida et al., 2006). Furthermore, they innervate brain nuclei that regulate motivation and emotions (Sakurai and Mieda, 2011; Thompson and Borgland, 2011; Di Bernardo et al., 2014), and brain regions that regulate motor and autonomic functions (Nattie and Li, 2012). Thus, the orexin system is anatomically well positioned to coordinate many aspects of arousal and attention (Alexandre et al., 2013). Indeed, the orexin neurons are important in the regulation of sleep/wakefulness states, and lack of the peptide or the receptor causes narcolepsy in humans, dogs and mice (Chemelli et al., 1999; Lin et al., 1999; Thannickal et al., 2000).

# OREXIN AND ATTENTION

In attention regulation, orexins play a significant role, likely via interactions with multiple ascending neuromodulatory systems, including dopamine neurons in the ventral midbrain (Vittoz and Berridge, 2006), noradrenergic neurons in the LC (Horvath et al., 1999; Espana et al., 2005) and the basal forebrain cholinergic system (Fadel and Burk, 2010). In the BF, orexin peptides increase cell activity and ACh release, and thus modulate attentional mechanisms (Fadel and Burk, 2010). Attentional deficits present in neurodegenerative and neuropsychiatric conditions such as Alzheimer's disease, schizophrenia, drug addiction, and age-related cognitive decline may be related to alterations in the interactions between orexin neurons and cortical ACh neurons (Fadel and Burk, 2010). An imbalance in orexin regulation may also be involved in pediatric attention-deficit/hyperactivity disorder (ADHD), comprising cognitive alterations, and in narcoleptic and/or obese and/or migrainous subjects as summarized in the Prader-Willi syndrome (Cortese et al., 2008; Carotenuto et al., 2009; Verrotti et al., 2013, 2015a,b; Morandi et al., 2015; Miano et al., 2016). Orexin neuron activity varies with the degree of arousal and is linked to heightened attentional states. Their activity promotes arousal, with maximal discharge during active wakefulness (Lee et al., 2005; Mileykovskiy et al., 2005; Viggiano et al., 2010), while their discharge decreases during quiet waking, in the absence of movement, and they are silent in slow wave sleep and tonic periods of REM sleep, with occasional bursts of activity during REM sleep (Lee et al., 2005; Mileykovskiy et al., 2005). In addition to arousal, orexins promote eating and are likely to have a role in physiological functions such as regulation of blood pressure, the neuroendocrine system, body temperature, and energy homeostasis (Peyron et al., 1998; Hara et al., 2001; Jones, 2003; Messina et al., 2014). Blouin et al. (2013) demonstrated that, in the human brain, Orx-A levels are maximal during positive emotion, social interaction and anger, and increase at wake onset, suggesting that these levels are linked to specific emotions and state transitions.

# NETWORK REGULATION OF OREXIN NEURONS

Orexin neurons are controlled by positive and negative feedback mechanisms mediated by the lateral hypothalamus/perifornical area (LH/PFA; **Figure 2**). Acting through Orx2R, Orx-A or Orx-B forms a positive-feedback loop that opens nonselective cation channels, depolarizes orexin neurons and modulates presynaptic glutamate release (Yamanaka et al., 2010). Indirectly, glutamatergic transmission stimulates orexin neurons through glutamate activation of astrocytes, which release lactate and protons into the extracellular space through monocarboxylate transporters (MCTs; Pellerin et al., 1998; Burt et al., 2011). Furthermore, to sustain physical activity, orexin neurons metabolize astrocyte-derived lactate as an energy source; moreover, the release of protons due to MCT activity causes a local decrease in extracellular pH that can result in depolarization of orexin neurons (Williams et al., 2007). Adenosine triphosphate (ATP), released by astrocytes and neurons, also has an excitatory effect on orexin neurons through the ionotropic P2X receptors (Wollmann et al., 2005). ATP can be hydrolyzed by ectonucleotidases, releasing adenosine into the extracellular space (Wall and Dale, 2008), which inhibits voltage-gated Ca<sup>2+</sup> currents in orexin neurons, leading to their inhibition (Liu and Gao, 2007). Negative feedback pathways have also been identified, for example those involving dynorphin and nociceptin/orphanin FQ (N/OFQ), both co-expressed by orexin neurons (Chou et al., 2001; Maolood and Meister, 2010). Dynorphin attenuates glutamate release by acting on presynaptic excitatory terminals, while N/OFQ inhibits both excitatory and inhibitory transmission (Li and van den Pol, 2006). The balance between the excitatory and inhibitory effects determines the activity levels of the postsynaptic cell (Burt et al., 2011).
In addition, synaptically released glutamate creates a negative feedback loop, acting on presynaptic autoreceptors to inhibit glutamate and GABA release through group III metabotropic glutamate receptors (mGluRs; Acuna-Goycolea et al., 2004). Other distinct neuronal populations in the LH/PFA also form synaptic contacts with orexin neurons, for example neurons expressing melanin-concentrating hormone (MCH) and leptin receptor-expressing (LepRb+) neurons. The MCH neurons form reciprocal connections with orexin neurons and are directly depolarized by orexin A and B, which stimulate presynaptic glutamate release, whereas dynorphin and N/OFQ directly induce hyperpolarization of MCH neurons (Li and van den Pol, 2006). MCH can reduce the presynaptic glutamate release induced by orexin receptors to antagonize the excitatory effects on orexin neurons (Rao et al., 2008). LepRb+ neurons are excited by leptin and use GABA as a neurotransmitter (Leinninger et al., 2009). Leptin inhibits orexin neurons by hyperpolarizing them (Yamanaka et al., 2003), because activation of the LepRb+ neurons inhibits orexin neurons (Burt et al., 2011). Orexin neurons are also innervated by afferents of non-cholinergic terminals from the BF cholinergic cell area. BF glutamatergic neurons can excite orexin neurons involved in arousal, whereas GABAergic neurons can inhibit orexin neurons, promoting behavioral quiescence and sleep (Henny and Jones, 2006). In summary, different peptides released by orexin neurons or by distinct populations of LH/PFA neurons modulate orexin neurons and exert different excitatory and inhibitory influences during wake or sleep states. Other studies showed that LC neurons have an important role in waking and REM sleep (REMS; Mallick et al., 2012; Choudhary et al., 2014). Kumar et al. (2012) constructed a mathematical model of waking, NREMS and REMS, showing the importance of orexinergic neurons in stabilizing the wake-sleep cycle and demonstrating that even small changes in inputs to or from those neurons can have a large impact on the ensuing dynamics. The results from this model help to understand the neural mechanisms of regulation and the pathophysiology of REMS.

neurons. (G) Leptin receptor-expressing GABAergic neurons are excited by leptin and use GABA as a neurotransmitter. Leptin indirectly inhibits orexin neurons by activating these inhibitory LepRb+ neurons. In summary, the balance between the excitatory and inhibitory effects determines the activity levels of the orexin neurons.

# MODULATION OF THE BASAL FOREBRAIN CHOLINERGIC SYSTEM BY OREXIN NEURONS: EFFECTS ON ATTENTION

# Orexin Receptors in the Basal Forebrain

Orexin neurons have widespread projections to the basal forebrain that may promote arousal by activating the cortex. Orexin neurons also project onto BF cholinergic neurons and release orexins in the BF. Both Orx1R and Orx2R are expressed in the BF and can activate cholinergic afferents (Marcus et al., 2001); narcoleptic dogs lack Orx2R (Lin et al., 1999). However, in vitro electrophysiological studies and BF orexin administration studies have produced conflicting results as to which orexin receptor subtypes are involved in the activation of cholinergic fibers. In vitro electrophysiological data indicate that both Orx-A and Orx-B can excite BF cholinergic cells, and that their effects are primarily Orx2R-mediated (Eggermann et al., 2001; Gotter et al., 2016). On the other hand, other studies suggested that the effects of orexin administration in the BF are primarily Orx1R-mediated (Espana et al., 2001). Using mice lacking orexin receptors, Alexandre et al. (2012) found that focal restoration of Orx1R and Orx2R in the substantia innominata (SI) partially rescued their ability to produce long bouts of wakefulness. Furthermore, Boschen et al. (2009) blocked Orx1Rs in rats by administering the Orx1R antagonist SB-334867 before performance of a two-lever sustained attention task. Their results showed that Orx1R blockade decreased accuracy in attention-demanding tasks and that some of these effects on attention may be mediated by BF corticopetal neurons. In summary, the two receptors may play different and complementary roles in response to varying types of homeostatic challenges (Fadel and Frederick-Duus, 2008).

Glut, glutamate; (+), stimulation; (−), inhibition.

# Orexin Activation of the Basal Forebrain

Several in vitro studies have examined how orexin neurons activate the BF, focusing primarily on the effects of orexins on medial septum (MS) neurons that project to the hippocampus and on the cortically projecting neurons of the BF. In the MS, orexins directly excite septo-hippocampal cholinergic neurons through activation of the sodium–calcium exchanger and inhibition of potassium channels, presumably an inward rectifier, increasing hippocampal acetylcholine release and promoting arousal (Wu et al., 2004). Orx-A excites BF cholinergic neurons, inducing cortical release of acetylcholine and increasing attention (see **Figure 3**; Arrigoni et al., 2010). Cortical acetylcholine levels increase even more under demanding attention tasks (Hasselmo and McGaughy, 2004), and orexin neurons increase firing to promote arousal and during exploratory behaviors in response to salient external stimuli (Mileykovskiy et al., 2005). It is also important to consider that local application of orexins to the BF promotes wakefulness and improves cognitive performance. In fact, the administration of orexins into the BF excites cholinergic neurons that release acetylcholine in the cerebral cortex and thereby promotes wakefulness (Eggermann et al., 2001; Espana et al., 2001; Fadel et al., 2005). Within the prefrontal cortex, orexins can also directly improve attentional processes relevant to executive aspects of attention. Lambe et al. (2005) demonstrated that infusions of Orx-B into the prefrontal cortex improved accuracy under high attentional demand by exciting the same thalamocortical synapses that are activated by acetylcholine from the BF. Thus, through an increased cortical acetylcholine release and a direct action on thalamocortical projections, orexins may promote cortical activation and attention. Orexin A can also modulate cholinergic neuron activity indirectly, because it increases local glutamate release within the basal forebrain.
Indeed, Fadel and Frederick-Duus (2008) demonstrated that the administration of Orx-A in the BF increases local glutamate efflux. Furthermore, via an excitatory autoreceptor mechanism, Orx-A might increase BF glutamate release. Non-cholinergic neurons of the BF may also be excited by orexins, for example most of the GABAergic neurons of the BF (Fadel and Frederick-Duus, 2008; Arrigoni et al., 2010). In fact, orexin excites GABAergic neurons of the MS that project to the hippocampus (Wu et al., 2004) and GABAergic neurons of the magnocellular preoptic nucleus and substantia innominata (MCPO/SI; Blanco-Centurion et al., 2006). Blanco-Centurion et al. (2006) studied the relative contribution of non-cholinergic neurons to arousal; they found that after selective lesion of the basal forebrain cholinergic neurons in rats, microinjection of Orx-A into the BF still increased waking and promoted arousal. This finding indicates that cholinergic neurons are not essential for the effects of Orx-A and thus suggests that Orx-A also acts on non-cholinergic neurons (Blanco-Centurion et al., 2006). These findings suggest that orexins may contribute to attentional processing in the BF, without excluding the possibility that other neural circuits outside the BF contribute to these effects (Boschen et al., 2009).

# Orexin, Circulating Factors and Basal Forebrain

Orexin neurons are responsive to circulating factors related to metabolic state, for example low plasma glucose, and are activated by food deprivation (Cai et al., 1999). It has been suggested that these neurons can provide a crucial regulation of arousal level in response to signals of energy balance, such as blood glucose, leptin, and food intake (Yamanaka et al., 2003; Esposito et al., 2014; Messina et al., 2014). Furthermore, Frederick-Duus et al. (2007) demonstrated that food-restricted rats given the toxin orexin B–saporin (which produces loss of orexin neurons in the LH/PFA) no longer showed the cholinergic response to presentation of palatable food. Moreover, animals pretreated with the Orx1R antagonist SB-334867 show an increased feeding latency, demonstrating that Orx1R activity is required for an appetitive food stimulus to increase cortical ACh release. Fadel and Frederick-Duus suggest that orexins are necessary for the activation of BF cholinergic neurons in response to a food-related stimulus and that they are important for biasing the allocation of attentional resources toward cues related to physiological status (Fadel and Frederick-Duus, 2008).

# Influence of Other Orexin Neuron Neurotransmitters on the Basal Forebrain

To better understand how orexin-producing neurons promote cortical activation, some studies have also focused on the neuropeptide dynorphin, which is synthesized in orexin neurons and has specific effects on different classes of BF neurons. Cholinergic neurons do not respond to dynorphin but are directly excited by Orx-A; in addition, there are two populations of non-cholinergic BF neurons. One of these populations is excited by Orx-A and does not respond to dynorphin; the other, a population of non-cholinergic sleep-promoting neurons, is inhibited by dynorphin but does not respond to Orx-A (Arrigoni et al., 2010). Therefore, the co-release of orexins and dynorphin can activate a synergistic mechanism that excites cholinergic and non-cholinergic wake-active neurons and inhibits non-cholinergic sleep-active neurons, promoting attention and improving cognitive performance. In addition to dynorphin, orexin neurons also co-release glutamate, which acts synergistically to excite the BF via presynaptic glutamatergic mechanisms (Arrigoni et al., 2010; Fadel and Burk, 2010). Orexin neurons also express the neuronal activity-regulated pentraxin (Narp), which is involved in clustering of glutamatergic α-amino-3-hydroxy-5-methyl-4-isoxazolepropionic acid (AMPA) receptors (Reti et al., 2002) and may potentiate pre- or post-synaptic responses to glutamate (Arrigoni et al., 2010). Although more studies are needed to understand the role of dynorphin, glutamate and Narp, targeted deletion of orexin seems to produce functional deficits that differ from those induced by selective ablation of orexin neurons (Chemelli et al., 1999; Hara et al., 2001; Reti et al., 2002), suggesting that other secreted signaling molecules expressed in these neurons are involved in their effects.

# Involvement of the Orexin-Basal Forebrain Interactions in Narcolepsy

Narcolepsy is a disease characterized by excessive daytime sleepiness, sleep paralysis, instability of sleep onset and REM periods, and cataplexy (Weinhold et al., 2014). Reduced orexinergic function, due to a reduction of orexin peptides, orexin neurons, or orexin receptors, is assumed to be a major cause of narcolepsy, as clearly demonstrated by post mortem studies (Arrigoni et al., 2010). Neuropsychological impairments have been found in narcoleptic patients, for example reduced performance in attention-demanding tasks (Fulda and Schulz, 2001). Some studies suggest that narcoleptic patients show deficits in attention even during normal wakefulness periods (Rieger et al., 2003). Furthermore, BF degeneration is associated with canine narcolepsy, suggesting that a deficit in orexin stimulation may cause postsynaptic degeneration and, in turn, impaired ACh-dependent cognitive function (Siegel et al., 1999; Fadel and Frederick-Duus, 2008; Monda et al., 2014). In human narcolepsy there are deficits in selective processing of relevant stimuli. It is possible that these deficits are due to a reduction in BF cholinergic signaling to the cortex (Fadel and Burk, 2010). Weinhold et al. (2014) investigated the effect of intranasal administration of Orx-A on sleep behavior and cognitive functions in narcoleptic patients with cataplexy. In the test of divided attention, these patients showed enhanced performance after Orx-A administration, as indicated by their mean reaction time and fewer false reactions. Their results confirmed the role of Orx-A as a REM sleep stabilizing factor and provided functional evidence for the effects of Orx-A on attention in narcolepsy with cataplexy (Weinhold et al., 2014). Other studies demonstrated that intranasal administration of Orx-A in sleep-deprived rhesus monkeys can relieve the cognitive deficits produced by the loss of sleep (Deadwyler et al., 2007).
Moreover, it has been suggested that nasal Orx-A administration may be an effective approach to the treatment of orexin deficiency in narcolepsy (Peyron et al., 2000; Thannickal et al., 2000).

# CONCLUSION

Collectively, the available data strongly support the hypothesis that orexin stimulation of the BF is able to promote cortical activation and attention by acting on cholinergic and non-cholinergic neurons in response to salient stimuli. In fact, orexins excite cholinergic neurons, and the resulting increase in acetylcholine release within the cerebral cortex contributes to the cortical activation associated with attention. We have reviewed evidence suggesting that the BF may be a key target through which orexin neurons promote attention, even if many questions remain to be answered. Determining whether orexin signaling in the BF is sufficient to maintain attention, and characterizing the interaction between orexin and dynorphin within the BF, should provide novel data to explain the role of orexin in several aspects of attention.

# AUTHOR CONTRIBUTIONS

IV, AM, MC and AVa carried out the study; AVi, FM, TE, VM, ME and FP participated in the design of the study; SC, GC, MM and GM participated in the design and coordination and helped to draft the manuscript. All authors read and approved the final manuscript.

# ACKNOWLEDGMENTS

This Review was financially supported by University of Foggia 5 × 1000 IRPEF funds in memory of Gianluca Montel.

# REFERENCES

basal forebrain arousal-related structures. J. Comp. Neurol. 481, 160–178. doi: 10.1002/cne.20369


Jones, B. E. (2003). Arousal systems. Front. Biosci. 8, s438–s451. doi: 10.2741/1074

Jones, B. E., and Cuello, A. C. (1989). Afferents to the basal forebrain cholinergic cell area from pontomesencephalic-catecholamine, serotonin and acetylcholine-neurons. Neuroscience 31, 37–61. doi: 10.1016/0306-4522(89)90029-8


**Conflict of Interest Statement**: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Villano, Messina, Valenzano, Moscatelli, Esposito, Monda, Esposito, Precenzano, Carotenuto, Viggiano, Chieffi, Cibelli, Monda and Messina. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution and reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Event-Related Potentials Altered in Patients with Borderline Personality Disorder during Working Memory Tasks

Ying Liu<sup>1†‡</sup>, Mingtian Zhong<sup>2†‡</sup>, Chang Xi<sup>1</sup>, Xinhu Jin<sup>1</sup>, Xiongzhao Zhu<sup>1,3</sup>, Shuqiao Yao<sup>1,3</sup> and Jinyao Yi<sup>1,3</sup>\*

<sup>1</sup>Medical Psychological Center, Second Xiangya Hospital, Central South University, Changsha, China, <sup>2</sup>Center for Studies of Psychological Application, School of Psychology, South China Normal University, Guangzhou, China, <sup>3</sup>Medical Psychological Institute, Central South University, Changsha, China

Whereas some studies have demonstrated impaired working memory (WM) among patients with borderline personality disorder (BPD), these findings have not been consistent. Furthermore, there is a lack of neurophysiological evidence about WM function in patients with BPD. The goal of this study was to examine WM function in patients with BPD by using event-related potentials (ERPs). An additional goal was to explore whether characteristics of BPD (i.e., impulsiveness and emotional instability) are associated with WM impairment. A modified version of the N-back task (0- and 2-back) was used to measure WM. ERPs were recorded in 22 BPD patients and 21 age-, handedness-, and sex-matched healthy controls (HCs) while they performed the WM task. The results revealed that there were no significant group differences for behavioral variables (reaction time and accuracy rate) or for latencies and amplitudes of P1 and N1 (all p > 0.05). BPD patients had lower P3 amplitudes and longer N2 latencies than HC, independent of WM load (low load: 0-back; high load: 2-back). Impulsiveness was not correlated with N2 latency or P3 amplitude, and no correlations were found between N2 latency or P3 amplitude and affect intensity scores in any WM load (all p > 0.05). In conclusion, the lower P3 amplitudes and longer N2 latencies in BPD patients suggested that they might have some dysfunction of neural activities in sub-processing in WM, while impulsiveness and negative affect might not have a close relationship with these deficits.

Keywords: borderline personality disorder, working memory, event-related potential, N-back task, workload

#### Edited by:

Nicholas Morton, Tickhill Road Hospital, UK

#### Reviewed by:

Alexander Easton, Durham University, UK; Johannes Schiebener, University of Duisburg-Essen, Germany

#### \*Correspondence:

Jinyao Yi jinyaoyi2001@163.com

†These authors have contributed equally to this work. ‡Co-first authors.

Received: 10 October 2016; Accepted: 03 April 2017; Published: 18 April 2017

#### Citation:

Liu Y, Zhong M, Xi C, Jin X, Zhu X, Yao S and Yi J (2017) Event-Related Potentials Altered in Patients with Borderline Personality Disorder during Working Memory Tasks. Front. Behav. Neurosci. 11:67. doi: 10.3389/fnbeh.2017.00067

# INTRODUCTION

Borderline personality disorder (BPD) is a serious mental disorder that is characterized by a pervasive pattern of instabilities in affect regulation, impulse control, interpersonal relationships, and self-image (van Zutphen et al., 2015; Chanen and Thompson, 2016). Clinical theoreticians and researchers have proposed that the symptoms and behaviors of BPD are associated, at least in part, with disruptions in basic neurocognitive processes, and those neurocognitive impairments may moderate development of BPD (Judd, 2005; Fertuck et al., 2006). Neurocognitive deficits associated with BPD include dysfunctions in attention, concentration, memory and executive functions, such as impulse control, planning and problem solving (Judd, 2005). These dysfunctions might result from a faulty allocation of processing resources (Bazanis et al., 2002), which would be exemplified by a deficit in the efficacy of the central executive component of human working memory (WM; Oberauer et al., 2003).

WM, which supports online maintenance and manipulation of information (Baddeley, 2003), consists of three subcomponents (central executive, phonological storage and spatial information), and provides attention control over other cognitive abilities. However, only a few studies have considered WM in BPD, and the results have been inconsistent. For example, whereas some studies found a WM deficit in BPD populations (O'Leary, 2000; Stevens et al., 2004; Lazzaretti et al., 2012), others did not (Sprock et al., 2000; Judd, 2005; Gvirts et al., 2012). When normal populations engage in a WM task, the parietal cortex and dorsolateral prefrontal cortex are activated (Curtis, 2006; Fang et al., 2016; Ng et al., 2016). The function of the dorsolateral prefrontal cortex is abnormal in BPD patients (Lis et al., 2007; Rossi et al., 2015; Krause-Utz et al., 2016). Thus, WM deficits observed in BPD patients might implicate dysfunction in the dorsolateral prefrontal cortex or parietal cortex. However, there is insufficient direct neurophysiological evidence of WM impairment in BPD patients. Previous studies relied on accuracy rate and reaction time and therefore explored only in general terms whether WM deficits exist in BPD patients, without examining possible deficits in WM subcomponents. Moreover, some of these studies did not control for the influence of medication or comorbidity with other disorders, and so could not establish the true features of WM in BPD.

Event-related potentials (ERPs) measure electrical brain activity that is time-locked to sensory or cognitive events. ERPs elucidate the time course of ongoing brain activity during WM tasks and reflect the spatiotemporal sequence of cortical information processing (Kayser et al., 2006). ERPs are superior to behavioral or other neuroimaging measures, the latter of which have poor temporal resolution, when seeking information about the cognitive processing stages that contribute to WM abnormalities in BPD. Among the ERP components, N2 and P3 amplitudes have been reliably associated with WM function (Kim et al., 2014; Stroux et al., 2016). N2, a negative component typically elicited between 200 ms and 350 ms poststimulus, reflects retrieval of memory representations and perceptual comparisons (Patel and Azzam, 2005; Folstein and Van Petten, 2008). N2 is assumed to be an index of interference control (Donkers and van Boxtel, 2004; Folstein and Van Petten, 2008) and conflict monitoring/resolution. P3, a positive component typically elicited between 300 ms and 600 ms poststimulus, is associated with the general processes of attention control and stimulus categorization/evaluation (Bledowski et al., 2004; Rueda et al., 2004; Neuhaus et al., 2010a,b; Dai and Feng, 2011; Rossi et al., 2015).

Some studies have investigated N2 and P3 in subjects with BPD. For example, Houston et al. (2004) reported that adolescents who had BPD characteristics exhibited decrements in P3 amplitude. Ruchsow et al. (2008) recently found that BPD patients showed reduced P3 amplitude in a Go/Nogo task. However, there have been no studies investigating the N2/P3 characteristics in WM tasks in BPD patients. Several studies have focused on the performances of BPD patients in tasks with variable complexity or cognitive load. Stevens et al. (2004) reported that WM was impaired in BPD subjects, but WM function did not worsen when cognitive load was increased. Lazzaretti et al. (2012) only observed WM deficit in BPD patients when WM demands were high.

In light of this background, we recruited BPD patients and used the N2 and P3 components of ERPs to investigate the WM mechanism of BPD patients while they performed an N-back task, controlling for the effects of medication and comorbidity. Early sensory processing is the foundation of more complex cognition, and it influences WM performance (Tek et al., 2002). To examine whether there are early sensory defects in BPD, we recorded P1 and N1, which might be related to the early sensory stages of information processing (Thomas et al., 2013). In addition, impulsiveness and emotional instability are two core characteristics of BPD (Berlin et al., 2005; Domes et al., 2006; Jacob et al., 2010), and the degree of instability of these characteristics may be correlated with cognitive impairment (Ruchsow et al., 2008; Svaldi et al., 2012; Hagenhoff et al., 2013). Therefore, we also examined whether impulsiveness and emotional instability influence the WM performance of BPD patients.

# MATERIALS AND METHODS

# Participants

Two professional psychiatrists diagnosed patients with BPD using Diagnostic and Statistical Manual of Mental Disorders-IV (DSM-IV) criteria (Maffei et al., 1997). Subjects were excluded from this study if they had schizophrenia, schizoaffective disorder, attention-deficit hyperactivity disorder (ADHD), delusional (paranoid) disorder, bipolar disorder, psychotic disorder, hypothyroidism, or seizure disorder, or any history of head injury, neurosurgery, or substance abuse. Patients with other Axis I/II disorders were also excluded. Patients were not taking any medication at the time of enrollment. The BPD group included 22 right-handed young subjects (14 males, 8 females; age: 22–27 years).

The control group comprised 21 age-, education- and handedness-matched healthy subjects (10 males, 11 females; age: 22–25 years). All subjects in the control group were interviewed by two professional psychiatrists to rule out BPD according to DSM-IV criteria, as well as other Axis I/II disorders. Criteria for inclusion in the control group were as follows: no current medical problems, no history of substance or alcohol abuse, and no history of psychiatric disorders among first-degree relatives.

All subjects had normal or corrected-to-normal vision. This study was carried out in accordance with the recommendations of the ethics committee of Central South University, which approved the protocol. Every subject signed an informed consent form in accordance with the Declaration of Helsinki. In the case of BPD patients, this consent form was also signed by a well-informed relative.

# Psychometric Instruments

# Center for Epidemiological Studies Depression Scale (CES-D)

The CES-D was used to assess depressive levels of participants. This scale consists of 20 items, each of which is assigned a value from 1 to 4. Four items are positively worded and reverse-scored. The total score is computed by summing the 20 items, such that the range of scores is 20–80 (Natamba et al., 2014). The Chinese version of the CES-D has high internal consistency and construct validity (Xiao et al., 2016).
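The scoring rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `score_ces_d` is hypothetical, and the positions of the four positively worded items (4, 8, 12 and 16) are assumed here because the text does not list them.

```python
# Assumed 1-based positions of the four positively worded (reverse-scored)
# items. The text states only that there are four such items.
REVERSED_ITEMS = frozenset({4, 8, 12, 16})


def score_ces_d(responses, reversed_items=REVERSED_ITEMS):
    """Return the CES-D total for 20 responses coded 1-4 (range 20-80)."""
    if len(responses) != 20:
        raise ValueError("CES-D requires exactly 20 responses")
    total = 0
    for item_number, value in enumerate(responses, start=1):
        if value not in (1, 2, 3, 4):
            raise ValueError(f"item {item_number}: response must be 1-4")
        # Reverse-score positively worded items: 1 <-> 4, 2 <-> 3.
        total += (5 - value) if item_number in reversed_items else value
    return total
```

With this coding, the lowest possible total (20) corresponds to answering 1 on every negatively worded item and 4 on every positively worded item.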

# Barratt Impulsiveness Scale—11th Version (BIS-11)

The BIS-11 is one of the most widely used measures of impulsive personality traits. The 30 items are rated from 1 (rarely/never) to 4 (almost always/always). Items are summed to determine the overall impulsiveness score, with higher scores indicating greater impulsivity (Patton et al., 1995). The Chinese translation of the BIS-11 shows sufficient reliability and validity (Yao et al., 2007).

# Childhood Trauma Questionnaire (CTQ)

The CTQ is a self-administered questionnaire that addresses childhood trauma in the following five areas: physical abuse, emotional abuse, sexual abuse, physical neglect and emotional neglect. The short form of this questionnaire includes 28 items that are scored on a five-point Likert scale ranging from 1 (never true) to 5 (very often true). There are five questions for each type of trauma, scored 5–25, with an additional three questions to assess minimization/denial (Lee et al., 2015). The Chinese version of the CTQ has good reliability and validity (Zhao et al., 2005).

# Short Affect Intensity Scale (SAIS)

The SAIS has 20 items that are scored on a six-point Likert scale ranging from 1 to 6 (Geuens and De Pelsmacker, 2002). Three factors are analyzed with this scale: positive intensity (8 items), negative intensity (6 items), and serenity (6 items). There are no total scale scores. The score for each factor is a mean score of the items in that factor; consequently, the scores for individual factors range from 1 to 6. The Chinese version of the SAIS has good reliability and validity (Zhong et al., 2010).
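As an illustration of factor-mean scoring, the sketch below computes the three SAIS factor scores. The item-to-factor key is hypothetical: only the item counts (8, 6 and 6) come from the description above, and the actual assignment of items to factors is not given in the text.

```python
from statistics import mean

# Hypothetical 1-based item key. Only the counts per factor (8/6/6) are
# taken from the text; which items belong to which factor is an assumption.
SAIS_FACTORS = {
    "positive_intensity": range(1, 9),
    "negative_intensity": range(9, 15),
    "serenity": range(15, 21),
}


def score_sais(responses, factors=SAIS_FACTORS):
    """Return the mean score (1-6) for each SAIS factor; no total score."""
    if len(responses) != 20:
        raise ValueError("SAIS requires exactly 20 responses")
    if any(not 1 <= r <= 6 for r in responses):
        raise ValueError("responses must lie between 1 and 6")
    return {
        name: mean(responses[i - 1] for i in items)
        for name, items in factors.items()
    }
```

Because each factor score is a mean rather than a sum, factors with different numbers of items remain on the same 1–6 scale.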

# Stimulation and Task Procedures

A modified version of the N-back task was used to measure WM (Blokland et al., 2008; Baller et al., 2013). Participants performed a 0-back task and a 2-back task. As shown in **Figure 1**, the numbers (1–4) of the N-back task appeared in fixed positions, each in one of four large white circles. The circles were positioned at the corners of a diamond-shaped square on a gray screen background. Stimuli were presented by using E-prime 2.0. Using their right index or middle finger, participants pressed one of four buttons to match the target stimulus. In this study, the 0-back task required a simple button press in response to the number displayed, while the 2-back task required participants to press the key corresponding to the number presented two trials before the current one. The 2-back task therefore requires on-line monitoring, updating and manipulation of remembered information and is assumed to place great demands on a number of key processes within WM (Glahn et al., 2005; Owen et al., 2005). Unlike the traditional N-back task, in which participants judge whether the current number matches the n-back one, our task required participants to press one of four buttons to match the n-back one. Participants therefore had only a 25% chance of guessing correctly, which made the task more difficult than the traditional one.
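The response rule of this task can be made concrete with a short sketch. The function names are hypothetical, and the treatment of the first two trials of a 2-back block (left unscored here, since no 2-back stimulus exists yet) is an assumption; the text does not say how those trials were handled.

```python
def expected_responses(stimuli, n):
    """Return the correct button (1-4) for each trial of an n-back block.

    For n = 0 the correct button is the displayed number itself; for n = 2
    it is the number shown two trials earlier. Trials without an n-back
    stimulus get None (assumed unscored).
    """
    return [stimuli[i - n] if i >= n else None for i in range(len(stimuli))]


def accuracy(responses, expected):
    """Proportion of scored trials on which the pressed button was correct."""
    scored = [(r, e) for r, e in zip(responses, expected) if e is not None]
    if not scored:
        raise ValueError("no scorable trials")
    return sum(r == e for r, e in scored) / len(scored)
```

For example, for the stimulus sequence `[1, 3, 2, 4, 1]` in the 2-back condition, the correct presses on trials 3 to 5 are 1, 3 and 2. With four equally likely buttons, a random press matches the correct one with probability 1/4, which is the 25% guessing rate mentioned above.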

There were 16 blocks in total, 8 per condition. Each block consisted of 16 trials with a stimulus presentation time of 200 ms and an interstimulus interval of 800 ms. Subjects had a 16-s rest interval after finishing each block. The total experimental length was 8.2 min (492 s).

# Electrophysiological Recording and Analysis

As participants completed the N-back task, continuous EEG signals were acquired with a 64-channel electroencephalographic system (EEG Nuamps, NeuroScan, Inc., El Paso, TX, USA). Electrodes were placed by using a 10/20 extended QuikCap system (NeuroScan, Inc., El Paso, TX, USA). The reference was placed at the vertex by default, and the data were re-referenced offline to the averaged mastoids. Impedance values were kept lower than 5 kΩ for all electrodes. Horizontal electrooculograms (EOGs) were recorded with electrodes placed on the bilateral external canthi. Vertical EOGs were recorded from electrodes placed above and below the left eye. EEG data were sampled at 1000 Hz and analyzed offline with a 30-Hz low-pass filter. Trials with undesired eye movements and eye blink artifacts were removed from analysis by a semiautomatic and manual block rejection procedure.

The continuous EEG was subsequently segmented beginning at 200 ms before stimulus onset and lasting for 800 ms. The baseline for ERP analysis was the 100 ms before the appearance of the target stimuli. Individual segments were excluded if the absolute voltage in any channel exceeded 100 µV. In each subject, artifact-free trials were averaged for each task (0-back/2-back) to obtain the corresponding ERP waves. Subjects with fewer than 30 epochs for each task were excluded. All analyses were performed using Scan 4.3 and Curry 7.0 (NeuroScan, Inc., El Paso, TX, USA).
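The segmentation, baseline correction and rejection steps can be illustrated with a minimal NumPy sketch. It is not the Scan/Curry procedure used in the study. The function name is hypothetical, the epoch is assumed to span 800 ms in total (from −200 ms to +600 ms), and a segment is assumed to be rejected when any channel exceeds the threshold.

```python
import numpy as np

FS = 1000            # sampling rate in Hz (from the text)
PRE_MS = 200         # epoch starts 200 ms before stimulus onset
EPOCH_MS = 800       # assumed total epoch length
BASELINE_MS = 100    # baseline: the 100 ms preceding stimulus onset
REJECT_UV = 100.0    # absolute-voltage rejection threshold in microvolts


def average_erp(eeg, onsets):
    """Average artifact-free epochs.

    eeg: array of shape (n_channels, n_samples), in microvolts.
    onsets: stimulus onset positions as sample indices.
    Returns (erp, n_kept); erp is None when no epoch survives.
    """
    pre = PRE_MS * FS // 1000
    length = EPOCH_MS * FS // 1000
    base = BASELINE_MS * FS // 1000
    kept = []
    for onset in onsets:
        start = onset - pre
        if start < 0 or start + length > eeg.shape[1]:
            continue  # epoch would run off the recording
        epoch = eeg[:, start:start + length].astype(float)
        # Subtract each channel's mean over the 100 ms before onset.
        epoch -= epoch[:, pre - base:pre].mean(axis=1, keepdims=True)
        if np.abs(epoch).max() > REJECT_UV:
            continue  # at least one channel exceeds the threshold
        kept.append(epoch)
    if not kept:
        return None, 0
    return np.mean(kept, axis=0), len(kept)
```

The returned epoch count can be compared against the 30-epoch minimum described in the text.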

Three electrode positions (frontal: Fz; central: Cz; parietal: Pz) were chosen for statistical analyses of N2 and P3. Occipital electrodes (O1, Oz and O2) were selected for P1 and N1 because these components are usually maximal at these electrodes. ERP waves were analyzed in terms of peak latency and baseline-to-peak amplitude, as determined by visual inspection. Latency ranges for potentials were designated as follows: 50–150 ms for P1, 140–200 ms for N1, 200–400 ms for N2, and 250–500 ms for P3.
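The latency windows above can be combined with a simple automated peak search, sketched below. The study determined peaks by visual inspection, so this is only a stand-in. The function name and the polarity table are assumptions introduced for illustration.

```python
import numpy as np

# Latency windows in ms, taken from the text.
WINDOWS_MS = {"P1": (50, 150), "N1": (140, 200), "N2": (200, 400), "P3": (250, 500)}
# Positive components are searched for a maximum, negative ones for a minimum.
POLARITY = {"P1": 1, "N1": -1, "N2": -1, "P3": 1}


def find_peak(erp, times_ms, component):
    """Return (peak latency in ms, baseline-to-peak amplitude) for one channel.

    erp: 1-D baseline-corrected waveform; times_ms: matching time axis.
    """
    lo, hi = WINDOWS_MS[component]
    mask = (times_ms >= lo) & (times_ms <= hi)
    idx = np.argmax(erp[mask] * POLARITY[component])
    return float(times_ms[mask][idx]), float(erp[mask][idx])
```

Since the waveform is already baseline-corrected, the value at the peak is the baseline-to-peak amplitude.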

# Data Analysis

We used SPSS 17.0 to analyze all data. Demographic data of participants were analyzed using a Chi-Square test or t-test. Reaction time and accuracy data were analyzed by 2 (group: BPD vs. control) × 2 (WM load: 0-back vs. 2-back) repeated-measures ANOVA with WM load as a within-subject factor and group as a between-subject factor. ERP components (latency and amplitude) were analyzed by 2 (group: BPD vs. healthy control (HC)) × 2 (WM load: 0-back vs. 2-back) × 3 (electrodes: O1, Oz, O2/Fz, Cz, Pz) repeated-measures ANOVA with WM load and electrodes as within-subject factors, and with group as a between-subject factor. The Greenhouse–Geisser correction was used to correct compound symmetry violations in the ANOVAs. Correlations between psychological measures (impulsiveness and different affect intensity) and ERP components were calculated by Pearson's correlation. Where appropriate, Cohen's d and η<sup>2</sup> were calculated as indices of effect size.
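The text does not state which formulas were used for the effect sizes. The sketch below uses two common definitions as an assumption: Cohen's d with the root mean square of the two SDs as the denominator, and partial η<sup>2</sup> recovered from an F statistic and its degrees of freedom. The function names are hypothetical.

```python
import math


def cohens_d(mean_a, sd_a, mean_b, sd_b):
    """Cohen's d using the root mean square of the two SDs (assumed formula)."""
    pooled_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    return abs(mean_a - mean_b) / pooled_sd


def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic (assumed formula)."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)
```

With the values reported in the Results, `cohens_d(102.20, 14.25, 110.06, 16.47)` rounds to 0.51 (P1 latency, O2 vs. O1 under low load) and `partial_eta_squared(312.38, 1, 41)` rounds to 0.88 (WM load effect on accuracy). Both agree with the published figures, although this agreement does not confirm that these were the formulas actually applied.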

# RESULTS

# Clinical Data

Clinical data are reported in **Table 1**. Higher scores for impulsiveness (t = 3.94, p < 0.001) and depression (t = 5.22, p < 0.001) were obtained for BPD patients compared to HCs. BPD patients had higher scores on CTQ subscales of emotional abuse (t = 2.20, p = 0.034), emotional neglect (t = 2.14, p = 0.040), and physical neglect (t = 3.68, p < 0.001), as well as on the SAIS subscale of negative intensity (t = 3.40, p < 0.001).


Age, CES-D, BIS, CTQ and SAIS are expressed as mean (SD); CES-D, The Center for Epidemiological Studies Depression Scale; BIS, The Barratt Impulsiveness Scale, 11th version; CTQ, The Childhood Trauma Questionnaire-Short Form; SAIS, The Short Affect Intensity Scale.

TABLE 2 | Reaction time [ms] and performances [%] for N-back tasks.


Reaction time and performance are expressed as mean (SD).

# Behavioral Results

Mean reaction time and accuracy data for both groups under each condition are reported in **Table 2**. For mean reaction time, there were no significant group differences (F(1,41) = 0.001, p = 0.979), no significant effects of WM load (F(1,41) = 0.79, p = 0.380), and no significant Group × WM load interaction (F(1,41) = 0.22, p = 0.644). However, WM load had a significant effect on accuracy (F(1,41) = 312.38, p < 0.001, η<sup>2</sup> = 0.88). Accuracy was higher under low WM load (97.7% ± 3.5%) than under high WM load (60.9% ± 12.3%), but no significant group difference (F(1,41) = 0.21, p = 0.652) or Group × WM load interaction (F(1,41) = 0.04, p = 0.848) was observed for accuracy.

# ERP Components

Mean amplitudes and latencies of P1, N1, N2 and P3 for each WM load are shown for both groups in **Table 3**. P1 and N1 grandaverage ERPs are shown in **Figure 2**. N2 and P3 grand-average ERPs are shown in **Figure 3**.

#### P1 Amplitude

The main effect of WM load on P1 amplitude was significant (F(1,41) = 18.38, p < 0.001, η<sup>2</sup> = 0.31). P1 amplitude was more positive under low WM load (4.16 ± 3.15 µV) than under high WM load (2.67 ± 2.56 µV). The effect of electrodes was also significant (F(2,82) = 14.06, p < 0.001, η<sup>2</sup> = 0.26); the amplitude was smallest at Oz (2.61 ± 2.49 µV). The group difference in P1 amplitude and the interactions of WM load × Electrodes, WM load × Group, Electrodes × Group, and WM load × Group × Electrodes were not significant (all F < 2.60, p > 0.08).

#### P1 Latency

The repeated-measures ANOVA on P1 latency showed a significant main effect of electrodes (F(2,41) = 10.52, p < 0.001, η<sup>2</sup> = 0.20) and a significant WM load × Electrodes interaction (F(2,41) = 8.58, p < 0.001, η<sup>2</sup> = 0.17). Further analysis revealed that, under low WM load, O2 had a shorter latency (102.20 ± 14.25 ms) than O1 (110.06 ± 16.47 ms, p < 0.001, Cohen's d = 0.51) or Oz (108.87 ± 21.19 ms, p = 0.017, Cohen's d = 0.37). Under high WM load, Oz had a shorter latency (102.31 ± 20.22 ms) than O1 (110.40 ± 17.33 ms, p < 0.001, Cohen's d = 0.43) or O2 (106.48 ± 16.34 ms, p = 0.016, Cohen's d = 0.23). The group difference, main effect of WM load and remaining interactions were not significant (all F < 1.09, p > 0.336).

#### N1 Amplitude

The ANOVA conducted on N1 amplitude revealed significant main effects of WM load (F(1,41) = 27.03, p < 0.001, η<sup>2</sup> = 0.40) and electrodes (F(2,82) = 12.18, p < 0.001, η<sup>2</sup> = 0.23). The interaction of WM load × Electrodes was also significant (F(2,82) = 5.47, p = 0.011, η<sup>2</sup> = 0.12). Post hoc analysis showed that the N1 amplitude was smallest at O1 under both low (−2.58 ± 3.47 µV) and high (−4.19 ± 3.39 µV) WM load. The group difference and the remaining interactions were not significant (all F < 2.01, p > 0.152).

#### N1 Latency

Electrodes exhibited a main effect on N1 latency (F(2,82) = 4.66, p = 0.015, η<sup>2</sup> = 0.10). The N1 latency at Oz (149.26 ± 23.31 ms) was shorter than the N1 latency at O1 (156.39 ± 20.41 ms, p = 0.001, Cohen's d = 0.33) or O2 (154.86 ± 17.36 ms, p = 0.040, Cohen's d = 0.27), but there was no significant difference between O1 and O2 (p = 0.572). The group difference, main effect of WM load and remaining interactions were not significant (all F < 1.80, p > 0.188).

#### N2 Amplitude

There were significant main effects of WM load (F(1,41) = 30.54, p < 0.001, η<sup>2</sup> = 0.43) and electrodes (F(2,82) = 10.47, p = 0.001, η<sup>2</sup> = 0.20). The interaction of WM load × Electrodes (F(2,82) = 12.12, p < 0.001, η<sup>2</sup> = 0.23) was also significant. Under low load, N2 amplitude was smaller for Pz (−3.35 ± 4.35 µV) than for Fz (−5.83 ± 3.60 µV, p < 0.001, Cohen's d = 0.62) or Cz (−5.85 ± 4.38 µV, p = 0.001, Cohen's d = 0.57), while under high WM load the amplitude of Fz (−3.13 ± 2.77 µV) was larger than that of Pz (−1.78 ± 3.34 µV, p = 0.010, Cohen's d = 0.44) or Cz (−2.22 ± 3.09 µV, p < 0.001, Cohen's d = 0.31). The group difference and other interactions were not significant (all F < 2.02, p > 0.151).

P1, N1, N2, P3 amplitude and latency are expressed as mean (SD).

#### N2 Latency

A significant group difference was found in N2 latency (F(1,41) = 4.33, p = 0.044, η<sup>2</sup> = 0.10). BPD patients had longer latencies (257.34 ± 20.03 ms) than HC (244.66 ± 19.91 ms). The main effect of electrodes (F(2,82) = 23.97, p < 0.001, η<sup>2</sup> = 0.37) and the Group × Electrode interaction (F(2,82) = 3.95, p = 0.034, η<sup>2</sup> = 0.09) were also significant, indicating that BPD patients had a longer latency (249.64 ± 35.28 ms) than HC (226.21 ± 33.47 ms, p = 0.011) at the Pz electrode only. The main effect of WM load and other interactions were not significant (all F < 3.63, p > 0.064).

#### P3 Amplitude

A significant group difference was found in P3 amplitude (F(1,41) = 5.62, p = 0.023, η<sup>2</sup> = 0.12): BPD patients had a smaller P3 amplitude (5.03 ± 2.61 µV) than controls (7.24 ± 3.46 µV). WM load (F(1,41) = 66.46, p < 0.001, η<sup>2</sup> = 0.62), electrodes (F(2,82) = 41.71, p < 0.001, η<sup>2</sup> = 0.50) and the WM load × Electrodes interaction (F(2,82) = 20.68, p < 0.001, η<sup>2</sup> = 0.33) all had significant effects. Post hoc analysis showed that, under low WM load, P3 amplitude was significantly larger for Pz (10.92 ± 5.18 µV) than for Fz (5.73 ± 4.24 µV, p < 0.001, Cohen's d = 1.10) or Cz (8.99 ± 5.36 µV, p < 0.001, Cohen's d = 0.37). Under high WM load, the P3 amplitude for Fz (2.85 ± 3.19 µV) was smaller than for Cz (3.96 ± 3.12 µV, p = 0.003, Cohen's d = 0.35) or Pz (4.34 ± 3.04 µV, p = 0.0311, Cohen's d = 0.48). The remaining interactions were not significant (all F < 1.36, p > 0.258).

#### P3 Latency

Electrodes significantly affected P3 latency (F(2,82) = 6.98, p = 0.004, η <sup>2</sup> = 0.15). Fz had a longer latency (372.37 ± 27.46 ms) than Cz (365.03 ± 22.76 ms, p = 0.030, Cohen's d = 0.29) or Pz (357.85 ± 24.51 ms, p = 0.005, Cohen's d = 0.56). Furthermore, Cz had a longer latency than Pz (p = 0.033). There was no significant group difference or main effect of WM load, as well as no significant interactions for WM load × Group, Electrodes × Group, WM load × Electrodes, or WM load × Electrodes × Group (all F < 3.02, p > 0.09).

# Correlations among N2 Latency, P3 Amplitude and Psychological Measures

Pearson's correlation analyses showed that N2 latency was not significantly correlated with P3 amplitude in the 0-back task (r = −0.05, p = 0.731) or 2-back task (r = −0.25, p = 0.100).

Pearson's correlation analyses showed that impulsiveness, as detected by the BIS, was not correlated with N2 latency under any WM load. There were no significant correlations between N2 latency and affect intensity scores, or between P3 amplitude and impulsiveness or affect intensity scores (all p > 0.05).

# DISCUSSION

To the best of our knowledge, this study is the first investigation of WM in BPD using ERPs. Our behavioral results showed no differences between BPD patients and HCs in the N-back task. Although there were no significant group differences in accuracy or mean reaction time, the BPD group showed lower P3 amplitude and longer N2 latency than the control group. As expected, we found that accuracy and the P1, N2 and P3 amplitudes decreased as WM load increased. Nevertheless, the reduced P3 amplitude and longer N2 latency in the BPD group were independent of WM load. Moreover, in this study, neither impulsiveness nor negative affect was the main factor leading to the WM deficits in BPD.

Despite the lack of general agreement about its functional role, N2 has been correlated with ease of visual information encoding (Nittono et al., 2007). N2 latency has been used as a physiological marker of the timing of access to different properties of a stimulus (Folstein and Van Petten, 2008). The finding that N2 latency was longer in BPD patients than in HCs suggests that stimulus analysis and evaluation during information encoding in WM might be slower in BPD patients compared to healthy individuals.

BPD patients showed lower P3 amplitudes than HCs. P3 amplitude in the WM task is related to the allocation of attention necessary for WM functioning. Kok (2001) reported a relationship between P3 amplitude and attentional resource allocation. Linden (2005) found P3 to be related to brain regions involved in attention, such as the parietal lobe, temporo-parietal junction, lateral prefrontal areas, and cingulate gyrus. Studies of ADHD (Kim et al., 2014; Stroux et al., 2016), which is often comorbid with BPD (Speranza et al., 2011), concluded that a diminished P3 amplitude could be interpreted as an inefficient allocation of attention in WM. These observations suggest that there is abnormal neural activity in the allocation of attentional resources in BPD patients.

The clear WM load effect in our study confirmed the effectiveness of the N-back task. However, the reduced P3 amplitude and longer N2 latency were independent of WM load, consistent with a previous report that WM is impaired in BPD subjects regardless of WM load (Stevens et al., 2004). The load-independence of N2 latency might occur because the speed of processing for a 0-back task does not differ from that for a 2-back task. For P3 amplitude, consistent with other studies (Posner et al., 2002; Stevens et al., 2004; Lazzaretti et al., 2012), we found no correlation between P3 amplitude and affect intensity or impulsiveness, suggesting that the diminished P3 amplitude was not modulated by affect intensity or impulsiveness. Previous studies found that the information processing stages are not parallel, and that impairments in one stage may affect other stages (Di Russo et al., 2010; Portella et al., 2012). We did not observe any correlation between N2 latency and P3 amplitude, which suggests that the attenuated P3 amplitude might not be related to compensation for the longer N2 latency. Based on these findings, we speculate that BPD patients have a general dysfunction of attention allocation in WM rather than specific problems due to increased WM load.

Hagenhoff et al. (2013) reported that perceptual processing and response selection are unaffected in BPD. Similarly to that report, our findings suggested that there were no group differences in P1 and N1. Both P1 and N1 are associated with early, rapid processing of stimuli, encoding, attentional focus and discrimination. These results suggest that BPD patients might not have deficits in the early phase of WM.

No differences in behavioral results were found between the BPD group and the HC group, which was inconsistent with our ERP results. Such inconsistencies between behavioral data and imaging data have also been found in many previous imaging studies (Karayanidis et al., 2000; Ruchsow et al., 2008; Myatchin et al., 2009), which suggests that ERP results might be more sensitive, efficient and convincing than behavioral data.

The lower P3 amplitudes and longer N2 latencies in BPD patients suggest that BPD patients might have abnormal brain activity in sub-processes of WM, especially in the allocation of attentional resources and in the speed of stimulus analysis and evaluation during information encoding. Furthermore, these abnormalities were independent of WM load. Although these deficits might not always be observed in behavioral performance, the findings provide some theoretical support for dysfunction of WM or cognition in patients with BPD. Even though no significant correlations were found between ERP indices (N2 latency and P3 amplitude) and impulsiveness or negative affect intensity, we cannot conclude that the ERP abnormalities during WM tasks are independent of the symptoms of BPD. Fertuck et al. (2006) concluded that cognitive impairment is a key moderator in the development of BPD, influences the formation of insecure disorganized attachment and dissociation, and interferes with cognitive development in the interpersonal arena. In the future, more attention should be paid to the relationship between WM impairment and symptoms of BPD, and WM and other cognitive functions should be considered in the diagnosis and the development of BPD. Fertuck et al. (2006) also found that BPD patients with higher executive control and higher performance on visual memory tasks were more likely to finish treatment. In this study, we found that BPD patients showed dysfunctional neural activity when performing WM tasks, which suggests that BPD patients might find it hard to allocate attentional resources effectively during treatment. WM training and other cognitive training might therefore help to improve the effectiveness of treatment in BPD.

This study has several limitations. First, although ERPs have high temporal resolution, their poor spatial resolution makes it difficult to identify which brain regions are related to the impairment of WM in BPD patients. Second, no patients had taken medicine in this cross-sectional study. Therefore, it is unclear whether medicine or psychological treatments would reduce impairment of WM. Further studies tracking the performance of BPD patients after taking medicine would help us to identify which WM impairments are related to symptoms of BPD. Third, previous research suggests there are sex differences in BPD (Hoertel et al., 2014), whereas we did not analyze differences in WM between males and females with BPD. Therefore, sex differences in WM in BPD should be considered in the future. Finally, although no correlation was found between negative emotion and BPD performance in WM, an emotional N-back task should be used to explore the relationship between impaired WM and emotional dysfunction directly, since instability in affect regulation is a core symptom in BPD patients.



# AUTHOR CONTRIBUTIONS

YL was responsible for analyzing data and writing the manuscript. MZ designed the experiment and wrote the manuscript with YL. CX, XJ, XZ and SY were responsible for data collection. JY was responsible for designing the study and revising the manuscript.

# ACKNOWLEDGMENTS

The authors are grateful to all research members for their help, and all the subjects for their participation. This study was supported by grants from the National Natural Science Foundation of China (Grant nos. 81370034 and 81000590).




**Conflict of Interest Statement**: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Liu, Zhong, Xi, Jin, Zhu, Yao and Yi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Altered Distant Synchronization of Background Network in Mild Cognitive Impairment during an Executive Function Task

Pengyun Wang<sup>1,2</sup>, Rui Li<sup>1,2</sup>, Jing Yu<sup>3</sup>, Zirui Huang<sup>4</sup>, Zhixiong Yan<sup>5</sup>, Ke Zhao<sup>1,2</sup> and Juan Li<sup>1,2,6</sup>\*

<sup>1</sup>Center on Aging Psychology, CAS Key Laboratory of Mental Health, Institute of Psychology, Beijing, China, <sup>2</sup>Department of Psychology, University of Chinese Academy of Sciences, Beijing, China, <sup>3</sup>Faculty of Psychology, Southwest University, Chongqing, China, <sup>4</sup>Institute of Mental Health Research, University of Ottawa, Ottawa, ON, Canada, <sup>5</sup>School of Education Science, Guangxi Teachers Education University, Nanning, China, <sup>6</sup>State Key Laboratory of Brain and Cognitive Science, Institute of Biophysics, Chinese Academy of Sciences, Beijing, China

Few studies to date have investigated the background network in the cognitive state relying on executive function in mild cognitive impairment (MCI) patients. Using the index of degree of centrality (DC), we explored distant synchronization of the background network in MCI during a hybrid delayed-match-to-sample task (DMST), which mainly relies on the working memory component of executive function. We observed significant interactions between group and cognitive state in the bilateral posterior cingulate cortex (PCC) and the ventral subregion of the precuneus. For the normal control (NC) group, the long-distance functional connectivity (FC) of the PCC/precuneus with the other regions of the brain was higher in the rest state than in the working memory state. For MCI patients, however, this pattern was altered: there was no significant difference between the rest and working memory states. A similar pattern was observed in another cluster located in the right angular gyrus. To examine whether abnormal DC in the PCC/precuneus and angular gyrus partially resulted from a deficit of FC between these regions and other parts of the brain, we conducted a seed-based correlation analysis with these regions as seeds. The results indicated that the FC between bilateral PCC/precuneus and the right inferior parietal lobule (IPL) increased from the rest state to the working memory state for NC participants. For MCI patients, however, there was no significant change between the rest and working memory states. A similar pattern was observed for the FC between the right angular gyrus and the right anterior insula. However, there was no difference between the MCI and NC groups in global efficiency and modularity. These results may indicate a lack of efficient reorganization from the rest state to a working memory state in the brain network of MCI patients. The present study demonstrates altered distant synchronization of the background network in MCI during a task relying on executive function. The results provide a new perspective regarding the neural mechanisms of executive function deficits in MCI patients, and extend our understanding of brain patterns in task-evoked cognitive states.

Keywords: mild cognitive impairment, executive function, background network, degree of centrality, working memory

#### Edited by:

Lynne Ann Barker, Sheffield Hallam University, United Kingdom

#### Reviewed by:

Etsuro Ito, Waseda University, Japan
Eleni Konsolaki, American College of Greece, Greece

\*Correspondence: Juan Li lijuan@psych.ac.cn

Received: 31 December 2016 Accepted: 05 September 2017 Published: 22 September 2017

#### Citation:

Wang P, Li R, Yu J, Huang Z, Yan Z, Zhao K and Li J (2017) Altered Distant Synchronization of Background Network in Mild Cognitive Impairment during an Executive Function Task. Front. Behav. Neurosci. 11:174. doi: 10.3389/fnbeh.2017.00174

# INTRODUCTION

The term mild cognitive impairment (MCI) is generally used to refer to a transitional zone between normal cognitive function and clinically diagnosed Alzheimer's disease (AD; Winblad et al., 2004). Individuals with MCI display certain forms of cognitive dysfunction, but still maintain the intact ability to perform basic daily activities (Winblad et al., 2004).

Executive function is considered to be the mechanism that integrates the operations of various neural systems (McCabe et al., 2010; Funahashi and Andreau, 2013). Individuals with MCI show significant cognitive deficits in executive function compared to age-matched controls (Saunders and Summers, 2011; Clément et al., 2013). Generally, executive function is thought to include three components: mental set shifting, inhibition of prepotent responses and working memory (Garon et al., 2008). The working memory component in the framework of executive function refers to information updating and monitoring (Garon et al., 2008; McCabe et al., 2010). Many studies have been conducted to investigate working memory deficits in MCI patients (Klekociuk and Summers, 2014; Kirova et al., 2015). In terms of the corresponding neural mechanism, previous studies mainly investigated altered activation during working memory tasks (Bokde et al., 2010; Lou et al., 2015; Migo et al., 2015), which reflects the response to specific external stimuli. In recent years, several studies have investigated the background network during working memory tasks in MCI patients. For example, Lou et al. (2015) investigated background network efficiency during the working memory state, and indicated that MCI patients showed increased background network efficiency to compensate for decreased activity and to maintain the working memory state.

In our previous study, we found altered local synchronization (indexed by regional homogeneity, ReHo) in MCI patients during a working memory task relative to the resting state (Wang et al., 2016). ReHo employs Kendall's coefficient of concordance to measure the coordination of activity between a voxel's BOLD time series and those of its 26 nearest neighboring voxels, yielding a voxel-wise ReHo map (Zang et al., 2004). Thus, it reflected a very local characteristic of the background network in MCI patients during working memory. However, whether the distant synchronization of the background network is changed in MCI patients during working memory functions is still unknown.
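For readers unfamiliar with Kendall's coefficient of concordance, a minimal NumPy sketch for one voxel neighborhood follows. The function name is hypothetical and, as a simplifying assumption, tied values are not corrected for. This is not the implementation used in the cited ReHo studies.

```python
import numpy as np


def kendalls_w(series):
    """Kendall's coefficient of concordance W for a set of time series.

    series: array of shape (k, n), k time series of n time points
    (k = 27 for a voxel plus its 26 nearest neighbors).
    Ranks come from a double argsort, which assumes no tied values.
    """
    k, n = series.shape
    # Rank each time series across time (ranks 1..n).
    ranks = np.argsort(np.argsort(series, axis=1), axis=1) + 1
    rank_sums = ranks.sum(axis=0)
    s = np.sum((rank_sums - rank_sums.mean()) ** 2)
    return 12.0 * s / (k ** 2 * (n ** 3 - n))
```

W ranges from 0 (no agreement among the time series) to 1 (identical rank ordering in all of them). Evaluating it at every voxel yields a ReHo map.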

In graph theory, a complex system is modeled as a graph, which is defined as a set of nodes linked by edges. For a binary graph, degree of centrality (DC) is the number of edges connecting to a node. For a weighted graph, DC is defined as the sum of weights from edges connecting to a node, which is also sometimes referred to as the node connectivity strength (Zuo et al., 2012). This measure can be formalized as follows:

$$\text{DC}(i) = \sum_{j=1}^{N} w_{ij}$$

where i is the focal node, j represents all other nodes, N is the total number of nodes, and w is the weighted adjacency matrix. w<sub>ij</sub> is greater than 0 if node i is connected to node j, and its value represents the weight of the tie (for a detailed graphical demonstration, see Opsahl et al., 2010). On a voxel-wise basis, we distinguished local connectivity from distant connectivity using 14 mm as a demarcation point (within a voxel's 14 mm neighborhood, or outside of its 14 mm neighborhood). A 14 mm radius was chosen following Sepulcre et al. (2010), who observed stable estimates of distant connectivity for radii beyond 10–14 mm. On this principle, the long DC (radius more than 14 mm) can be used as the index of distant (short and long) synchronization of the brain network (for details, see Huang et al., 2016). Additionally, both positive and negative functional connectivity (FC; based on Pearson correlation coefficients) are reported here.
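The definition of DC can be illustrated on a toy adjacency matrix. The sketch below is a minimal example with a hypothetical function name. Self-connections are excluded, on the assumption that j ranges over all other nodes as stated in the text.

```python
import numpy as np


def degree_centrality(w):
    """Sum of edge weights attached to each node.

    w: square adjacency matrix. For a binary matrix the result is the
    number of edges per node; for a weighted matrix it is the node
    connectivity strength.
    """
    w = np.asarray(w, dtype=float).copy()
    np.fill_diagonal(w, 0.0)  # j runs over all *other* nodes
    return w.sum(axis=1)
```

Passing a thresholded matrix, for example `degree_centrality(w > 0)`, gives the binary DC, while passing the weights themselves gives the weighted DC.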

In the present study, using the DC index, we investigated distant background connectivity in MCI patients during working memory in the four conditions: correlations (positive vs. negative) by distance (local vs. distant). We hypothesized that distant connectivity in some regions would be impaired in MCI patients compared with normal healthy older adults during working memory functions.

# MATERIALS AND METHODS

# Participants

In total, 17 MCI patients (age, 70.53 ± 4.54 years) and 16 healthy normal control (NC) elderly adult subjects (age, 68.56 ± 5.76 years) participated in this study. Participants were recruited from a community-based screening data pool in Beijing (healthy older adults, n = 865; MCI, n = 115; dementia, n = 21; Yu et al., 2012, 2014; Yin et al., 2015). MCI was diagnosed according to the diagnostic criteria for MCI (Petersen et al., 1999, 2001) and supplemented by scores from the Montreal Cognitive Assessment (MoCA; Nasreddine et al., 2005), Mini-Mental Status Examination (MMSE; Folstein et al., 1975) and Clinical Dementia Rating (CDR; Morris, 1993). For details, please see the previous study (Wang et al., 2016). This study was approved by the research ethics committee of the Institute of Psychology, Chinese Academy of Sciences (H11036). Written informed consent was obtained from each participant.

# The Hybrid Delayed-Match-to-Sample Task (DMST)

Participants performed a modified hybrid delayed-match-to-sample task (DMST; Jiang et al., 2000; Lawson et al., 2007; Guo et al., 2008) in a functional magnetic resonance imaging (fMRI) scanner. The DMST is an executive function task and relies on the working memory component. The DMST was described in detail in our previous study (Wang et al., 2016). Briefly, during each trial, participants were asked to memorize two target objects, which were presented side-by-side for 3500 ms. Then, the test objects (matching target objects or non-matching distractor objects) were presented for 1000 ms. Participants indicated whether a test object matched a target object by pressing the corresponding button. The task requires participants to update and monitor the target and distractor pictures.

# Image Acquisition

Participants were scanned using a Siemens Trio 3.0 Tesla scanner (Erlangen, Germany) at the Beijing MRI Center for Brain Research. During resting state scanning, participants were instructed to lie quietly with their eyes closed and not to think of anything in particular. For each participant, 200 resting state functional volumes were collected using the following parameters: TR = 2000 ms, flip angle = 90°, echo time (TE) = 30 ms, thickness = 3.0 mm, field of view (FOV) = 200 × 200 mm<sup>2</sup>, 33 axial slices, acquisition matrix = 64 × 64, gap = 0.6 mm, and in-plane resolution = 3.125 × 3.125 mm<sup>2</sup>. During working memory task scanning, the same parameters were used to collect 163 functional volumes for each run. Additionally, high-resolution 3-dimensional T1-weighted structural images were obtained for each participant using the following parameters: TR = 1900 ms, TE = 2.2 ms, flip angle = 9°, acquisition matrix = 256 × 256, 176 slices, and voxel size = 1 × 1 × 1 mm<sup>3</sup>.

# Behavioral Data Analysis

The response accuracy of working memory was calculated as the hit rate (correct target detection) minus the false alarm rate (false report for distractors). Response times (RTs) were calculated as the mean response time for all test stimuli (targets and distractors). To consider the trade-off between accuracy and response time, working memory performance was further indexed using response accuracy divided by response time, which is the reciprocal of the "inverse efficiency score" previously reported (Kennett et al., 2001; Spence et al., 2001).
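The three behavioral indices reduce to simple arithmetic, sketched below. The function and argument names are hypothetical, and response time is assumed to be in milliseconds.

```python
def wm_performance(hits, n_targets, false_alarms, n_distractors, mean_rt_ms):
    """Return (accuracy, efficiency) for one participant.

    accuracy   = hit rate - false alarm rate
    efficiency = accuracy / mean response time, the reciprocal of the
                 inverse efficiency score
    """
    accuracy = hits / n_targets - false_alarms / n_distractors
    return accuracy, accuracy / mean_rt_ms
```

For example, 18 hits on 20 targets and 2 false alarms on 20 distractors give an accuracy of 0.8. With a mean RT of 800 ms, the efficiency index is 0.001 per millisecond.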

# Image Preprocessing

Functional MRI data, including both resting and task states, were preprocessed using the Statistical Parametric Mapping program (SPM8<sup>1</sup>) and the toolbox for Data Processing and Analysis of Brain Imaging (DPABI V1.3<sup>2</sup>; Yan and Zang, 2010). One hundred and fifty-four volumes of both states were corrected for intra-volume acquisition time differences between slices using Sinc interpolation. Then, volumes were corrected for inter-volume geometrical displacement due to head motion using a six-parameter spatial transformation. All included participants had head motion of less than 3 mm in any one direction during scanning. Coregistration, segmentation and writing normalization were conducted using unified segmentation of each participant's T1 image. Normalized volumes were resampled to a voxel size of 3 × 3 × 3 mm<sup>3</sup>. Nuisance covariates, including head motion parameters, global mean signal, white matter signal and cerebrospinal fluid signal, were regressed out.
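Regressing out nuisance covariates amounts to fitting a linear model at each voxel and keeping the residuals. The sketch below uses ordinary least squares in NumPy. It is a generic illustration with a hypothetical function name, not the DPABI routine used in the study.

```python
import numpy as np


def regress_out(data, nuisance):
    """Remove nuisance signals from voxel time series by least squares.

    data: array (n_timepoints, n_voxels) of BOLD time series.
    nuisance: array (n_timepoints, n_regressors), e.g. motion parameters,
    global mean, white matter and cerebrospinal fluid signals.
    Returns the residual time series (an intercept is also removed).
    """
    design = np.column_stack([np.ones(len(nuisance)), nuisance])
    beta, *_ = np.linalg.lstsq(design, data, rcond=None)
    return data - design @ beta
```

The residuals are orthogonal to every nuisance regressor, so later correlation analyses are not driven by those signals.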

To make the resting and task states comparable, an additional regressor of task conditions was included for the working memory run (e.g., targets and distractors present vs. absent trials; response vs. no response trials; see Jones et al. (2010) and Gordon et al. (2012) for the rationale for removing task-load effects). After this preprocessing step, resting and task data differed only in the subjects' cognitive state. fMRI images were further spatially normalized to the Montreal Neurological Institute (MNI) echo planar imaging (EPI) template using an optimized 12-parameter affine transformation and nonlinear deformations. A high-pass filter (128 s cutoff period) was used to remove low-frequency confounds (see Lou et al., 2015). Because spatial smoothing may artificially enhance DC intensity and thereby reduce reliability, as is the case for ReHo (Zuo et al., 2013), DC was calculated from the unsmoothed BOLD time series. Spatial smoothing was then performed using a 4-mm full-width at half-maximum (FWHM) Gaussian kernel.

# Whole Brain Degree of Centrality (DC)

The DC analysis was performed for each subject with the GRETNA toolbox<sup>3</sup> (Wang et al., 2015b). The principle was described in detail in our previous study (Huang et al., 2016). Briefly, the correlation between the time series of each voxel and that of every other voxel in the individual whole brain mask was computed. A binary, undirected adjacency matrix was then obtained by thresholding each correlation at r > 0.3 (following Huang et al., 2016). Based on graph theory, DC was calculated at the individual level (Zuo et al., 2012). We computed DC by counting the number of functional connections (positive and negative correlations, respectively) from each voxel to voxels inside (for the local connectivity map) and outside (for the distant connectivity map) a 14 mm radius. Thus, we acquired four matrices for each group and cognitive state: positive local DC, positive distant DC, negative local DC and negative distant DC. In addition, normalized DC (DC-Z) indices were calculated by transforming DC to Z-scores based on the global mean and standard deviation (SD) of DC across voxels in the brain (Buckner et al., 2009; Zuo et al., 2012; Di Martino et al., 2013).
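A toy-scale sketch of this computation is given below. It does not reproduce GRETNA. The function names are hypothetical, negative connections are assumed to be thresholded symmetrically at r < −0.3, and the full voxel-by-voxel matrices used here would be too large for a whole-brain mask without chunking.

```python
import numpy as np


def distance_split_dc(ts, coords_mm, r_thresh=0.3, radius_mm=14.0):
    """Binary DC split by correlation sign and by distance.

    ts: array (n_timepoints, n_voxels); coords_mm: array (n_voxels, 3).
    Returns a dict with four per-voxel count maps.
    """
    r = np.corrcoef(ts, rowvar=False)
    np.fill_diagonal(r, 0.0)  # ignore self-correlations
    diff = coords_mm[:, None, :] - coords_mm[None, :, :]
    dist = np.linalg.norm(diff, axis=2)
    local = dist <= radius_mm
    np.fill_diagonal(local, False)
    distant = dist > radius_mm
    pos = r > r_thresh
    neg = r < -r_thresh  # assumption: symmetric threshold for negative edges
    return {
        "pos_local": (pos & local).sum(axis=1),
        "pos_distant": (pos & distant).sum(axis=1),
        "neg_local": (neg & local).sum(axis=1),
        "neg_distant": (neg & distant).sum(axis=1),
    }


def zscore_map(dc):
    """Normalize a DC map by its mean and SD across voxels (DC-Z)."""
    dc = np.asarray(dc, dtype=float)
    return (dc - dc.mean()) / dc.std()
```

Each of the four maps can then be passed through `zscore_map` before group-level statistics.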

# Structural Image Analysis

Previous studies have reported gray matter (GM) atrophy in regions such as the medial temporal lobe, the frontal cortex and the parietal cortex of MCI patients (Singh et al., 2006; Karas et al., 2008). To control for the influence of GM atrophy on our DC analysis, a voxel-based morphometry (VBM) analysis was performed for structural images using DPABI to identify regions of GM atrophy. See the previous study for the detailed method and results (Wang et al., 2016). Briefly, MCI patients exhibited significant GM loss in several brain regions relative to NCs: the middle part of the medial frontal lobe, including parts of the cingulate gyrus, the limbic lobe and nearby white matter and the lateral frontal and parietal lobes. To analyze the effects of GM atrophy on the functional results, we performed the following repeated-measures analysis of variance (ANOVA) in which the GM intensity maps were entered as covariates.

<sup>1</sup>http://www.fil.ion.ucl.ac.uk/spm

<sup>2</sup>http://rfmri.org/dpabi

<sup>3</sup>http://www.nitrc.org/projects/gretna/

# Between-Group Comparison of DC in Resting and Background Network of Working Memory States

To determine the interaction effects of group and cognitive state on DC, we performed a 2-way repeated-measures analysis of covariance (ANCOVA) using SPM8, with group (MCI and NC) as a between-subject factor and cognitive state (resting and working memory) as a repeated measure, controlling for age, gender, education, head motion and GM atrophy. Post hoc 2-sample t-tests were performed on clusters showing significant group × state interactions. The statistical threshold was set at p < 0.05 using the AlphaSim correction for multiple comparisons, with a threshold of p < 0.01 at the voxel level. All coordinates are reported in MNI space.

# Functional Connectivity Analysis

To examine the changes of FC from resting to task states in regions showing significant group × state interactions, we conducted a seed-based connectivity analysis with regions showing group × state interactions as seeds. First, for the background network during working memory task in each MCI or NC individual, voxel-wise FC maps to a given seed were computed as maps of temporal correlation coefficients between the BOLD time course of each voxel and the averaged BOLD time course across voxels in the seed region. FC maps from individual subjects were then transformed using Fisher's Z transformation. Second, to determine the interaction effects of group and cognitive state on the FC maps, we performed 2-way repeated-measures ANOVA with group (MCI and NC) as a between-subject factor and cognitive state (resting and working memory) as a repeated-measure. Post hoc 2-sample t-tests were performed on clusters showing significant group × state interactions. The statistical threshold was set at p < 0.05 using the AlphaSim correction for multiple comparisons with a threshold of p < 0.01 at the voxel level.
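The first step of the seed-based analysis (averaging the seed time course, correlating it with every voxel, and applying Fisher's Z transformation) can be sketched as follows. The function names and toy data are hypothetical, and clipping r just inside ±1 before the transformation is an assumption added only to keep the values finite.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length time series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def seed_fc_map(seed_voxels, brain_voxels, clip=0.999999):
    """Fisher-Z-transformed correlation of each voxel with the mean seed series."""
    n_t = len(seed_voxels[0])
    # Average the BOLD time course across all voxels in the seed region.
    seed_mean = [sum(v[t] for v in seed_voxels) / len(seed_voxels) for t in range(n_t)]
    fc = []
    for voxel in brain_voxels:
        r = pearson(seed_mean, voxel)
        r = max(-clip, min(clip, r))  # assumption: avoid atanh(+/-1) = +/-inf
        fc.append(math.atanh(r))      # Fisher's Z transformation
    return fc

# Toy example with four time points (hypothetical data).
seed = [[1, 2, 3, 4], [3, 2, 5, 4]]                        # mean seed series: 2, 2, 4, 4
voxels = [[2, 2, 4, 4], [4, 4, 2, 2], [1, -1, 1, -1]]
fc_z = seed_fc_map(seed, voxels)
```

The resulting per-subject Z maps are the kind of input that would then enter the group × state repeated-measures ANOVA.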

# Correlation between State-Related Changes and Working Memory Performance

We calculated Pearson correlation coefficients to explore the relationship between working memory performance (the quotient of response accuracy divided by response time) and the state-related changes of DC, as well as of the corresponding FC, in regions showing a significant interaction between group and state. Bootstrap results were based on 1000 bootstrap samples, and 95% confidence intervals were reported.
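A bootstrap confidence interval of this kind can be sketched as below. The text does not state which bootstrap interval was used, so the percentile method, the fixed random seed, the skipping of degenerate resamples, and the toy data are assumptions for illustration only.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation between two equal-length samples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def bootstrap_ci(x, y, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for Pearson's r, resampling participants with replacement."""
    rng = random.Random(seed)
    n = len(x)
    rs = []
    while len(rs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        xs = [x[i] for i in idx]
        ys = [y[i] for i in idx]
        if len(set(xs)) < 2 or len(set(ys)) < 2:
            continue  # degenerate resample: correlation undefined
        rs.append(pearson(xs, ys))
    rs.sort()
    lower = rs[round((alpha / 2) * n_boot)]
    upper = rs[round((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Toy example: a state-related change that decreases as performance increases.
performance = list(range(20))
dc_change = [-p + (1 if p % 2 else -1) for p in performance]
r_observed = pearson(performance, dc_change)
ci_low, ci_high = bootstrap_ci(performance, dc_change)
```

An interval that excludes zero, as for the precuneus/PCC cluster reported in the Results, indicates that the sign of the correlation is stable across resamples.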

# Global Graph Theory Measures

To characterize the topological organization of the networks, global efficiency and modularity were calculated on a voxel-level graph using the GRETNA toolbox mentioned above. Global efficiency is biologically meaningful because it reflects how well information propagates over a network (Rubinov and Sporns, 2010). Modularity measures the fraction of within-module edge weights in the actual network minus the expected value of this fraction in a network with the same community divisions but with connections arranged randomly between the nodes (Wang et al., 2015a).
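Both measures can be illustrated on a small binary, undirected graph. The sketch below is not the GRETNA implementation: it assumes unweighted edges, takes the module partition as given rather than estimating it, and uses hypothetical function names and a toy graph.

```python
from collections import deque

def shortest_path_lengths(adj, source):
    """Breadth-first-search distances from source in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def global_efficiency(adj):
    """Mean inverse shortest-path length over all ordered node pairs.

    Unreachable pairs contribute 0 (inverse of an infinite distance).
    """
    n = len(adj)
    total = 0.0
    for i in adj:
        dist = shortest_path_lengths(adj, i)
        total += sum(1.0 / d for j, d in dist.items() if j != i)
    return total / (n * (n - 1))

def modularity(adj, communities):
    """Modularity Q of a given partition: within-module edge fraction minus its expectation."""
    m = sum(len(nbrs) for nbrs in adj.values()) / 2  # number of edges
    q = 0.0
    for community in communities:
        members = set(community)
        within = sum(1 for u in members for v in adj[u] if v in members) / 2
        degree = sum(len(adj[u]) for u in members)
        q += within / m - (degree / (2 * m)) ** 2
    return q

# Toy graph: two triangles joined by a single edge (2-3).
toy_adj = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
    3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4},
}
toy_q = modularity(toy_adj, [[0, 1, 2], [3, 4, 5]])  # 5/14, about 0.357
toy_e = global_efficiency(toy_adj)
```

For the two-triangle graph, each module contains 3 of the 7 edges and half of the total degree, so Q = 2 × (3/7 − 0.25) = 5/14.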

# RESULTS

# Alteration of DC

We did not find any stable or significant interactions between group and state in the positive local, negative distant, or negative local conditions. We observed significant results only in the positive distant (long-distance) condition, in two clusters. One comprised the bilateral precuneus (cluster size = 258) and posterior cingulate cortex (PCC, cluster size = 154), peak MNI coordinates: x = 0, y = −51, z = 30; the other was located in the right angular gyrus (cluster size = 67), peak MNI coordinates: x = 45, y = −54, z = 33 (**Figure 1**). For both clusters, further post hoc t-tests revealed that DC decreased in the working memory state relative to the resting state in NCs, but did not significantly change in MCI patients (right panel of **Figure 1** and **Table 1**).

The MCI patients showed a significant deficit in working memory performance compared with NCs, as reported in our previous study (Wang et al., 2016). Furthermore, the state-related change (task minus resting) of DC in the precuneus/PCC cluster correlated negatively with working memory performance in all participants (including both NC and MCI; r = −0.507, p = 0.003, bootstrap-based 95% confidence interval −0.693, −0.309), and in the MCI group alone (r = −0.502, p = 0.040, bootstrap-based 95% confidence interval −0.854, −0.195). For the right angular gyrus cluster, the state-related change of DC correlated negatively with working memory performance in all participants including both NC and MCI (r = −0.390, p = 0.025, bootstrap-based 95% confidence interval −0.656, −0.118), but not in the MCI group alone (r = −0.089, p = 0.733, bootstrap-based 95% confidence interval −0.565, 0.363).

# Functional Connectivity of Regions Showing Group × State Interactions during Working Memory

To examine whether the abnormal DC in the regions described above partially resulted from a deficit of FC between these regions and other parts of the brain, we conducted a seed-based connectivity analysis with the regions showing group × state interactions as seeds. Then, we compared the seed-based connectivity maps using a 2-way repeated-measures ANOVA with group as a between-subject factor and cognitive state as a repeated measure. For the bilateral PCC/precuneus cluster, the region showing a significant group × state interaction was located in the right inferior parietal lobule (IPL; **A**), cluster size = 87, peak MNI coordinates: x = 57, y = −24, z = 30 (**Figure 2**). For the right angular gyrus cluster, we observed significant results in the right anterior insula, cluster size = 56, peak MNI coordinates: x = 33, y = 9, z = 12 (**Figure 2**). For both clusters, further post hoc t-tests revealed that FC was higher in the working memory state relative to the resting state in NCs, but did not significantly change in MCI patients (right panel of **Figure 2** and **Table 1**).

Furthermore, the state-related change (task minus resting) of FC between the right angular gyrus and the right anterior insula correlated positively with working memory performance in all participants (including both NC and MCI; r = 0.645, p < 0.001, bootstrap-based 95% confidence interval 0.425, 0.798), and in the MCI group alone (r = 0.579, p = 0.015, bootstrap-based 95% confidence interval 0.231, 0.829). The state-related change (task minus resting) of FC between the precuneus/PCC cluster and the IPL did not correlate with working memory performance in all participants or in the MCI group alone, both ps > 0.05.

# Global Graph Theory Measures

We observed a significant main effect of cognitive state on global efficiency, F(1,31) = 8.66, p = 0.006, η<sub>p</sub><sup>2</sup> = 0.22. Global efficiency was higher in the working memory state than in the rest state for both NCs and MCI patients. The main effect of group and the interaction between group and cognitive state did not reach significance, both ps > 0.05. A similar pattern was observed for modularity, which was higher in the working memory state than in the rest state for both NCs and MCI patients, F(1,31) = 16.47, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.35. There was no significant main effect of group or interaction between group and cognitive state, both ps > 0.05 (**Figure 3**).
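As a consistency check, partial eta squared can be recovered from an F statistic and its degrees of freedom using the standard identity η<sub>p</sub><sup>2</sup> = (F × df<sub>effect</sub>) / (F × df<sub>effect</sub> + df<sub>error</sub>). The short sketch below applies it to the two F values reported above; the function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

# F(1,31) values reported for the main effect of cognitive state.
eta_efficiency = partial_eta_squared(8.66, 1, 31)   # rounds to 0.22
eta_modularity = partial_eta_squared(16.47, 1, 31)  # rounds to 0.35
```

Both values agree with the reported effect sizes after rounding to two decimals.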

# DISCUSSION

The present study demonstrates altered distant synchronization of the background network in MCI during a working memory task. We observed significant interactions between group and state in the bilateral PCC and the ventral subregion of the precuneus. Specifically, the results indicated that, for the NC group, the distant FC of the PCC/precuneus with other regions of the brain was higher during the rest state than during the working memory state. For MCI patients, this pattern was not evident: there was no significant difference between the rest and working memory states. A similar pattern was observed in the other cluster, located in the right angular gyrus. To examine whether the abnormal DC in these regions partially resulted from a deficit of FC between these regions and other parts of the brain, we conducted a seed-based connectivity analysis with these regions as seeds. The results indicated that the FC between the bilateral PCC/precuneus and the right IPL increased from the rest to the working memory state for NC participants. For MCI patients, however, there was no significant change between the rest and working memory states. A similar pattern was observed for the FC between the right angular gyrus and the right anterior insula.

TABLE 1 | Regions showing significant interactions between group (mild cognitive impairment, MCI and normal control, NC) and cognitive state (resting and working memory).

Notes: DC, degree of centrality; SD, Standard deviation; FC, functional connectivity; PCC, posterior cingulate cortex; IPL, inferior parietal lobule.

FIGURE 3 | To characterize the topological organization of networks, global efficiency (A) and modularity (B) were calculated on a voxel-level graph. Global efficiency and modularity were higher in the working memory state than in the rest state for both NCs and MCI patients. The main effect of group and the interaction between group and cognitive state did not reach significance for global efficiency or modularity. Error bars depict SEM.

The PCC is a key structure of the default mode network (DMN; Menon, 2011; Patel et al., 2015). The ventral subregion of the precuneus (next to the PCC) is also widely accepted as part of the DMN (Zhang and Li, 2012). Some studies have argued that the ventral subregion of the precuneus is deactivated during successful memory encoding (Daselaar et al., 2004; Vannini et al., 2011). In addition, other research has shown a tendency for negative correlations between activation of this region and task performance (Rami et al., 2012). Furthermore, the posterior IPL is also a part of the DMN (Andrews-Hanna et al., 2010). The results of the present study demonstrated that, as a key structure of the DMN, the PCC/precuneus had a higher DC in the brain network in the rest state than in the task state. When normal participants switched from the rest state to a working memory state, the DMN was no longer the key structure for the current cognitive task; therefore, the DC in the PCC/precuneus was lower in the task state than in the rest state. Nonetheless, the FC between the PCC/precuneus and the IPL was higher during the working memory state than during the rest state. This pattern may indicate that although the activity of regions belonging to the DMN decreased from the rest to the task state, the connectivity within the DMN increased, because the DMN regions had to cooperate in being inhibited so that the current cognitive task could be completed. These patterns were not observed in MCI patients, which may contribute to their working memory deficit. It should be noted that the state-related alteration (task minus resting) of FC between the PCC/precuneus cluster and the IPL was not correlated with working memory performance in the NC or the MCI group. This may indicate that the state-related alteration of FC between these two regions changed qualitatively from NC to MCI patients.

The angular gyrus is also part of the default mode network, and the high similarity in its task-free deactivation is striking (Menon, 2011; Seghier, 2013). Some studies have reported that deactivation of the angular gyrus during encoding is beneficial for memory performance (Daselaar et al., 2009; Uncapher and Wagner, 2009). Nevertheless, there are also opposite findings suggesting that left angular gyrus activity is greater during successful vs. unsuccessful episodic encoding (Maillet and Rajah, 2014). Elman et al. (2013) demonstrated dynamic changes in the angular gyrus during encoding. Angular gyrus activity decreased when the stimulus was initially presented in the memory task (participants were asked to make a perceptual judgment). Then, the stimulus disappeared. Several seconds later, when the participants were asked to remember the previously presented stimulus for a later memory test (an elaborative representational encoding process), angular gyrus activity increased. Furthermore, as reviewed by Seghier (2013), the angular gyrus is a cross-modal integrative hub. Specifically, it is an interface between converging bottom-up multisensory inputs and top-down predictions. The angular gyrus processes information from the insula and prefrontal regions. The anterior insula and the dorsal anterior cingulate cortex are the key structures of the salience network, which is involved in detecting, integrating and filtering relevant interoceptive, autonomic and emotional information (Seeley et al., 2007; Menon, 2011). These functions are important for completing the current working memory task.

As described in the introduction, DC is defined as the sum of weights from edges connecting to a node, which is also referred to as the node connectivity strength (Zuo et al., 2012). Thus, DC can be used to represent the importance of a brain region within the whole-brain network. The results of the present study demonstrated that, in NC participants but not in MCI patients, the DC of the right angular gyrus was lower during the working memory state than during the rest state, whereas the connectivity between the right angular gyrus and the right anterior insula increased. Therefore, the results may indicate that although the importance of the right angular gyrus in the network (as represented by DC) decreased from the rest to the working memory state in NC participants, the FC of the angular gyrus converged more on the right insula, which facilitated the current cognitive task. The positive correlation between this FC and working memory performance further supports this relationship. Relative to NCs, MCI patients lacked the corresponding changes (as described above) from the rest to the working memory state, which may reflect their working memory deficit.

We observed a significant main effect of cognitive state on global efficiency and modularity. Participants showed higher global efficiency and modularity in the working memory state than in the rest state. However, there was no difference between the MCI and NC groups. Taken together with the changes described above (in the PCC/precuneus, the angular gyrus, the FC between the PCC/precuneus and the IPL, and the FC between the angular gyrus and the anterior insula), this may indicate that the brain network of MCI patients did not show large-scale FC changes but instead lacked an efficient reorganization from the rest state to the working memory state. Of note, we observed these different reorganization patterns between NCs and MCI patients only in the long-distance positive FC condition. This phenomenon was not found for negative FC or for positive FC over physically short distances in the brain (i.e., local networks).

It should be noted that our study accounted for the possible influence of GM structural atrophy on our findings. To limit the effects of GM atrophy on the functional results, we entered the GM intensity maps as covariates in the repeated-measures ANOVA during the DC analysis. The VBM analysis showed that MCI patients had GM atrophy in the middle part of the medial frontal lobe and in the lateral parts of the frontal and parietal lobes (Wang et al., 2016); however, none of these regions overlapped with the regions showing interactions between group and state.

As mentioned in our previous study (Wang et al., 2016), the limitations of the present study were largely related to the small number of MCI patients and the uncontrolled subtypes and severity of cognitive impairment in MCI patients. In addition, the sub-processes of working memory (such as encoding, maintenance and retrieval) also need to be considered in future studies.

Our previous study found the altered local synchronization (indexed by ReHo) in MCI patients during working memory task relative to resting state (Wang et al., 2016). The current study expands our previous work by finding the altered distant synchronization of the background network in MCI patients during working memory, which may also serve as a parameter for disease diagnosis and progression monitoring in MCI patients. To our knowledge, this is the first study to examine alterations of distant FC of the background network during working memory in MCI patients. Similar to our previous study (Wang et al., 2016), the present study provides a new perspective regarding the neural mechanisms of working memory deficits in MCI patients, and extends our knowledge of altered brain patterns in resting and task-evoked states.


# AUTHOR CONTRIBUTIONS

PW conceived the idea, designed the study, analyzed and interpreted data, and drafted part of the manuscript. RL, ZH, KZ and ZY assisted with the analysis and interpretation of data. JY conducted the experiment. JL conceived the idea, designed the study and participated in the writing and revision of the manuscript.

# FUNDING

This research was supported by the National Natural Science Foundation of China (31271108, 31470998, 31671157 and 31400895); Beijing Municipal Science & Technology Commission (Z171100000117006), the Pioneer Initiative of the Chinese Academy of Sciences, Feature Institutes Program, TSS-2015-06, and CAS Key Laboratory of Mental Health, Institute of Psychology (KLMH2014ZK02, KLMH2014ZG10).


**Conflict of Interest Statement**: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Wang, Li, Yu, Huang, Yan, Zhao and Li. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Resting-State Coupling between Core Regions within the Central-Executive and Salience Networks Contributes to Working Memory Performance

Xiaojing Fang<sup>1</sup>, Yuanchao Zhang<sup>1</sup>, Yuan Zhou<sup>2</sup>, Luqi Cheng<sup>1</sup>, Jin Li<sup>3</sup>, Yulin Wang<sup>4</sup>, Karl J. Friston<sup>5</sup> and Tianzi Jiang<sup>1,3,6,7,8</sup>\*

<sup>1</sup> Key Laboratory for NeuroInformation of the Ministry of Education, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China, <sup>2</sup> Key Laboratory of Behavioral Science and Magnetic Resonance Imaging Research Center, Institute of Psychology, Chinese Academy of Sciences, Beijing, China, <sup>3</sup> National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, China, <sup>4</sup> Key Laboratory of Cognition and Personality (Ministry of Education), School of Psychology, Southwest University, Chongqing, China, <sup>5</sup> Wellcome Trust Centre for Neuroimaging, Institute of Neurology, University College London, London, UK, <sup>6</sup> Brainnetome Center, Institute of Automation, Chinese Academy of Sciences, Beijing, China, <sup>7</sup> CAS Center for Excellence in Brain Science, Institute of Automation, Chinese Academy of Sciences, Beijing, China, <sup>8</sup> Queensland Brain Institute, University of Queensland, Brisbane, QLD, Australia

#### Edited by:

Lynne Ann Barker, Sheffield Hallam University, UK

#### Reviewed by:

Rebecca Elliott, University of Manchester, UK; Tamas Kozicz, Radboud University Medical Center, Netherlands

#### \*Correspondence:

Tianzi Jiang jiangtz@nlpr.ia.ac.cn

Received: 20 September 2015 Accepted: 08 February 2016 Published: 25 February 2016

#### Citation:

Fang X, Zhang Y, Zhou Y, Cheng L, Li J, Wang Y, Friston KJ and Jiang T (2016) Resting-State Coupling between Core Regions within the Central-Executive and Salience Networks Contributes to Working Memory Performance. Front. Behav. Neurosci. 10:27. doi: 10.3389/fnbeh.2016.00027

Previous studies investigated the distinct roles played by different cognitive regions and suggested that the patterns of connectivity of these regions are associated with working memory (WM). However, the specific causal mechanism through which the neuronal circuits that involve these brain regions contribute to WM is still unclear. Here, in a large sample of healthy young adults, we first identified the core WM regions by linking WM accuracy to resting-state functional connectivity with the bilateral dorsolateral prefrontal cortex (dLPFC; a principal region in the central-executive network, CEN). Then a spectral dynamic causal modeling (spDCM) analysis was performed to quantify the effective connectivity between these regions. Finally, the effective connectivity was correlated with WM accuracy to characterize the relationship between these connections and WM performance. We found that the functional connections between the bilateral dLPFC and the dorsal anterior cingulate cortex (dACC) and between the right dLPFC and the left orbital fronto-insular cortex (FIC) were correlated with WM accuracy. Furthermore, the effective connectivity from the dACC to the bilateral dLPFC and from the right dLPFC to the left FIC could predict individual differences in WM. Because the dACC and FIC are core regions of the salience network (SN), we inferred that the inter- and causal connectivity between core regions within the CEN and SN is functionally relevant for WM performance. In summary, the current study identified the dLPFC-related resting-state effective connectivity underlying WM and suggests that individual differences in cognitive ability could be characterized by resting-state effective connectivity.

Keywords: working memory, dorsolateral prefrontal cortex, resting state fMRI, functional connectivity, effective connectivity, spectral dynamic causal modeling

# INTRODUCTION

Working memory (WM) refers to the temporary maintenance and manipulation of information that is essential for higher-order cognitive processing, including comprehension, learning, and reasoning (Baddeley, 1992). Using functional imaging, researchers found that WM is associated with the prefrontal cortex, medial and inferior temporal lobes, and areas near the intraparietal sulcus (Baddeley, 2003; Owen et al., 2005; Rottschy et al., 2012), which respectively are implicated in executive functioning, episodic processing and declarative memory, and the phonological store (Baddeley, 2000, 2003; Petersson et al., 2006). Furthermore, regional activation studies reported that WM tasks consistently recruit the dorsolateral prefrontal cortex (dLPFC; linked to encoding and manipulating information), the dorsal anterior cingulate cortex (dACC; implicated in error detection and performance adjustment), and the ventrolateral prefrontal cortex (vLPFC) extending to the anterior insula (involved in retrieving and selecting information and in inhibitory control; Owen, 1997, 2000; Carter et al., 1999; D'Esposito et al., 2000; Ramautar et al., 2006; Aron et al., 2014). These task-based findings have elucidated some aspects of the functional anatomy of WM.

Since a study revealed that resting-state activity is correlated with subsequent WM performance (Hampson et al., 2006), investigators have realized that resting-state functional magnetic resonance imaging (rs-fMRI) is a useful technique for understanding the neural basis of the WM. Subsequent studies confirmed this viewpoint by delineating the correlations between WM performance and intrinsic resting-state activity in brain regions. For example, coherent neuronal activity between the dLPFC and medial prefrontal cortex during rest was revealed to be related to WM accuracy (Hampson et al., 2010). However, complex cognitive functions are not reflected by one or two brain regions but rely on many brain regions. Specifically, the WM network comprises multiple intrinsic connectivity networks. Each of them represents a fundamental aspect of the functional brain organization, which consists of some regions that have similar functions. Therefore, increasing attention has been paid to uncovering the correspondence between the spatial composition of these core regions in the intrinsic organizations and the regions engaged by specific cognitive processes (Smith et al., 2009; Di et al., 2014b). For example, some core regions (e.g., the dLPFC, insular areas, and anterior cingulate cortex) in the intrinsic organizations (e.g., the executive-control and frontoparietal networks) identified using resting-state data have been revealed to be involved in WM (Smith et al., 2009). Another study indicated that the regional amplitudes of the resting-state activity of WM regions could predict the activity of these regions during a WM task (Zou et al., 2013).

Although these studies indicated that some brain regions captured at rest are engaged by WM ability, WM is actually achieved by cooperation between distinct regions (Baddeley, 2003). Hence, investigating the independent functions of core WM-related regions may be inappropriate for delineating the ways these regions are involved in WM. Gordon et al. (2012) attempted to address this issue by studying the spatial similarity between the WM network detected at rest and the regions activated during cognitive tasks and suggested that the patterns of activation during a WM task may result from integrating distinct WM-related regions obtained from data collected during rest. However, neither that study nor subsequent ones (Tu et al., 2012; Gordon et al., 2015) revealed the specific mechanism (such as a causal relationship) for this integration. Therefore, dynamic causal modeling (DCM; Friston et al., 2003), which is able to deduce this causal relationship at rest (Razi et al., 2015), may help to clarify this issue.

In the present study, using a large sample of healthy young adults, we investigated the functional relationship between WM ability and the interactions between core WM regions during rest. We first identified the WM-related regions by linking WM accuracy to resting-state functional connectivity (rsFC) with the bilateral dLPFC, the core region in the central-executive network (CEN) that is most frequently involved in WM (Rottschy et al., 2012). Then spectral dynamic causal modeling (spDCM) was used to quantify the effective (directed) connections between these regions. Finally, a correlation analysis of the effective connectivity and WM accuracy was conducted to investigate the relationship between WM performance and coupling between WM-related regions.

# MATERIALS AND METHODS

# Participants

Two hundred and sixty-four right-handed, healthy young adults (141 females, age: 22.7 ± 2.4 years, education in years: 15.5 ± 2.6) with no history of neurological or psychiatric disease were recruited. Nine participants who did not take the behavioral test were excluded. All participants signed a written, informed consent form that was approved by the Medical Research Ethics Committee of Tianjin Medical University.

# Data Acquisition

Rs-fMRI scanning was performed on a Signa HDx 3.0 Tesla MR scanner (General Electric, Milwaukee, WI, USA). Foam padding was used during the scanning to reduce head motion, and earplugs were used to reduce scanning noise. Rs-fMRI data were obtained using a Single-Shot Echo-Planar Imaging (SS-EPI) sequence with the following acquisition parameters: no gap, 3.75 mm × 3.75 mm × 4.0 mm (voxel size), 2000/30 ms (TR/TE), 240 mm × 240 mm (FOV), 64 × 64 (resolution within slice), 90° (flip angle), 40 transverse slices, and 180 volumes. During the functional magnetic resonance imaging (fMRI) scans, individuals were instructed to keep their eyes closed and relax, move as little as possible, think of nothing in particular, and not fall asleep.

**Abbreviations:** BA, Brodmann area; BMS, Bayesian Model Selection; CEN, central-executive network; DCM, dynamic causal modeling; dACC, dorsal anterior cingulate cortex; dLPFC, dorsolateral prefrontal cortex; FIC, orbital fronto-insular cortex; fMRI, functional magnetic resonance imaging; rsFC, resting-state functional connectivity; rs-fMRI, resting-state functional magnetic resonance imaging; SN, salience network; spDCM, spectral dynamic causal modeling; SPM, Statistical Parametric Mapping; vLPFC, ventrolateral prefrontal cortex; VOIs, volumes of interest; WM, Working Memory.

# Experimental Paradigm

The WM performance of each participant was evaluated with the 2-back task. This task has been widely used, especially in studying the neural basis of WM at rest (Kane and Engle, 2002; Tu et al., 2012; Gordon et al., 2015), since it is moderately difficult (Schmidt et al., 2009). The participants were required to press a button when a letter was the same as the letter they had seen two letters before. The letter stimuli were case-sensitive and chosen from a set of 18 uppercase letters and 18 lowercase letters (all consonants except L, l, W, w, Y, and y). Each letter stimulus appeared for 200 ms, and the inter-stimulus interval was 1800 ms. There were three 2-back WM blocks. Each block consisted of 30 stimuli containing 10 targets and was preceded by an instruction cue. The number of target letters to which the participant responded correctly was used as the WM accuracy. The results of a covariance analysis (p = 0.27) provided no evidence of a speed-accuracy tradeoff. The WM task was performed outside the scanner, and the behavioral assessments were completed within 8 weeks of the fMRI study.
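The scoring rule described above (a target is any letter identical, including case, to the letter two positions earlier, and accuracy is the number of targets that received a button press) can be sketched as follows. The function names, the representation of responses as a list of stimulus indices, and the example sequence are hypothetical.

```python
def two_back_targets(letters):
    """Indices of stimuli that match the letter shown two positions earlier.

    The comparison is case-sensitive, so 'B' and 'b' do not match.
    """
    return [i for i in range(2, len(letters)) if letters[i] == letters[i - 2]]

def wm_accuracy(letters, pressed):
    """Number of target stimuli on which the participant pressed the button."""
    targets = set(two_back_targets(letters))
    return len(targets & set(pressed))

# Toy sequence (hypothetical): positions 2, 3 and 6 are targets;
# position 7 ('B' after 'b') is not, because case differs.
sequence = list("BkBkTbTB")
responses = [2, 6, 7]  # stimulus indices at which a button press was recorded
hits = wm_accuracy(sequence, responses)
```

In this toy sequence the participant responds to two of the three targets, and the press at position 7 is a false alarm that does not count towards accuracy.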

# Preprocessing

Resting-state data preprocessing was performed using Statistical Parametric Mapping (SPM12<sup>1</sup>). After discarding the first 10 time points to allow for magnetization equilibrium, the preprocessing steps for the remaining 170 functional scans included: (1) slice timing correction and realignment to the first volume to provide for head-motion correction; (2) normalization to Montreal Neurological Institute (MNI) space with resampling to 3 × 3 × 3 mm<sup>3</sup>; (3) spatial smoothing with a Gaussian kernel of 6 mm full-width at half maximum; (4) linear detrending; and (5) regressing out nuisance signals (six head motion parameters and global, cerebrospinal fluid, and white matter signals) and temporal band-pass filtering (0.01–0.08 Hz) for the rsFC analysis. Based on the estimated motion correction, 12 participants with more than 2 mm maximum displacement in any of the x, y, or z directions or more than 2° of angular rotation about any axis for any of the 170 volumes were excluded from further analysis. Although whether or not to remove the global signal is debated (Macey et al., 2004; Fox et al., 2009; Weissenbacher et al., 2009; Van Dijk et al., 2010; Saad et al., 2012), for our data, global scaling appeared to improve the specificity of the rsFC analysis.
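The head-motion exclusion criterion can be expressed as a simple check over the six realignment parameters of each volume. The sketch below is illustrative only: the function name and toy values are hypothetical, and it assumes that rotations have already been converted to degrees (to our understanding, SPM realignment files store rotations in radians, so a conversion step would be needed first).

```python
def exceeds_motion_limit(params, max_trans_mm=2.0, max_rot_deg=2.0):
    """Return True if any volume exceeds the translation or rotation limit.

    params: one (tx, ty, tz, rx, ry, rz) tuple per volume, with translations
    in mm and rotations in degrees (assumption: already converted from radians).
    """
    for tx, ty, tz, rx, ry, rz in params:
        if max(abs(tx), abs(ty), abs(tz)) > max_trans_mm:
            return True
        if max(abs(rx), abs(ry), abs(rz)) > max_rot_deg:
            return True
    return False

# Toy realignment parameters for three participants (hypothetical values).
steady = [(0.1, 0.2, 0.0, 0.5, 0.1, 0.0)] * 3
shifted = steady + [(0.0, -2.3, 0.0, 0.0, 0.0, 0.0)]   # 2.3 mm displacement in y
rotated = steady + [(0.0, 0.0, 0.0, 0.0, 2.4, 0.0)]    # 2.4 degree rotation

excluded = [name for name, p in
            [("steady", steady), ("shifted", shifted), ("rotated", rotated)]
            if exceeds_motion_limit(p)]
```

Only the participants with a volume beyond either limit are flagged, which mirrors the "any direction, any axis, any volume" wording of the criterion.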

# Seed-Based rsFC Analysis

The dLPFC is a key region in the CEN, the network most frequently involved in WM (Hampson et al., 2010; Rottschy et al., 2012). Consistent with previous studies (based on rs-fMRI data) that viewed the dLPFC as representative of the CEN (Song et al., 2008; Hampson et al., 2010), the bilateral dLPFC were chosen as seed regions in this study. We followed the conventional rsFC analysis procedure. Specifically, the rsFC was analyzed using the Pearson's correlation coefficient between the time series for each voxel of the whole brain and the average blood oxygen level dependent time series in the left or right dLPFC, which was defined as the left or right Brodmann area (BA) 46 (Zhou et al., 2007; Song et al., 2008). Fisher's transformation was applied to the rsFC to transform the r values to z values. Thus, a whole-brain rsFC map of the bilateral dLPFC was created for each subject. A voxel-wise one-sample t-test for the rsFC map was performed in a group-level analysis to identify significant functional connectivity with the bilateral dLPFC (corrected to p < 0.05 using a cluster-level false discovery rate).
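The two computational steps of the seed-based analysis, Pearson correlation followed by Fisher's r-to-z transformation, can be sketched as follows. This is an illustrative pure-Python version operating on short lists; the function names and the clipping constant are assumptions, and the actual analysis was carried out on whole-brain images.

```python
import math
from typing import List, Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equally long time series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_z(r: float, limit: float = 0.999999) -> float:
    """Fisher's transformation z = atanh(r); r is clipped to avoid infinities."""
    return math.atanh(max(-limit, min(limit, r)))


def seed_fc_map(seed_ts: Sequence[float], voxel_ts: Sequence[Sequence[float]]) -> List[float]:
    """z-transformed correlation of the seed time series with each voxel time series."""
    return [fisher_z(pearson_r(seed_ts, v)) for v in voxel_ts]
```

The resulting z values, one per voxel and subject, are the quantities that would enter a group-level one-sample t-test.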

# Correlation Analysis of the rsFC Data and WM Performance

To investigate the association between the rsFC and WM performance, a correlation analysis of the rsFC and WM was performed using SPM12. Significant correlations were corrected to p < 0.05 using Monte Carlo simulations (Forman et al., 1995) with the following parameters: single voxel p < 0.01, 1000 simulations, full width at half maximum = 6 mm, cluster connection radius = 5 mm, with a mask and a resolution of 3 × 3 × 3 mm<sup>3</sup>. The results of this step provided empirical evidence for defining the connections of the various alternative models in the DCM model space.

# Correlation Analysis of the Effective Connectivity and WM Performance

Using DCM12 in SPM12, an spDCM analysis (Friston et al., 2014), which is an extension of DCM, was used to estimate the effective connectivity (Friston et al., 2003). This method has a number of advantages compared to some other effective connectivity methods (Penny et al., 2004; Friston, 2009, 2011a). DCM provides both neuronal and hemodynamic models. The former is based on low-order approximations to otherwise complicated equations describing the evolution of neuronal states. In the hemodynamic part of the DCM, neuronal activity gives rise to hemodynamic activity by a dynamic process described by an extended balloon model, which is a biophysical model involving a set of hemodynamic state variables, state equations, and hemodynamic parameters (Penny et al., 2004). The parameters of the equations in the neuronal and hemodynamic models encode the strength of the connections and delineate how they change under different conditions. Therefore these parameters are the objects that DCM tries to estimate (Friston, 2009). In effect, spectral DCM is based upon the same sort of convolution models used in conventional (whole brain) analyses of fMRI data. The only differences are that the convolution model is equipped with interregional connections and that the model fitting proceeds in the frequency domain (Razi et al., 2015).

<sup>1</sup>http://www.fil.ion.ucl.ac.uk/spm/

The volumes of interest (VOIs) for the spDCM were identified based on the results of a correlation analysis of the rsFC data and WM performance. These VOIs were specified as binary masks. We extracted subject-specific estimates of the regional time series following the steps in previous studies that used resting-state DCM (Di and Biswal, 2014a; Kahan et al., 2014; Razi et al., 2015). The connectivity models we considered are described in the "Results" Section. The most likely generative models were identified using fixed effects Bayesian Model Selection (BMS; Stephan et al., 2009). The model parameters of the best model were used as summary statistics and entered into a correlation analysis (corrected to p < 0.05 using a cluster-level false discovery rate). This analysis tested for correlations between the effective connectivity and WM performance.
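The logic of fixed effects BMS can be sketched briefly. Under the usual fixed effects assumption that all subjects share the same model, the group log-evidence of a model is the sum of its subject-wise log-evidences, and with a flat prior over models the posterior model probabilities follow from a softmax of these sums. The code below is a sketch under these assumptions with hypothetical names; it is not the SPM implementation.

```python
import math
from typing import List, Sequence, Tuple


def ffx_bms(log_evidence: Sequence[Sequence[float]]) -> Tuple[List[float], List[float]]:
    """Fixed effects BMS from a subjects-by-models table of log-evidences.

    Returns the group log-evidence per model and the posterior model
    probabilities under a flat prior over models.
    """
    n_models = len(log_evidence[0])
    group = [sum(subject[m] for subject in log_evidence) for m in range(n_models)]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(group)
    weights = [math.exp(g - peak) for g in group]
    total = sum(weights)
    return group, [w / total for w in weights]
```

Because log-evidences are summed over subjects, even small consistent differences between models can produce a posterior probability close to 1 for the winning model in a large sample.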

# RESULTS

# Correlation Analysis of the rsFC Data and WM Performance

The rsFC maps based on the bilateral dLPFC are shown in **Figure 1A**. The regions showing significant correlations between the rsFC and WM performance are summarized in **Table 1** and shown in **Figures 1B,C**. Specifically, the rsFC between the bilateral dLPFC and the dACC (Bush et al., 2000) and between the right dLPFC and the left orbital fronto-insular cortex (FIC; Seeley et al., 2007) were significantly greater than zero. Furthermore, the rsFC between the left dLPFC and the bilateral dACC was positively correlated with WM performance (r = 0.263, p < 0.001); the rsFC between the right dLPFC and the bilateral dACC was positively correlated with WM performance (r = 0.222, p < 0.001), and the rsFC between the right dLPFC and the left FIC was also positively correlated with WM performance (r = 0.208, p = 0.001). These results suggest that the stronger the rsFC between the bilateral dLPFC and the bilateral dACC and between the right dLPFC and the left FIC, the better a subject's WM performance.

# Spectral DCM Model Space

In the correlation analysis of the rsFC data and WM performance, the WM-related rsFC of both the left and the right dLPFC involved the same region of the dACC, although the exact extent differed slightly. Therefore, the part of the dACC where the two significant correlations overlapped was selected as a VOI. In addition, we did a search based on the dACC and left FIC to explore whether there were other regions whose rsFC with the dACC or left FIC were correlated with WM performance. Specifically, we considered the dACC and the left FIC as seed regions and implemented rsFC and correlation analyses following the steps described in the "Materials and Methods" Section. However, we did not find any new VOIs. Hence, based on these results, four VOIs were used in this study: the left dLPFC, right dLPFC, dACC, and left FIC.

A large model space containing more than 1000 models would result from considering all combinations of the directed connections between the four VOIs. Therefore, only some (plausible) models were considered for BMS. One purpose of using DCM in our study was to attempt to explain why the rsFC between the four regions is related to WM performance. Given that a close relationship exists between effective connectivity and functional connectivity (Friston, 2011b; Friston et al., 2014; Razi et al., 2015), we constrained the plausible alternative models in DCM space based on WM-related rsFCs. Hence, after extracting the resting-state signals of the four VOIs, we employed a region-wise rsFC analysis. Subsequently, the rsFCs between these VOIs were correlated with WM performance. Only those rsFCs related to WM performance could be considered to have WM-related effective connectivity. The results of the region-wise rsFC analysis did not find any additional WM-related rsFCs other than the three connections which had been revealed in the previous steps. In other words, the WM-related effective connectivity should be considered to be between the bilateral dLPFC and the dACC and between the right dLPFC and the left FIC. Therefore, our model space focused on the connections between these three pairs of regions (**Figure 2A**). Furthermore, there were three possible effective connections between each pair of regions (**Figure 2B**). Thus, the possible combinations resulted in 3<sup>3</sup> = 27 models (**Figure 2C**), which were examined in the spDCM analysis.
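The size of the two model spaces can be checked with a short enumeration. The sketch below assumes that the three options per region pair are a connection in one direction, a connection in the other direction, or a reciprocal connection; these labels are an assumption for illustration, since the options are defined in Figure 2B rather than in the text.

```python
from itertools import product
from typing import Dict, List, Tuple

# The three WM-related region pairs named in the text.
PAIRS: List[Tuple[str, str]] = [
    ("left dLPFC", "dACC"),
    ("right dLPFC", "dACC"),
    ("right dLPFC", "left FIC"),
]

# Assumed labels for the three options per pair (see Figure 2B).
OPTIONS = ("first-to-second", "second-to-first", "reciprocal")


def enumerate_models() -> List[Dict[Tuple[str, str], str]]:
    """All assignments of one connection option to each region pair."""
    return [dict(zip(PAIRS, combo)) for combo in product(OPTIONS, repeat=len(PAIRS))]


def full_space_size(n_regions: int) -> int:
    """Number of models when every directed between-region connection is on or off."""
    return 2 ** (n_regions * (n_regions - 1))
```

With four regions there are 12 directed between-region connections, so the unconstrained space holds 2<sup>12</sup> = 4096 models, whereas the constrained space holds 3<sup>3</sup> = 27.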

# Correlation Analysis of the Effective Connectivity and WM Performance

BMS suggested that the most likely generative model was the third model (**Figure 3A**). This "reciprocal" model was therefore considered to be the optimal one. **Table 2** shows the strength of each effective connection under this model. Supplementary t-tests on the effective connection strengths confirmed that the effective connectivity was extremely reliable across the subjects when tested against the null hypothesis, that is, a connection strength of zero (**Table 2**). The left part of **Figure 3A** shows the effective connectivity between the dACC and the bilateral dLPFC and between the left FIC and the right dLPFC. Although the connections were reciprocal, the effective connections that could predict WM performance were restricted to those from the dACC to the bilateral dLPFC and from the right dLPFC to the left FIC.

The correlations between each effective connection in the winning model and WM performance are shown in **Figure 3B**. Significant positive correlations were found between WM accuracy and the strength of the effective connections from the dACC to the left dLPFC (r = 0.169, p = 0.008), from the dACC to the right dLPFC (r = 0.151, p = 0.019), and from the right dLPFC to the left FIC (r = 0.157, p = 0.014). The three significant correlations all involved afferents to or efferents from the dLPFC.

# DISCUSSION

The present study focused on the causal relationship between core regions of the WM network in healthy people. We found that WM performance was positively correlated with the strength of both functional and effective connectivity between core brain areas belonging to the salience network (SN; i.e., the dACC and left FIC) and the CEN (i.e., the bilateral dLPFC). Although previous WM studies frequently regarded the functional relationship between the dLPFC and dACC (Kondo et al., 2004) as well as between the dLPFC and left FIC (D'Esposito et al., 2000; Clos et al., 2014) as the neural basis of WM, they did not clarify how these two relationships interacted to function within the neural community that forms the WM system. Our work integrated the functions of the bilateral dLPFC, dACC, and left FIC and linked the dynamic causal relationship between these four regions in the resting state to explain individual differences in WM. Our results suggest that the interactive mechanisms of WM can be reflected by the resting-state coupling between these WM regions.

Individual differences in WM have been suggested as relating to differences in brain connectivity, particularly in the higher order association regions (Wang and Liu, 2014). Our findings that the rsFCs of the bilateral dLPFC-dACC and of the right dLPFC-left FIC were related to WM performance support this view. In addition, our findings are highly consistent with previous studies that reported that these regions are reliably co-activated during WM tasks, especially during the 2-back WM task using letter stimuli (Cohen et al., 1994; D'Esposito et al., 2000; Kondo et al., 2004). The dACC and left FIC are core regions of the SN, which identifies the most relevant behavioral stimuli and plays an important role in cognitive control (Menon and Uddin, 2010). Existing findings (Greicius et al., 2003; Fox et al., 2006) have revealed that co-activation among these regions indicates that integration of the CEN with the SN is important for WM. Our results extended this to the resting state and also support a recent finding suggesting that a close correspondence exists between brain activation patterns during a WM task and the engagement of core regions in multiple intrinsic organizations (Gordon et al., 2012).

considered in the Bayesian model selection (BMS). The circles indicate the VOIs used in the spDCM analysis.

BMS showed reciprocal influences between the bilateral dLPFC and the dACC and between the right dLPFC and the left FIC. Many studies, including anatomical studies and cytoarchitectural maps, have revealed a bidirectional coupling between these regions (Watson et al., 2006; Fajardo et al., 2008; Fuster, 2008; Medalla and Barbas, 2010). Most crucially, the correlation analysis in the present study revealed that, during rest, the WM performance was correlated with the dACC→dLPFC connectivity and right dLPFC→left FIC connectivity. Our findings are in accordance with a WM process that is based on a top-down mechanism (D'Esposito et al., 2000; Au Duong et al., 2005; Badre and Wagner, 2007). Specifically, the WM cognitive model (Au Duong et al., 2005) suggests that the dACC supervises the dLPFC during the WM process. This is in agreement with existing findings that revealed that the degree of activity in the dACC could predict the strength of the dLPFC activity in cognitive tasks (Kerns et al., 2004). Therefore, the dACC plays an important role in setting the activity levels of the dLPFC during the WM process (Schneider and Chein, 2003). These findings could be regarded as strong evidence for the observed dACC→dLPFC connectivity in the present study. On the other hand, another study suggested that the left FIC is recruited for preprocessing and maintaining information during the delay interval (the second stage of WM processing), whereas the dLPFC plays a crucial role in manipulating this information (D'Esposito et al., 2000). Signals from the dLPFC select the WM-relevant representation in the left FIC, thus enhancing these representations (Curtis and D'Esposito, 2003). This finding is consistent with the right dLPFC→left FIC connectivity identified in our study. Our findings, therefore, seem to integrate these views into one system and suggest that the directions of the information flows between these regions are responsible for the subsequent WM performance.

arrows represent the effective connections correlating with WM performance. The right graphs show the fixed effects BMS in terms of the log-evidence and model probability. (B) Results of the correlation analysis of the effective connections and WM accuracy. Red regression lines indicate that the correlation between the effective connection and WM accuracy was significant (p < 0.05, corrected); the gray lines indicate that the correlation was not significant.

TABLE 2 | Statistical analysis of the resting-state effective connectivity of the winning spDCM model. The strengths of the effective connections were analyzed using one-sample t-tests.

Our results can be further interpreted from the perspective that the DCM coupling parameters estimated from resting-state data reflect the sensitivity of a target region to its afferent signals (Kahan et al., 2014). This means that modulatory effects on coupling can be conceptualized as an afferent-specific gain modulation of the target (Kahan et al., 2014). In other words, the degree of response of the target is determined by the afferent signals. Therefore, our finding that the increased sensitivity of the bilateral dLPFC to the dACC was correlated with improved WM performance echoes a previous hypothesis (Osaka et al., 2004). In addition, our finding is in line with a recent study of rsfMRI data using a Granger causality analysis that revealed a dominant directional connection from the dACC to the dLPFC in the reciprocal connectivity between these two regions (Uddin et al., 2011). Furthermore, the correlation between better WM performance and the enhanced sensitivity of the dLPFC to the dACC is not surprising given that, as a core region in the SN, the dACC has been demonstrated to adjust temporally inappropriate responses in executive functions during WM (Schneider and Chein, 2003; Miller and D'Esposito, 2005). Specifically, many studies suggest that both the dLPFC and dACC are important regions involved with the executive system (Schneider and Chein, 2003; Barrett et al., 2004; Rueda et al., 2005). In this system, the dACC, which functions as a cognitive activity monitor, determines the level of activity that is sufficient for the executive control signals to produce a successful WM task performance (Schneider and Chein, 2003). Therefore, the interconnectivity between the dACC and the dLPFC provides a pathway through which the dLPFC can accept assistance from the dACC for mediating subsequent cognitive processes (Schneider and Chein, 2003). In summary, the enhanced sensitivity of the dLPFC to the dACC leads to a higher quality of the executive control signals. Thus, our finding suggests that a greater WM ability is related to a better self-adjustment of the executive system.

Our finding that an increased sensitivity of the left FIC to the right dLPFC subserves WM ability is interesting. This is in line with the view that the left FIC is a region that can receive causal outflow from the right dLPFC (Palaniyappan et al., 2013). It is worth mentioning that the left FIC in our finding primarily contained the vLPFC even though the FIC has usually been reported to consist of two parts: the vLPFC (BA 47/45) and anterior insula (Seeley et al., 2007). The left vLPFC has a bias towards cognitive control of memory (Badre and Wagner, 2007). For example, this region is related to the retrieval of relevant knowledge and selection of relevant representations when there is competition between active representations (D'Esposito et al., 2000; Badre and Wagner, 2007). Furthermore, the left FIC is not a strong driver of network dynamics in the CEN (Uddin et al., 2011). Hence, the influence of the left FIC on the right dLPFC should be expected to be lower than that in the reverse direction. The left FIC is a multifunctional integration region, which not only implements the integration of external and internal processes but also has a strong association with semantic and phonological processing (Clos et al., 2014). Indeed, although some findings showed that the left FIC is activated in a selection process, more studies suggested that this region may be generally involved in retrieving, encoding, and selecting abstract information from the memory (Badre and Wagner, 2007; Frings et al., 2009). The degree of this latter group of functions is determined by the demands of executive control (Engle et al., 1999; Curtis and D'Esposito, 2003). Therefore, our results suggest that a higher sensitivity of the left FIC to the dLPFC can contribute to a faster and more precise behavioral performance. In other words, the present study suggests a relationship between WM ability and the modulation of the connectivity strengths inside the executive subsystem of the WM network.

Our aforementioned result may help to clarify one issue. Specifically, in our study, the left FIC mainly involved BA 47, which is in the anterior vLPFC. Several studies suggested that the posterior vLPFC is influenced by the dLPFC and is primarily associated with the maintenance and retrieval of information in the WM (Au Duong et al., 2005). However, other studies have theoretically suggested that BA 47 appears critical in biasing these representations in the WM (Badre and Wagner, 2007). Therefore, according to our findings, it seems reasonable to suggest that BA 47 is more closely related to maintaining information. This is consistent with the hypothesis that, in a WM task, the anterior vLPFC plays a crucial role in maintaining the information manipulated by the dLPFC (D'Esposito et al., 2000).

The present work linked the dynamic interactions of the four core regions to WM ability. Furthermore, the top-down theory that best described the WM process explained why these connections reflect WM performance. Baddeley's model suggests that WM comprises the executive system aided by two or more subsidiary slave systems (e.g., the visuospatial sketch pad, the phonological loop, and the episodic buffer; Baddeley, 2003). If we use this model, our results may elucidate the cerebral substrates of the executive control system, which is the core of the WM model. In fact, Baddeley (2003) suggested that the supervisory attentional system (Norman and Schwartz, 1986) might be the basis for executive control. A subsequent study (Gazzaniga et al., 2002) attributed the attentional functions in the supervisory attentional system primarily to the anterior cingulate cortex. Therefore, consistent with our results, these findings indicated that the causal flows from the dACC to the dLPFC influence executive control. On the other hand, the left FIC has been linked to comparatively simple information processing, such as controlled access to stored conceptual representations (Badre and Wagner, 2007). Hence this region may be an important interface between the executive system and the slave systems, such as the phonological loop and the episodic buffer (Au Duong et al., 2005; Campo et al., 2012). All these findings indicated that, although the dLPFC plays a prominent role in executive control (Baddeley, 2003), executive functioning must be understood in terms of different interactions between regions belonging to different cognitive sub-networks, rather than as a specific association between one region and one higher-level cognitive process (Collette and Van der Linden, 2002). Our results are highly consistent with this view. Further, our study suggests that the degree of coupling is related to WM performance. 
Therefore, our findings not only demonstrated that effective connectivity during rest can predict individual differences in WM but also that it can provide novel insights into the neural substrates of WM.

There are some limitations that should be addressed for this study. First, in our work, the behavioral measure of the WM performance depended on a direct response to the 2-back WM task. Thus it is possible that other factors, for example, individual differences in responsibility, interfered with our results. To address this issue, we used the naive Bayes (reverse inference) method (Poldrack, 2006) in the BrainMap database<sup>2</sup> to examine the intrinsic function of the core regions obtained in our study and found that coactivities in these regions are strongly related to executive function (Bayes factor = 22.3, meaning that the probability of the existence of this relationship is greater than 0.95). Therefore the result of reverse inference suggests that other factors may have had little influence on our observation. Second, our work paid close attention to the WM mechanism relative to the executive system, which is a central aspect of WM based on the dLPFC. Although the four regions in our work are recognized as core regions of the WM network (Owen et al., 2005; Rottschy et al., 2012), we cannot definitely conclude that the WM system can only be summarized by the interactions between these four regions. That is because, as a complex cognitive task, WM consists of many aspects or components. For instance, some studies indicated that the parietal regions are also activated in WM tasks. However, these regions were not observed in this study. This may indicate that the functional segregation of the dLPFC from the parietal cortex can be captured at rest, given that the former has been intimately linked to information manipulation (D'Esposito et al., 2000; Baddeley, 2003) whereas the latter has been implicated in storing phonological short-term memory (Jonides et al., 1998; Baddeley, 2003). However, further evidence is required. Therefore, the dissociable contributions of the dLPFC and parietal regions to WM should be examined with rs-fMRI data in future studies.

<sup>2</sup>http://www.brainmap.org

# CONCLUSION

Based on a large healthy Chinese sample, the present study revealed that the effective connectivity from the dACC to the dLPFC and from the right dLPFC to the left FIC was related to individual differences in WM. Our results suggest that the dLPFC is sensitive to the dACC, which can set the appropriate cognitive signal level to produce a successful WM performance. Moreover, a high sensitivity of the left FIC to signals from the dLPFC was also suggested. This increased sensitivity may help to efficiently manipulate WM-related information. These findings emphasize that this type of causal coupling between core regions in the CEN and SN is necessary for better WM performance during a WM task. Taking the abnormal co-activations of the two intrinsic connectivity networks in mental diseases (e.g., attention-deficit hyperactivity disorder and schizophrenia) into account, the present study might aid in understanding the abnormal interactions between the two networks in diseased populations with impaired WM (Silver et al., 2003; Martinussen et al., 2005).

# AUTHOR CONTRIBUTIONS

TJ designed and supervised the research. KJF supervised DCM analysis. JL collected the data. XF, YZ, YCZ and YW analyzed the data. XF, YZ, YCZ, LC, KJF, and TJ wrote the article.

# FUNDING

This work was partially supported by the National Key Basic Research and Development Program (973; Grant No. 2011CB707800), the Strategic Priority Research Program of the Chinese Academy of Sciences (Grant No. XDB02030300), and the National Natural Science Foundation of China (Grant No. 91132301, 91432302, and 81101000).

# ACKNOWLEDGMENTS

We are very grateful to Drs. Rhoda E. and Edmund F. Perozzi for English and content editing assistance and discussions.

# REFERENCES

(dysgranular) prefrontal cortex of humans. Neurosci. Lett. 435, 215–218. doi: 10.1016/j.neulet.2008.02.048

human connectomics: theory, properties and optimization. J. Neurophysiol. 103, 297–321. doi: 10.1152/jn.00783.2009


**Conflict of Interest Statement**: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Fang, Zhang, Zhou, Cheng, Li, Wang, Friston and Jiang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution and reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Executive function and bilingualism in young and older adults

#### **Shanna Kousaie<sup>1</sup>, Christine Sheppard<sup>1</sup>, Maude Lemieux<sup>2</sup>, Laura Monetta<sup>2,3</sup> and Vanessa Taler<sup>1,4</sup>\***

<sup>1</sup> Bruyère Research Institute (Affiliated with the University of Ottawa), Ottawa, ON, Canada

<sup>2</sup> Département de Réadaptation, Université Laval, Québec City, QC, Canada

<sup>3</sup> Centre de Recherche de l'Institut Universitaire en Santé Mentale de Québec, Québec City, QC, Canada

<sup>4</sup> School of Psychology, University of Ottawa, Ottawa, ON, Canada

#### **Edited by:**

Lynne Ann Barker, Sheffield Hallam University, UK

#### **Reviewed by:**

Jane Morgan, Sheffield Hallam University, UK

Ruth Adam, University of Heidelberg, Germany

#### **\*Correspondence:**

Vanessa Taler, School of Psychology, University of Ottawa, 136 Jean Jacques Lussier, Vanier Hall, Ottawa, ON K1N 6N5, Canada

e-mail: vtaler@uottawa.ca

Research suggests that being bilingual results in advantages on executive control processes and disadvantages on language tasks relative to monolinguals. Furthermore, the executive function advantage is thought to be larger in older than younger adults, suggesting that bilingualism may buffer against age-related changes in executive function. However, there are potential confounds in some of the previous research, as well as inconsistencies in the literature. The goal of the current investigation was to examine the presence of a bilingual advantage in executive control and a bilingual disadvantage on language tasks in the same sample of young and older monolingual anglophones, monolingual francophones, and French/English bilinguals. Participants completed a series of executive function tasks, including a Stroop task, a Simon task, a sustained attention to response task (SART), the Wisconsin Card Sort Test (WCST), and the digit span subtest of the Wechsler Adult Intelligence Scale, and language tasks, including the Boston Naming Test (BNT), and category and letter fluency. The results do not demonstrate an unequivocal advantage for bilinguals on executive function tasks and raise questions about the reliability, robustness and/or specificity of previous findings. The results also did not demonstrate a disadvantage for bilinguals on language tasks. Rather, they suggest that there may be an influence of the language environment. It is concluded that additional research is required to fully characterize any language group differences in both executive function and language tasks.

**Keywords: executive function, executive control, bilingualism, bilingual advantage, aging**

#### **INTRODUCTION**

Executive functions, including inhibition, planning, and task switching, are important for everyday function. It is well established in the literature that normal aging is associated with changes in cognition, including executive functions, such as declines in inhibitory control (Hasher and Zacks, 1988) and processing speed (Salthouse, 1996), as well as language comprehension (Kemper, 2006). More recently, studies have shown that being bilingual may result in more efficient, resilient, and robust executive control processes, leading to superior performance on executive function tasks in bilinguals relative to monolinguals (see Bialystok et al., 2009; Adesope et al., 2010). Furthermore, these language group differences have been found to be larger in older than young adults (Bialystok et al., 2004), and it has been suggested that bilingualism may also delay the onset of Alzheimer's disease symptoms (e.g., Bialystok et al., 2007, 2014; but see Chertkow et al., 2010; Zahodne et al., 2014). Given that over 50% of the world's population is bilingual (Fabbro, 1999) and that older adults are the fastest growing demographic (Centers for Disease Control and Prevention, 2003; Statistics Canada, 2007) there are important implications for a "bilingual advantage" in executive function. The goal of the current investigation was to further examine the bilingual advantage in monolingual anglophones, monolingual francophones, and French/English bilingual young and older adults.

The bilingual advantage refers to findings demonstrating superior performance by bilinguals, relative to monolinguals, on tasks measuring inhibitory control. Specifically, advantages have been observed for bilinguals over monolinguals in interference suppression, which refers to the inhibition of task-irrelevant information, but not in response inhibition, which refers to the inhibition of a prepotent response (Bunge et al., 2002). These two components of inhibition can be differentiated using tasks such as the Stroop (1935) or Simon (Simon and Rudell, 1967) tasks to measure interference suppression, and the sustained attention to response task (SART; Robertson et al., 1997) to measure response inhibition. In the Stroop task an individual is required to inhibit the reading of a color word in order to correctly identify the (incongruent) color of the font that the word is printed in. For example, the word BLUE could be printed in blue ink on congruent trials and red ink on incongruent trials. The correct response would be "blue" and "red", respectively; thus, on incongruent trials the participant would be required to inhibit the dominant word reading response in order to correctly identify the color of the ink. Stroop interference refers to the increase in response time (RT) for incongruent trials relative to neutral trials, where there is no color word information, or congruent trials. In one version of the Simon task the individual is required to ignore the spatial position of a stimulus and respond to some other dimension, such as the direction that an arrow is pointing in. For example, a left lateral key press in response to a leftward pointing arrow presented on the right of the screen would require the participant to ignore that the stimulus was presented on the right and respond only to the direction that the arrow is pointing in using a key on the left side of the keyboard. 
In contrast, the SART requires the participant to withhold a response to an infrequent stimulus, for example the number 3, within a string of stimuli that require a response, such as all other digits.
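The definition of Stroop interference given above amounts to a difference between mean response times. The following is a minimal sketch with a hypothetical function name, assuming response times in milliseconds from correct trials only; the baseline may be either neutral or congruent trials, as described in the text.

```python
from statistics import mean
from typing import Sequence


def stroop_interference(rt_incongruent_ms: Sequence[float], rt_baseline_ms: Sequence[float]) -> float:
    """Mean RT on incongruent trials minus mean RT on baseline trials.

    The baseline is assumed to be neutral or congruent trials; larger
    positive values indicate greater interference.
    """
    return mean(rt_incongruent_ms) - mean(rt_baseline_ms)
```

A smaller interference score for one group than another would be read as more efficient interference suppression in that group.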

The bilingual advantage in interference suppression has been found in children (Bialystok and Martin, 2004), young adults (Bialystok et al., 2005; Costa et al., 2008) and older adults; the effect is the largest in the latter group (Bialystok et al., 2004). It is hypothesized that the constant management of two languages by bilinguals makes use of general executive control processes, for example, inhibiting one language while engaging in the other and effortlessly switching between languages when necessary (Bialystok, 2007, 2011; Bialystok et al., 2012). As a result, bilinguals receive extensive practice in these processes, and this experience is thought to be the mechanism underlying the observed bilingual advantage.

However, an advantage for bilinguals relative to monolinguals is not a consistent finding in the literature. Some research has found similar performance across language groups on executive function tasks (e.g., Kousaie and Phillips, 2012a,b; also see Paap and Greenberg, 2013). It is noteworthy that an advantage for bilinguals relative to monolinguals has also been found in working memory, albeit only for spatial material (Luo et al., 2013).

Interestingly, there is also a well-documented bilingual disadvantage on language tasks (see Michael and Gollan, 2005; Bialystok, 2009), including smaller vocabularies and difficulties with lexical access/retrieval (e.g., lower verbal fluency, more frequent tip-of-the-tongue states, and longer picture naming latencies). For example, bilinguals have demonstrated a disadvantage relative to monolinguals on the Boston Naming Test (BNT; Kaplan et al., 1983; Gollan et al., 2007), which requires participants to name pictures that increase in difficulty as the task progresses, and on working memory for verbal material (Luo et al., 2013). However, in the case of naming tasks, there is some evidence that the bilingual disadvantage may have been overstated in the literature, given that bilinguals show different results depending on the method of scoring. Specifically, accepting responses in either language has been found to result in higher scores for bilinguals relative to an administration in which the bilinguals are required to respond in only one of their languages (Gollan et al., 2007).

Other researchers have examined language group differences in verbal fluency measures and found that bilinguals outperformed monolinguals on letter fluency, which has an executive component, but not on category fluency (Luo et al., 2010). However, others have found a disadvantage for bilinguals on category fluency (Rosselli et al., 2000; Gollan et al., 2002), likely due to the reliance of the category fluency task on linguistic representations. It is noteworthy that Luo et al. (2010) subdivided their bilingual group based on vocabulary size, and bilinguals with a high vocabulary outperformed both monolinguals and bilinguals with a low vocabulary.

Taken together, the available evidence indicates that possessing mastery of two languages results in some advantages on executive control tasks and some disadvantages on language-specific tasks. However, a number of issues have remained unaddressed in this literature, including whether observed advantages are confined to bilinguals who speak specific languages; whether there is a minimum level of proficiency/years of language experience required before an advantage emerges; and whether there is a particular language use profile that is necessary (e.g., one language at home vs. another at school/work). Furthermore, the role of immigration status has not been fully explored. That is, in many of the studies that report advantages for bilinguals relative to monolinguals, a large proportion of the participants were immigrants who varied with respect to their native language (L1), or bilinguals who varied with respect to their second language (L2; Bialystok et al., 2006, 2008; Bialystok, 2006; Martin-Rhee and Bialystok, 2008; Luo et al., 2013). Given the many potential confounds that can be associated with participant characteristics such as immigration (e.g., diet, stress, life history; Chertkow et al., 2010), it is necessary to further examine the reported language group effects.

The goal of this investigation was to examine the bilingual advantage in executive function tasks and the bilingual disadvantage in language tasks in young and older monolingual francophones, monolingual anglophones, and French/English bilinguals in the same relatively well-controlled sample. We attempted to collect data from a comprehensive set of tasks measuring multiple aspects of executive function that have been used previously in the literature. We hypothesized that if there is in fact a robust bilingual advantage, bilinguals should outperform monolinguals on tasks of executive function. Specifically, bilinguals should show superior interference suppression relative to monolinguals (as measured by the Stroop and Simon tasks), but all three language groups should show similar response inhibition (as measured by the SART).

The consequences of bilingualism for working memory and cognitive flexibility are less clear. Given previous findings showing better spatial working memory for bilinguals than monolinguals (Luo et al., 2013), we expected to observe similar results for the digit span subtest of the Wechsler Adult Intelligence Scale (Wechsler, 1997). For the digit span task, participants are required to repeat lists of digits that increase in number first in the forward direction and then in the backward direction, starting with 2 digits and increasing by 1 to a maximum of 9 digits for forward digit span and 8 for backward digit span. We used the digit span task as a measure of working memory and expected that bilinguals would outperform monolinguals. To our knowledge, the only investigations to explore bilingualism and the Wisconsin Card Sorting Test (WCST), which measures cognitive flexibility and set-shifting, have examined bilinguals who frequently switch between their languages and those who do not (with non-switchers outperforming switchers; Festman and Münte, 2012) or compared monolinguals and bilinguals to simultaneous interpreters (with interpreters outperforming monolinguals and bilinguals, who did not differ; Yudes et al., 2011). For the WCST, participants are required to sort a set of cards based on a rule (color, shape, or number), which switches following 10 consecutive correct responses. Participants are not informed of the rule they are supposed to use or when the rule switches; they are only given feedback on whether the current card was sorted correctly or not. Based on previous findings, we did not expect to see any clear differences between the language groups on the WCST.

Given previous findings demonstrating that the bilingual advantage is larger in older adults (Bialystok et al., 2004), we hypothesized that any observed language group effects would be larger in the older adults than in the younger adults. It is also possible that there could be significant language group differences in the older adults that are not observable in young adults, given that the young adults are at the height of cognitive function and may not experience any additional benefit from being bilingual.

With respect to the language tasks, we expected that monolinguals would outperform bilinguals on the BNT, as has been found in previous studies (Gollan et al., 2007). Hypotheses regarding fluency tasks are less straightforward, given that these tasks also comprise an executive component; therefore, based on previous literature we tentatively hypothesized that bilinguals would outperform monolinguals on letter fluency given the executive demands required for this task, and that there would be no language group effect for category fluency (Luo et al., 2010) given the high level of proficiency of the bilinguals included in the present study. We included fluency measures for the letters F, A, and S, and for the category animals following Bialystok et al. (2008).

# **MATERIALS AND METHODS**

#### **PARTICIPANTS**

Participants included monolingual and bilingual young (monolingual: *n* = 70; bilingual: *n* = 51) and older (monolingual: *n* = 61; bilingual: *n* = 36) non-immigrant adults recruited from the Ottawa and Quebec City communities. The monolingual young group comprised 30 French speakers and 40 English speakers, and the monolingual older group comprised 30 French speakers and 31 English speakers. Bilingual participants were relatively equally proficient in French and English, having self-reported high proficiency in their L2 before the age of 13 (see **Table 1**); proficiency in each language was determined using both self-report measures and an animacy judgement task described below (Segalowitz and Frenkiel-Fishman, 2005). Thirty-nine percent of young and 72 percent of older bilingual adults reported French as their native language, and the remainder reported English as their native language. Monolingual French speakers were recruited and tested in Quebec City, where the predominant language is French, while monolingual English speakers and bilinguals were recruited and tested in Ottawa, where the predominant languages are English and French. Within each age group, monolingual francophones, monolingual anglophones and bilinguals were matched for age, education, and general cognitive function as measured by the Montreal Cognitive Assessment (MoCA; Nasreddine et al., 2005), and over 90 percent of participants in each group were right-handed. Monolinguals self-reported native-like ability in all aspects of their languages (i.e., reading, writing, speaking and listening) with minimal exposure to a second language, and bilinguals self-reported minimal exposure to any other languages besides English and French. Participant characteristics are provided in **Tables 2** and **3.**

#### **MATERIALS**

#### **Animacy judgement task**

The animacy judgement task was used as an objective measure of relative second language (L2) proficiency and was based on the task used by Segalowitz and Frenkiel-Fishman (2005). Bilingual participants were presented with nouns on a computer monitor and were required to decide as quickly and accurately as possible whether each noun referred to something living or nonliving using the "1" and "2" keys on the keyboard. The task consisted of two separate language blocks, one in English, followed by one in French. Each comprised 64 trials (32 inanimate nouns and 32 animate nouns) and was preceded by eight practice trials. Monolingual anglophones and monolingual francophones completed only the English or French block, respectively. The standard deviation of RTs for correct trials was divided by the mean RT for correct trials, for each language block separately, to obtain the coefficient of variability (CV), a measure of intraindividual variability in RT. The more similar the CV in a bilingual's L1 and L2, the more equally proficient the bilingual is believed to be in the two languages (see Segalowitz and Segalowitz, 1993). Paired samples *t*-tests were used to compare the CVs in L1 and L2 for the bilingual young and older adults separately.
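The CV computation described above can be sketched as follows. This is a minimal illustration, not the authors' analysis script: the function name, the example RTs, and the use of the sample (n − 1) standard deviation are assumptions, since the text does not say which estimator was used.

```python
from statistics import mean, stdev


def coefficient_of_variability(correct_rts):
    """Return SD / mean of the correct-trial RTs (ms) for one language block.

    Assumption: the sample standard deviation (n - 1 denominator) is used.
    """
    if len(correct_rts) < 2:
        raise ValueError("need at least two correct trials")
    return stdev(correct_rts) / mean(correct_rts)


# Hypothetical correct-trial RTs (ms) for one bilingual participant.
l1_rts = [600, 650, 700, 750, 800]
# L2 responses are 10% slower overall but have the same relative spread.
l2_rts = [660, 715, 770, 825, 880]

cv_l1 = coefficient_of_variability(l1_rts)
cv_l2 = coefficient_of_variability(l2_rts)
print(f"CV L1 = {cv_l1:.3f}, CV L2 = {cv_l2:.3f}")
```

In this hypothetical case the L2 responses are slower, yet the two CVs are identical because the variability scales with the mean. This is the pattern the measure treats as evidence of comparable proficiency.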

**Table 1 | Mean ranking (± standard deviation) for proficiency by modality for both L1 and L2 for bilingual participants.**


Ranking followed a 5 point Likert scale (1 = no ability; 5 = native-like ability).


**Table 2 | Young adult demographic, neuropsychological, executive function, and language task performance (mean ± standard deviation).**

ICN = incongruent color naming; CN = color naming; WR = word reading.

#### **MoCA**

The MoCA (Nasreddine et al., 2005) is a 12-min cognitive screening tool used to assess general cognitive function and detect mild cognitive impairment. The domains assessed include visuospatial and executive control, naming ability, memory, attention, language, abstraction, and orientation. The MoCA is scored out of 30 and a score of 26 or higher is considered normal. It was included here to ensure that all participants had normal cognitive functioning and that the language groups were matched on general cognitive function.<sup>1</sup>

#### **Stroop task**

The Stroop task (Stroop, 1935) was used as a measure of interference suppression. The version of the Stroop task used here included three conditions: word reading, color naming, and interference/incongruent color naming (naming the color of the print of incongruent color words; e.g., the word BLUE printed in red ink). For each condition, participants were presented with a sheet containing 4 columns of 30 stimuli appearing in random order and were asked to complete as many trials as possible in 45 s, starting with the first column and moving downward. In the word reading condition, the color words RED, GREEN, YELLOW, and BLUE were printed in black font and participants were asked to read as many words as possible. In the color naming condition, strings of six X's were printed in either red, green, yellow, or blue font and participants were asked to name the color of as many of the stimuli as possible. In the incongruent condition, the color words RED, GREEN, YELLOW, and BLUE were printed in one of the incongruent colors, with each color-word combination appearing 10 times, and participants were asked to name the font color of as many words as possible without reading the word. All participants completed the word reading condition first, followed by the color naming condition, and the incongruent color naming condition was completed last. Anglophone and bilingual participants performed the task in English, while francophones performed it in French, and responses were recorded using Audacity 2.0 audio recorder and later played back to determine accuracy. The number of correct responses for each condition was counted.

<sup>1</sup>Note that 1 monolingual anglophone, 1 monolingual francophone and 4 bilingual young adults, and 2 monolingual anglophone and 4 bilingual older adults obtained scores of 24 or 25 on the MoCA; however, based on their interactions with the experimenter and performance on other neuropsychological assessments they were deemed to have normal cognitive function. Critically, within each age group the language groups were matched for MoCA score.


**Table 3 | Older adult demographic, neuropsychological, executive function, and language task performance (mean ± standard deviation).**

ICN = incongruent color naming; CN = color naming; WR = word reading.

#### **Simon task**

The Simon task (Simon and Rudell, 1967) was used as another measure of interference suppression. The version of the Simon task used here comprised three conditions: control, reverse, and conflict. In each condition, an arrow was presented on the monitor and participants were instructed to indicate, with the "A" and "L" keys on the keyboard, the direction of the arrow. In the control condition, the arrows appeared at the center of the monitor and participants were required to identify whether the arrow pointed to the left (by pressing the "A" key, located on the left side of the keyboard) or to the right (by pressing the "L" key, located on the right side of the keyboard). In the reverse condition, the arrows appeared at the center of the screen and the participant was required to identify the direction of the arrow using the key on the opposite side of the keyboard; i.e., "A" for a rightward pointing arrow and "L" for a leftward pointing arrow. In the conflict condition, the arrows were presented on either the left or right side of the monitor, creating congruent (e.g., rightward pointing arrow presented on the right) and incongruent trials (e.g., leftward pointing arrow presented on the right). For both the control and reverse conditions, there were two blocks of 48 trials each. For the conflict condition, there were a total of 192 trials split into two blocks, with 48 congruent and 48 incongruent trials in each block. At the beginning of each new condition there was a series of practice trials (one practice trial for each trial type); the order of presentation of the conditions was counterbalanced across participants, and stimuli were presented in randomized order within each condition.

#### **SART**

The SART (Robertson et al., 1997) was used as a measure of response inhibition. For this task, participants were presented with the digits 1 through 9 on the computer screen and were required to press the space bar in response to every number except the number 3, for which no response was required. There were 25 blocks of nine trials, and each number appeared once in each block. The numbers were randomized within each block and the participants were not informed of the number of blocks that they would be completing, or the number of "3"s that would appear in each block. Each trial was preceded by a mask (######) that appeared for 500 ms, and the participant's response initiated the subsequent trial, except when the stimulus was the number 3, which stayed on the screen for 2000 ms.
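The block structure described above (25 blocks, each containing the digits 1 through 9 once in random order) can be sketched as follows. This is only an illustration of the trial list. The study used E-Prime, and the function name and the `seed` parameter here are hypothetical.

```python
import random


def build_sart_sequence(n_blocks=25, seed=None):
    """Return a flat list of digits: n_blocks shuffled copies of 1..9."""
    rng = random.Random(seed)  # seed is only for reproducible illustration
    sequence = []
    for _ in range(n_blocks):
        block = list(range(1, 10))  # each digit appears once per block
        rng.shuffle(block)
        sequence.extend(block)
    return sequence


trials = build_sart_sequence(seed=0)
no_go_trials = trials.count(3)  # trials on which the response must be withheld
print(len(trials), "trials,", no_go_trials, "no-go trials")
```

With this structure the task contains 225 trials, of which 25 (one per block) are no-go trials requiring the participant to withhold the key press.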

#### **Digit span**

The forward and backward digit span subtests of the Wechsler Adult Intelligence Scale III (Wechsler, 1997) were administered as a measure of working memory. In this task, the experimenter read the participant a list of digits that the participant was asked to repeat in either the forward or backward order, depending on the task. The list started with 2 digits and the span increased by 1 digit until a maximum of 9 for the forward and 8 for the backward digit span tasks. There were two trials at each span length resulting in a maximum possible score of 16 for forward digit span and 14 for backward digit span, and the task was discontinued when the participant made an error on both trials at any span length.
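The scoring and discontinue rule described above can be sketched as follows. The sketch assumes one point per correctly repeated list, which is what the stated maxima imply (8 span lengths × 2 trials = 16 forward; 7 span lengths × 2 trials = 14 backward). The function name and input format are hypothetical.

```python
def score_digit_span(trial_results):
    """Score one digit span task.

    trial_results: list of (first_correct, second_correct) pairs, one pair
    per span length in ascending order. Testing stops after a span length
    at which both trials are failed.
    """
    score = 0
    for first_correct, second_correct in trial_results:
        score += int(first_correct) + int(second_correct)
        if not first_correct and not second_correct:
            break  # discontinue rule: both trials failed at this length
    return score


max_forward = score_digit_span([(True, True)] * 8)   # spans 2 through 9
max_backward = score_digit_span([(True, True)] * 7)  # spans 2 through 8

# Hypothetical participant who fails both trials at the third span length;
# the fourth pair is never administered and therefore not counted.
example = score_digit_span([(True, True), (True, False), (False, False), (True, True)])
print(max_forward, max_backward, example)
```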

#### **WCST**

The WCST (Grant and Berg, 1948) measures set-shifting/cognitive flexibility, and a participant's ability to adapt to changing demands and schedules of reinforcement. In this task participants are asked to sort a series of 64 cards based on three possible criteria: color, shape/form, and number. Four cards (one with a single red triangle, the second with 2 green stars, the third with 3 yellow "+" signs, and the fourth with 4 blue circles) are laid down in front of the participant. Participants are instructed to sort the cards (each containing 1–4 of the above-mentioned shapes in any of the four colors) into piles according to the four cards placed in front of them, whereupon the experimenter informs the participant whether the card was sorted correctly or incorrectly. The sorting rule changes each time the participant correctly categorizes 10 consecutive cards; the sorting rule begins as "color", then switches to shape/form, and then to number, and then repeats in this order, following standardized instructions, until all the cards have been sorted. A point is awarded each time the participant achieves a category (i.e., 10 consecutive correct responses), resulting in a maximum score of 6.
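The category score described above can be sketched as a count of completed runs of 10 consecutive correct sorts. This is a simplified illustration with a hypothetical function name. It assumes that the run counter restarts after each rule change, and it takes the correct/incorrect feedback sequence as given rather than modelling the sorting rules themselves.

```python
def count_categories(correct_flags, run_length=10):
    """Count categories achieved from a sequence of correct/incorrect sorts."""
    categories = 0
    run = 0
    for correct in correct_flags:
        run = run + 1 if correct else 0  # an error resets the current run
        if run == run_length:
            categories += 1  # category achieved; the sorting rule changes
            run = 0
    return categories


perfect = count_categories([True] * 64)
one_error = count_categories([True] * 9 + [False] + [True] * 10)
print(perfect, one_error)
```

With 64 cards, a participant who never errs completes six runs of 10 (60 cards), which matches the stated maximum score of 6.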

#### **BNT**

The BNT (Kaplan et al., 1983) is a picture naming task comprising 60 images. Participants are presented with the images one at a time and are asked to name them. BNT performance is used to measure language function and can be used to help diagnose cognitive status (e.g., Mungas et al., 2005). Standardized scoring procedures were used: one point was awarded for each correctly identified image; if a stimulus cue was needed, one point was awarded if the picture was correctly identified following the semantic cue, but not following the phonemic cue. Bilingual participants completed the BNT three times in randomized order: once in French, once in English, and once in a condition where they could respond in either language (bilingual administration); the data from their L1 are reported here.

#### **Verbal fluency**

Participants completed letter fluency for the letters F, A, and S and category fluency for the category animals (Controlled Oral Word Association test; Benton and Hamsher, 1976). In bilinguals this was done in both French and English, as well as in a condition in which they could respond in either language, in randomized order; the data from their L1 are reported here. In the letter fluency task, participants are asked to generate as many words as possible in 1 min, beginning with the specified letter. The total number of words generated was counted, excluding repetitions, numbers, proper nouns and words of the same root (e.g., love, lover, loving). In the category fluency task, participants name as many animals as they can in 1 min. The total number of words generated was counted, excluding repetitions. Responses were recorded using Audacity 2.0 audio recorder<sup>2</sup> and transcribed later.

#### **APPARATUS**

Several tasks were completed on a laptop computer, including the animacy judgement task, the Simon task, and the SART. Regardless of the task, stimuli were presented using E-Prime 2.0 presentation software (Psychology Software Tools, Pittsburgh, PA, USA); however, three different laptops were used to collect the data. At the Quebec City site, the data were collected using a Toshiba Portégé A600 laptop with a 12.1″ screen, Windows 7 operating system and an Intel Centrino 2 processor (all monolingual francophone participants were tested using this hardware). At the Ottawa site, the majority of the participants were tested using a Dell Inspiron Mini with a 10″ screen, Windows XP operating system and Intel Atom processor. However, one monolingual and two bilingual young adults were tested using a Dell Latitude E4310 laptop with a 12.1″ screen, Windows XP operating system and Intel Core i5 processor.

Given that the data were collected using different hardware, several additional analyses were conducted to ensure that there were no systematic differences in the data collected from the different laptops. We conducted independent samples *t*-tests, for the young and older adults separately, comparing the data from monolingual francophones (tested using the Portégé A600 laptop) and monolingual anglophones (tested using the Dell Inspiron Mini laptop) for the Simon and SART tasks. These analyses showed that there were no RT differences in the data for either age group on any of the conditions of the Simon task (control, reverse, and conflict conditions; all *p*s > 0.08); however, monolingual francophones showed longer RTs for the SART than monolingual anglophones (*M* = 80.1 ms for the young adults and 96.4 ms for the older adults). Given that only one monolingual and two bilingual young adults were tested using the Dell Latitude laptop, there were not enough data to run a valid *t*-test to compare the data collected using these different laptops. Following these additional analyses we were confident that combining the data collected with different hardware would not introduce any confounds, except perhaps in the case of the SART.

#### **PROCEDURE**

Data from monolingual anglophones and bilinguals for the current investigation were collected as part of a larger study. Therefore, participants visited the laboratory on two occasions, each lasting between 1.5 and 2 h. Informed consent was obtained and participants completed a series of paper-and-pencil and computerized tasks, including those reported here. At the end of the second session, participants were debriefed and compensated \$10 per hour of participation. This study was approved by the Research Ethics Board at the Bruyère Research Institute and the University of Ottawa.

<sup>2</sup>http://audacity.sourceforge.net

Data from monolingual francophones were collected in two sessions, each lasting 1 h. E-Prime data (i.e., Animacy judgement, Simon and SART) were collected during the second session. Informed consent was obtained at the beginning of the testing session. At the end of the session participants were debriefed and compensated at the rate of \$10 per hour of participation. This study was approved by the Research Ethics Board at Laval University.

All participants followed the same procedure, independent of testing site. Tasks were administered in the following order for the monolinguals: MoCA, verbal fluency and BNT, Simon task, animacy judgement task, Stroop task, WCST, digit span, SART. For bilinguals the first session included the MoCA, verbal fluency and BNT (English, French or bilingual administration), Simon task, animacy judgement task, verbal fluency and BNT (English, French or bilingual administration), and the second session included verbal fluency and BNT (English, French or bilingual administration), Stroop task, WCST, digit span, and SART, administered in that order. The different language administrations were randomized across bilingual participants.

# **RESULTS**

All statistical analyses were conducted using PASW Statistics 18 with an α-level of 0.05, unless otherwise specified. For all RT data, trials with RTs more than 2.5 standard deviations from the mean were excluded as outliers, separately for each participant and condition. Unless otherwise specified, we conducted an analysis of variance (ANOVA) for each of the tasks comparing young and older adults (Age Group), and monolingual francophones, monolingual anglophones, and bilinguals (Language Group). We report all significant main effects and interactions; any significant interactions were followed up with simple effects analyses. All the data are presented in **Tables 2** and **3**. Given that Language Group effects were of primary interest in the current investigation, these effects are summarized for the executive function and language tasks in **Table 4**. Technical difficulties resulted in the loss of a small portion of data; given that this was not consistent across tasks, missing data are reported separately for each task.
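The outlier exclusion described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure as implemented: the function names and data layout are hypothetical, the sample (n − 1) standard deviation is assumed, and trimming is applied in a single pass within each participant-by-condition cell.

```python
from statistics import mean, stdev


def trim_outliers(rts, criterion=2.5):
    """Drop RTs more than `criterion` SDs from the mean of this cell."""
    if len(rts) < 2:
        return list(rts)
    m, sd = mean(rts), stdev(rts)  # assumption: sample SD, single pass
    return [rt for rt in rts if abs(rt - m) <= criterion * sd]


def trim_by_cell(trials, criterion=2.5):
    """trials: iterable of (participant, condition, rt_ms) tuples."""
    cells = {}
    for participant, condition, rt in trials:
        cells.setdefault((participant, condition), []).append(rt)
    return {cell: trim_outliers(rts, criterion) for cell, rts in cells.items()}


# Hypothetical data: one participant, one condition, one very slow trial.
raw_rts = [480, 490, 500, 510, 520, 480, 490, 500, 510, 520, 500, 2000]
trials = [("p01", "conflict", rt) for rt in raw_rts]
trimmed = trim_by_cell(trials)
print(trimmed[("p01", "conflict")])
```

Computing the mean and standard deviation within each cell, rather than across the whole sample, keeps a generally slow participant's typical responses from being flagged as outliers.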

#### **ANIMACY JUDGEMENT TASK**

Data were missing for two young bilinguals. The CV was calculated by dividing the standard deviation for correct trials by the mean RT for correct trials for each participant and language separately. We conducted separate paired samples *t*-tests for the young and older bilingual adults in order to compare the CVs in L1 and L2 and ensure that participants were relatively equally proficient in both of their languages. The *t*-tests revealed no significant difference in the CVs between the L1 and L2 for the young (*t*(48) = −0.42, *p* = 0.67) or the older (*t*(35) = −1.1, *p* = 0.28) adults, suggesting that bilingual participants were highly proficient in their L2.

#### **EXECUTIVE FUNCTION TASKS**

#### **Stroop task**

Data were missing for one young anglophone and three young bilinguals. The repeated measures ANOVA including the within-subjects factor Condition (word reading, color naming, incongruent color naming) revealed main effects of Age Group (*F*(1,208) = 95.93, *MSE* = 324.88, *p* < 0.01, η<sub>p</sub><sup>2</sup> = 0.32), showing that older adults generated fewer responses than young adults; Language Group (*F*(2,208) = 5.02, *MSE* = 324.88, *p* < 0.01, η<sub>p</sub><sup>2</sup> = 0.05), showing that bilinguals generated fewer correct responses than monolingual francophones; and Condition (*F*(2,416) = 2236.35, *MSE* = 87.98, *p* < 0.01, η<sub>p</sub><sup>2</sup> = 0.92), showing a difference between all three conditions, with the highest number of correct responses for word reading and the fewest for incongruent color naming. In addition, there was a Language Group × Condition interaction (*F*(4,416) = 15.45, *MSE* = 87.98, *p* < 0.01, η<sub>p</sub><sup>2</sup> = 0.13), showing that monolingual francophones generated more correct responses than monolingual anglophones and bilinguals for word reading and color naming (all *p*s < 0.05), but fewer correct incongruent color naming responses than both monolingual anglophones (*p* = 0.04) and bilinguals (*p* < 0.01).

**Table 4 | Summary of language group (and language group by age interaction) effects by task.**

BI = bilingual; MF = monolingual francophones; MA = monolingual anglophones; ICN = incongruent color naming; CN = color naming; WR = word reading.
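As a reading aid for the effect sizes reported throughout the Results, partial eta squared can be recovered from an *F* ratio and its degrees of freedom using the standard identity SS<sub>effect</sub> / (SS<sub>effect</sub> + SS<sub>error</sub>) = (*F* × df<sub>effect</sub>) / (*F* × df<sub>effect</sub> + df<sub>error</sub>). The sketch below is an illustration only. The function name is hypothetical, and the text does not state how the reported values were computed.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared implied by an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Values taken from the Stroop analysis reported above.
language_group = partial_eta_squared(5.02, 2, 208)   # main effect of Language Group
interaction = partial_eta_squared(15.45, 4, 416)     # Language Group x Condition
print(round(language_group, 2), round(interaction, 2))
```

Both values round to the effect sizes reported in the text (0.05 and 0.13).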

In addition, we analyzed language group differences in two different measures of Stroop interference (i.e., the decrease in correct responses for incongruent color naming relative to a neutral condition). Interference was calculated relative to both word reading and color naming by subtracting the number of correct responses for the incongruent color naming condition from the number of correct responses for the word reading condition and from the color naming condition, respectively. Young adults showed less interference than older adults when interference was calculated relative to both word reading (*F*(1,208) = 8.42, *MSE* = 229.58, *p* < 0.01, η<sub>p</sub><sup>2</sup> = 0.04) and color naming (*F*(1,209) = 3.72, *MSE* = 96.18, *p* = 0.055, η<sub>p</sub><sup>2</sup> = 0.02). In terms of language group differences, monolingual francophones showed greater interference than monolingual anglophones and bilinguals when interference was relative to word reading (*F*(2,208) = 13.89, *MSE* = 229.58, *p* < 0.01, η<sub>p</sub><sup>2</sup> = 0.12). When interference was relative to color naming, however, all three language groups differed, with the least interference demonstrated by bilinguals and the most by monolingual francophones (*F*(2,209) = 50.72, *MSE* = 96.18, *p* < 0.01, η<sub>p</sub><sup>2</sup> = 0.33).
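The two interference scores described above are simple differences between correct-response counts. A minimal sketch follows, with a hypothetical function name and made-up counts for a single participant.

```python
def stroop_interference(word_reading, color_naming, incongruent):
    """Return both interference scores from the three 45-s correct-response counts."""
    return {
        "relative_to_word_reading": word_reading - incongruent,
        "relative_to_color_naming": color_naming - incongruent,
    }


# Hypothetical counts for one participant (not data from the study).
scores = stroop_interference(word_reading=100, color_naming=75, incongruent=45)
print(scores)
```

Because word reading typically yields more correct responses than color naming, the word reading baseline produces the larger interference score for the same participant.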

#### **Simon task**

Data were missing for 1 young and 1 older francophone, 1 older anglophone, and 1 young and 3 older bilinguals. The data for conditions with central (control and reverse) and lateral (conflict) presentation were analyzed separately. A repeated measures ANOVA including the within-subjects factor Condition (control vs. reverse) revealed faster responses for young than older adults (main effect of Age Group, *F*(1,204) = 260.7, *MSE* = 38681.4, *p* < 0.01, η<sub>p</sub><sup>2</sup> = 0.56). We also found a main effect of Condition (*F*(1,204) = 334.3, *MSE* = 10625.5, *p* < 0.01, η<sub>p</sub><sup>2</sup> = 0.62), whereby responses were faster in the control than reverse condition. Finally, an Age × Condition interaction (*F*(1,204) = 95.9, *MSE* = 10625.5, *p* < 0.01, η<sub>p</sub><sup>2</sup> = 0.32) showed that the increase in RT for the reverse relative to the control condition was larger in older than young adults (287.84 vs. 87.03 ms). A second repeated measures ANOVA was conducted to compare congruent and incongruent trials in the conflict condition. There were main effects of Age (*F*(1,205) = 205.8, *MSE* = 30307.5, *p* < 0.01, η<sub>p</sub><sup>2</sup> = 0.50) and Trial Type (*F*(1,205) = 58.2, *MSE* = 654.5, *p* < 0.01, η<sub>p</sub><sup>2</sup> = 0.22), showing faster responses for young adults and congruent trials relative to older adults and incongruent trials, respectively. There was also a significant interaction between Age and Trial Type (*F*(1,205) = 63.0, *MSE* = 654.5, *p* < 0.01, η<sub>p</sub><sup>2</sup> = 0.24), demonstrating that only the older adults showed an increase in RT for incongruent relative to congruent trials (39.49 ms).

Of critical interest in this task were Language Group differences in interference suppression. We subtracted the RT for congruent trials from the RT for incongruent trials within the conflict condition to obtain an interference score and conducted an ANOVA on these scores with Language Group and Age Group as the between-subjects factors. Overall, older adults showed larger interference effects than younger adults (main effect of Age Group, *F*(1,205) = 63.04, *MSE* = 1309.02, *p* < 0.01, η<sub>p</sub><sup>2</sup> = 0.24), and monolingual anglophones showed larger interference effects than monolingual francophones (main effect of Language Group, *F*(2,205) = 3.09, *MSE* = 1309.02, *p* = 0.05, η<sub>p</sub><sup>2</sup> = 0.03).
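The Simon interference score described above can be sketched per participant as the difference between mean incongruent and mean congruent RT in the conflict condition. The function name and data layout are hypothetical. Restricting the means to correct trials is an assumption, since the text does not specify how errors were handled for this score.

```python
from statistics import mean


def simon_interference(trials):
    """trials: iterable of (trial_type, rt_ms, correct) for one participant.

    Assumption: only correct trials contribute to the mean RTs.
    """
    congruent = [rt for kind, rt, ok in trials if ok and kind == "congruent"]
    incongruent = [rt for kind, rt, ok in trials if ok and kind == "incongruent"]
    return mean(incongruent) - mean(congruent)


# Hypothetical conflict-condition trials for one participant.
conflict_trials = [
    ("congruent", 400, True),
    ("congruent", 420, True),
    ("incongruent", 450, True),
    ("incongruent", 470, True),
    ("incongruent", 900, False),  # error trial, excluded under the assumption
]
interference = simon_interference(conflict_trials)
print(interference)
```

A positive score indicates slower responses when the arrow's screen position conflicts with its direction.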

#### **SART**

Analysis of data from the SART revealed a main effect of Age Group (*F*(1,204) = 81.34, *MSE* = 9458.51, *p* < 0.01, η<sub>p</sub><sup>2</sup> = 0.29), whereby older adults responded more slowly than young adults. We also found a main effect of Language Group (*F*(2,204) = 15.0, *MSE* = 9458.51, *p* < 0.01, η<sub>p</sub><sup>2</sup> = 0.13), whereby RTs were longer in monolingual francophones than in monolingual anglophones and bilinguals. Moreover, older adults committed fewer errors than young adults (*F*(1,204) = 41.27, *MSE* = 8.69, *p* < 0.01, η<sub>p</sub><sup>2</sup> = 0.17).

### **Digit span**

A separate ANOVA was conducted for the forward and backward digit span tasks. There was a main effect of Age Group on the forward digit span (*F*(1,212) = 9.07, *MSE* = 4.52, *p* < 0.01, η<sub>p</sub><sup>2</sup> = 0.04), showing that the young adults achieved higher scores than the older adults. However, an Age Group × Language Group interaction (*F*(2,212) = 3.69, *MSE* = 4.52, *p* = 0.03, η<sub>p</sub><sup>2</sup> = 0.03) demonstrated that young adults achieved higher scores than older adults in the monolingual francophone (*p* < 0.01) and bilingual groups only (*p* = 0.02), whereas there was no effect of Age Group in monolingual anglophones (*p* = 0.67).

Analysis of the backward digit span revealed an Age Group × Language Group interaction (*F*(2,212) = 4.35, *MSE* = 5.17, *p* = 0.01, η<sub>p</sub><sup>2</sup> = 0.04), whereby young monolingual francophones achieved higher scores than older monolingual francophones (*p* = 0.02). The simple effect of Age Group was not significant in monolingual anglophones (*p* = 0.10) or bilinguals (*p* = 0.13).

# **WCST**

Analysis of the WCST revealed that young adults obtained more categories than older adults (main effect of Age Group, *F*(1,209) = 38.09, *MSE* = 0.95, *p* < 0.01, η<sub>p</sub><sup>2</sup> = 0.15), and monolingual francophones achieved more categories than monolingual anglophones and bilinguals (main effect of Language Group, *F*(2,209) = 3.91, *MSE* = 0.95, *p* = 0.02, η<sub>p</sub><sup>2</sup> = 0.04).

#### **LANGUAGE TASKS**

For these analyses only data from the bilinguals' L1 were included.

# **BNT**

Analysis of the BNT revealed a main effect of Language Group (*F*(2,209) = 24.77, *MSE* = 39.65, *p* < 0.01, η<sub>p</sub><sup>2</sup> = 0.19), whereby anglophones obtained more correct responses than francophones and bilinguals.

# **Verbal fluency**

The total number of words generated for each of the letter fluency tasks (F, A, and S) was summed to obtain a single score for each participant. The analysis revealed that monolingual anglophones generated more words than monolingual francophones and bilinguals (main effect of Language Group, *F*(2,210) = 4.47, *MSE* = 121.2, *p* = 0.01, η<sub>p</sub><sup>2</sup> = 0.04). In category fluency, young adults generated more animal names than older adults (main effect of Age Group, *F*(1,210) = 37.66, *MSE* = 30.36, *p* < 0.01, η<sub>p</sub><sup>2</sup> = 0.15). There was also a trend toward a main effect of Language Group (*F*(2,210) = 2.90, *MSE* = 30.36, *p* = 0.06, η<sub>p</sub><sup>2</sup> = 0.03), whereby monolingual anglophones generated more animal names than monolingual francophones.

# **DISCUSSION**

The goal of the current investigation was to further examine language group differences in executive function and language tasks in a group of young and older monolingual francophones, monolingual anglophones and French/English bilinguals. Previous research has found that bilinguals demonstrate advantages on tasks of executive function and disadvantages on language tasks, relative to monolinguals. A larger advantage on executive function tasks has been reported in older adults relative to young adults. However, questions arise with respect to some socio-demographic variables (e.g., immigration status) of some of the samples studied in previous research. Therefore, in the current investigation we controlled for immigration status and languages spoken. We hypothesized that, if there is a robust executive function advantage for bilinguals, it should emerge in our data, and it should be larger in older than younger adults. We also expected to replicate previous findings showing disadvantages for bilinguals on language tasks.

Measures of interference suppression (Stroop and Simon tasks), response inhibition (SART), working memory (forward and backward digit span) and cognitive flexibility (WCST) were used to assess executive function. Our hypotheses regarding interference suppression and response inhibition were clear: bilinguals should show better performance and less interference than monolinguals on the Stroop and Simon tasks, and all groups should perform similarly on the SART. In general, these hypotheses were not supported.

Specifically, in the Stroop task, there was weak support for a bilingual advantage in that monolingual francophones produced fewer incongruent color naming responses than bilinguals (and monolingual anglophones, who did not differ from bilinguals). Monolingual francophones also exhibited greater interference (i.e., greater decrease in the number of correct responses for incongruent color naming relative to a neutral condition) than bilinguals, regardless of how the interference score was computed (i.e., relative to word reading or color naming). Monolingual anglophones also demonstrated more interference than bilinguals (but less than monolingual francophones), but only when interference was calculated relative to color naming. For the Simon task, there were no language group effects for the raw RT data. Furthermore, in comparison with either monolingual group, bilinguals did not show smaller interference, despite monolingual anglophones showing larger interference than monolingual francophones. Given that both the Simon task and the Stroop task measure interference suppression, it is interesting that the two tasks result in contrasting findings. It is unclear why this is the case; one possibility is that it is due to the Stroop task having a language component. Finally, monolingual francophones showed longer RTs for the SART than monolingual anglophones and bilinguals, who did not differ from each other. The hypotheses regarding working memory and cognitive flexibility were less straightforward. The results showed that there were no differences between monolinguals and bilinguals for the forward or backward digit span, and monolingual francophones outperformed monolingual anglophones and bilinguals (who did not differ) on the WCST.

The results of the executive function tasks do not provide clear evidence for a bilingual advantage and the findings are not consistent across the tasks. For Simon interference and cognitive flexibility, monolingual francophones show an advantage relative to both monolingual anglophones and bilinguals; for the Stroop task and response inhibition, in contrast, monolingual francophones show a disadvantage relative to the two other groups. It is important to note that there was a significant difference in RT for the SART between monolingual anglophones and francophones who were tested using different laptop computers; language group differences for response inhibition may thus be due to the use of different testing equipment rather than a true language group effect.<sup>3</sup> The only result supporting a purely bilingual advantage was for Stroop interference, where bilinguals showed smaller interference relative to both monolingual groups.

These findings suggest that the language group differences observed here are the result of something other than bilingualism. The two monolingual groups were tested in different locations, therefore complicating interpretation of the results; however, data that were collected using different computers were compared and found to be similar, with the exception of the SART. Therefore, the results suggest that there may be a cultural effect driving the observed language group differences. This is an interesting possibility given that French is the predominant language in Quebec City, whereas English and French are both commonly used in the city of Ottawa. This implies that the English monolinguals included here may have been exposed to French on a more regular basis whereas the French monolinguals are not exposed to English with the same frequency. However, critically, monolingual anglophones and bilinguals, who were living in and tested in the same location (i.e., Ottawa, Ontario), did not differ on any of the tasks except Stroop interference. Furthermore, there were no instances in which any language group differences were larger for older than young adults.

<sup>3</sup>The independent samples *t*-tests that we conducted indicated that the use of different testing computers across sites did not result in systematic differences in RT data for the Simon task, despite differences for the SART task. Given that the critical test of the bilingual advantage is language group differences in Simon interference, the use of different computers poses less of a risk in terms of our conclusions for the Simon task given that any effect of hardware would be similar across conditions.

In order to assess any disadvantages in language performance, the BNT, letter and category fluency tasks were included in the test battery. It was hypothesized that monolinguals would show an advantage for the BNT; however, the results indicate an advantage only for monolingual anglophones relative to monolingual francophones and bilinguals, which partially supports our hypothesis. It is unclear why only the anglophones show an advantage relative to the bilinguals and perform better than the francophones; this may suggest that the BNT is more difficult in French than in English, or that the items are less prototypical in French culture, resulting in a familiarity effect. Although there is no empirical evidence to support the claim that the BNT is more difficult in French, there is some evidence that the difficulty of the BNT varies based on language background and the languages that an individual knows (Roberts et al., 2002; Rosselli et al., 2012). It is noteworthy that Roberts et al. included French Canadians in their investigation, and French-English bilinguals performed worse than English monolinguals.

The results of the fluency tasks do not support our hypotheses: monolingual anglophones produced more correct responses than monolingual francophones on the category fluency task, while in the letter fluency task, monolingual anglophones outperformed both monolingual francophones and bilinguals. The scoring method that was used entailed combining the scores to obtain a composite score for the three letter fluency tasks, as has been done in previous investigations (Bialystok et al., 2008). It is possible that an alternative scoring method, such as examining clustering (i.e., generating similar items close together in sequence) and switching (i.e., shifting from one subcategory to another; Troyer et al., 1997) would reveal language group differences; this possibility should be explored in future research.

Taken together, the results of the executive function and language tasks raise questions about the reliability, robustness, and specificity of the purported "bilingual advantage". There are several possible reasons why the data reported here do not support previous findings, all of which imply that the bilingual advantage may be less robust, or more specific than previously suggested. One such explanation is that in addition to being non-immigrants, bilingual participants in this study likely have a very different language-use profile than bilinguals included in other studies. That is, the language environment of bilingual participants included here exposes them to both of their languages on a very regular and consistent basis in most situations that they encounter, given the bilingual nature of the city of Ottawa. This language use/exposure differs from that of many other bilinguals who vary with respect to their two languages and may use each of their languages in very specific and separate situations (e.g., one language at home, the other language at school/work). It is possible that these language-use differences affect the cognitive consequences of bilingualism. Recently, there has been substantial interest in codeswitching and how this behavior may lead to different cognitive outcomes. For example, Festman and Münte (2012) found that individuals who exhibited cross-language interference on a bilingual picture naming task (i.e., language switchers) performed worse than non-switchers on the WCST and a flanker task.

It is also possible that our measures of interference suppression were not sensitive enough to detect differences between monolinguals and bilinguals, particularly in the young adults. That is, language group differences in young adults who are at the height of cognitive function may be more subtle and difficult to detect. This is supported by previous research that has found language group differences in brain-based measures, but no differences in behavior (e.g., Bialystok et al., 2005; Kousaie and Phillips, 2012b). However, the current results do show canonical interference and age effects, suggesting that the tasks themselves were effective at introducing interference. Another possible explanation related to task sensitivity is that the tasks were too long, allowing monolinguals enough practice to overcome any initial disadvantage relative to bilinguals. Previous research has found that over the course of multiple blocks of trials there is convergence between the performance of monolinguals and bilinguals (Bialystok et al., 2004; also see Hilchey and Klein, 2011). In order to address this possibility we conducted a supplemental analysis of the Simon interference effect including the first block of trials only. This analysis revealed findings similar to those of the overall interference analysis, indicating that the lack of a bilingual advantage in Simon interference is unlikely to be the result of too many trials. Unfortunately, given the nature of the Stroop task it was not possible to examine any practice effects on Stroop interference. Finally, our tasks may not have been challenging enough; others have found that the bilingual advantage only emerges under conditions that are demanding of monitoring processes (e.g., Bialystok et al., 2004; Bialystok, 2006; Costa et al., 2009).

Interestingly, there was some evidence of a monolingual francophone advantage in Stroop word reading and color naming, Simon interference, and the WCST. As previously mentioned, given the possibility of confounds resulting from different testing environments, this finding is difficult to interpret. However, it may suggest the existence of effects associated with the specific language(s) that an individual speaks. Alternatively, the specific environment in which an individual lives may exert an effect; for example, in Quebec City the predominant language is French, and monolinguals are likely exposed to English much less frequently than anglophones in the bilingual city of Ottawa are to French. This finding merits further investigation; future research should compare monolingual francophones and French/English bilinguals in the same location and testing environment.

In conclusion, the current investigation does not provide convincing support for a bilingual advantage on executive function tasks and does not replicate previous findings of a bilingual disadvantage for language tasks. Our conclusions are tentative given the difficulties associated with interpreting null results; however, the data presented here raise questions with respect to the robustness, reliability and specificity of such an advantage. Despite the limitations of this study (e.g., two groups of monolinguals from different locations) it is clear that additional research is required to fully characterize both the potential advantages and disadvantages associated with being bilingual. Given the importance of executive function and language-based tasks for neuropsychological assessment, this area of research has important clinical implications and it is imperative that we understand the consequences of bilingualism on the performance of these tasks. In terms of broader implications, the current investigation demonstrates the importance of context to the development of executive function processes, particularly language context. Given the influence of highly plastic language functions on executive function processes, an argument can be made for the utility of other cognitive training programs for the recovery of executive function in populations experiencing deficits resulting from age-related decline and/or neuropathology.

# **ACKNOWLEDGMENTS**

This research was supported by a Canadian Institutes of Health Research Catalyst Grant awarded to Vanessa Taler and Shanna Kousaie, and an Alzheimer Society of Canada Research Grant awarded to Vanessa Taler, Laura Monetta, and Shanna Kousaie. The authors would like to thank the research assistants and participants for their important contributions to this research. Special thanks go to Chloé Corbeil, Julien Blacklock, and Dominique Fijal for assistance with data collection.

#### **REFERENCES**


evidence from executive function tests. *Neuropsychology* 28, 290–304. doi: 10.1037/neu0000023


in elderly Hispanics and non-Hispanic Whites. *J. Int. Neuropsychol. Soc.* 11, 620–630. doi: 10.1017/s1355617705050745


**Conflict of Interest Statement**: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 30 January 2014; accepted: 02 July 2014; published online: 25 July 2014*. *Citation: Kousaie S, Sheppard C, Lemieux M, Monetta L and Taler V (2014) Executive function and bilingualism in young and older adults. Front. Behav. Neurosci. 8:250. doi: 10.3389/fnbeh.2014.00250*

*This article was submitted to the journal Frontiers in Behavioral Neuroscience*. *Copyright © 2014 Kousaie, Sheppard, Lemieux, Monetta and Taler. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms*.

# Neurobehavioral Abnormalities Associated with Executive Dysfunction after Traumatic Brain Injury

#### Rodger Ll. Wood<sup>1</sup> and Andrew Worthington<sup>2</sup> \*

<sup>1</sup>Clinical Neuropsychology, College of Medicine, Swansea University, Swansea, United Kingdom, <sup>2</sup>College of Medicine and College of Human and Health Sciences, Swansea University, Swansea, United Kingdom

Objective: This article will address how anomalies of executive function after traumatic brain injury (TBI) can translate into altered social behavior that has an impact on a person's capacity to live safely and independently in the community.

Method: Review of literature on executive and neurobehavioral function linked to cognitive ageing in neurologically healthy populations and late neurocognitive effects of serious TBI. Information was collated from internet searches involving MEDLINE, PubMed, PsycINFO and Google Scholar as well as the authors' own catalogs.

Conclusions: The conventional distinction between cognitive and emotional-behavioral sequelae of TBI is shown to be superficial in the light of increasing evidence that executive skills are critical for integrating and appraising environmental events in terms of cognitive, emotional and social significance. This is undertaken through multiple frontosubcortical pathways within which it is possible to identify a predominantly dorsolateral network that subserves executive control of attention and cognition (so-called cold executive processes) and orbito-frontal/ventro-medial pathways that underpin the hot executive skills that drive much of behavior in daily life. TBI frequently involves disruption to both sets of executive functions but research is increasingly demonstrating the role of hot executive deficits underpinning a wide range of neurobehavioral disorders that compromise relationships, functional independence and mental capacity in daily life.

Keywords: neurobehavioral disorder, executive dysfunction, decision making, traumatic brain injury, brain injury rehabilitation

# INTRODUCTION

Executive functions represent higher level cognitive abilities that underpin many aspects of social cognition and interpersonal behavior, starting in infancy with the onset of attention control (Anderson et al., 2002) and ability to inhibit overlearned behavior (Jurado and Rosselli, 2007), progressing to the ability to infer others' mental states (Stone et al., 1998; Zelazo and Carlson, 2012). Improvements in selective attention, working memory (WM) and problem solving occur throughout adolescence linked to late myelination and synaptogenesis in the frontal regions (Fuster, 2002; Blakemore and Choudhury, 2006). Executive functions are largely mediated by the pre-frontal cortex (Stuss, 1992; Stuss and Levine, 2002) and are therefore especially vulnerable to the mechanical forces associated with traumatic brain injury (TBI; see Bigler, 2013). For this reason, executive dysfunction lies at the heart of neurobehavioral disability (Wood, 2013) and can act as a major constraint upon an individual's capacity for social independence. However, the way executive dysfunction and neurobehavioral disability are expressed in terms of social handicap depends upon which functions of the prefrontal system are compromised by TBI. Whilst the cognitive components of executive ability are reasonably well understood by the majority of clinical practitioners, the way in which executive dysfunction can undermine social cognition and behavioral self-regulation is often less clear. The aim of this article is to review how anomalies of executive function after TBI can translate into disorders of social behavior that have an impact on a person's capacity to live safely, and independently, in the community.

#### Edited by:

Lynne Ann Barker, Sheffield Hallam University, United Kingdom

#### Reviewed by:

Carlos Tomaz, Universidade Ceuma, Brazil

Gregg Stanwood, Florida State University, United States

\*Correspondence:

Andrew Worthington aworthington@headwise.org.uk

Received: 20 July 2017 Accepted: 03 October 2017 Published: 26 October 2017

#### Citation:

Wood RL and Worthington A (2017) Neurobehavioral Abnormalities Associated with Executive Dysfunction after Traumatic Brain Injury. Front. Behav. Neurosci. 11:195. doi: 10.3389/fnbeh.2017.00195

# COGNITIVE—BEHAVIORAL ASPECTS OF EXECUTIVE DYSFUNCTION

The predominant cognitive deficits associated with executive dysfunction involve: (1) problems with planning, organizing, and prioritizing; (2) a lack of attentional flexibility; (3) impaired concept formation; (4) poor WM; and (5) an inability to monitor and adapt behavior consistent with changing social circumstances. These processes underpin rational thinking and for this reason are often referred to as "cold" executive functions, involving logic and reasoning. They are associated with the dorsolateral pre-frontal cortical regions (Chan et al., 2008). They are distinguished from "hot" executive functions which process emotionally salient information and comprise: (1) empathy; (2) theory of mind (ToM); (3) social judgment; and (4) emotion regulation. Hot executive functions are mediated by the ventromedial and orbito-frontal cortices (Chan et al., 2008; McDonald, 2013; Baez and Ibanez, 2014) which are implicated in the appraisal of emotional and motivational significance of events, and have both direct and indirect impact on social cognition and interpersonal behavior (Chan et al., 2008; McDonald, 2013). Indeed, our need for social interaction has led some to consider that all higher brain functions, including episodic and emotional memories, our capacity for abstract reasoning, and metacognition, have evolved solely to support interpersonal behavior and social cohesion (Frith, 2012; Shea et al., 2014). Brain activity in either hot or cold neural circuits is associated with reciprocal inhibition (Goel and Dolan, 2003) such that when one is operational the other is suppressed, which may explain why it is so difficult after TBI to exert self-control over emotional impulses.

Both hot and cold executive functions play a role in social cognition. The cold executive functions outlined above draw upon attention, memory and language and play an important role in maintaining meaningful social interaction. However, these functions also rely on the ability to evaluate and interpret emotional and mental states intrinsic to social interaction, thereby involving neural substrates associated with hot executive ability. Adults with autistic spectrum disorder, for example, have been shown to have impairments in hot executive functions compared with controls matched on cold executive functions (Zimmerman et al., 2016). Conversely, hot executive processes can predominate when people who lack mental flexibility have difficulty finding alternative ways to resolve a complex situation. Similarly, the inability to monitor and update the contents of WM, characteristic of cold executive function, can undermine goal-directed behaviors, as information relevant to the intended action would not be updated and taken into account in action planning. This can result in hot executive processes driving impulsive behaviors which take precedence over pre-planned actions.

# DISORDERS OF IMPULSE CONTROL

Dickman (1990) distinguished between functional impulsivity (the ability to act without delay under pressure) and dysfunctional impulsivity, when acting without forethought leads to maladaptive responses. The latter epitomizes an imbalance between reflective and impulsive mechanisms (Strack and Deutsch, 2004) which, in the context of TBI, usually reflects an abnormality of those brain functions that mediate self-regulation. This leads to impulsive behavior that contributes to such diverse deficits as poor tolerance, impulsive aggression, poor emotional decision-making and an amoral (pseudopsychopathic) disposition which, in combination, have implications for mental capacity. However, even though disorders of impulse control represent a frequent legacy of TBI, they remain poorly understood and are not always easy to recognize.

Barratt's (1959) influential three-factor model of impulsivity differentiated motor impulsivity (acting without thinking), cognitive impulsivity (reflecting quick decision making) and nonplanning impulsivity (which is largely a combination of the cognitive and motor components that represents a reactive form of behavior). Patton et al. (1995) separated the motor activation component of impulsivity (acting on the spur of the moment) from two cognitive components: attention failure (not focusing on the task at hand) and an executive deficit involving a lack of planning (not thinking carefully about options). Alternatively, failure at the cognitive level may be due to an inability to inhibit pre-potent responses and resist proactive interference in WM that characterized impulsiveness after TBI (Rochat et al., 2013).

More recently Whiteside and Lynam (2001) introduced the UPPS four dimensional model of impulsivity (Urgency, Perseverance, Premeditation and Sensation-seeking). Urgency refers to the tendency to experience and act on strong impulses, frequently under conditions of negative affect. Perseverance (lack of) refers to an individual's inability to remain focused on a task that may be boring or difficult. Premeditation (lack of) refers to the inability to think and reflect on the consequences of an act before engaging in that act. Sensation seeking refers to the tendency to enjoy activities that are exciting and the willingness to try new experiences. The neural basis for each of these putative stages remains to be detailed.

It is clear from the various constructs referred to above that there may be several substrates of inhibitory control that mediate impulsive behavior, each linked to different regions of the prefrontal cortex. Bechara and Van Der Linden (2005) proposed that the ventromedial prefrontal cortex (vmPFC) and orbito-frontal cortex (OFC) are generally considered the principal regions controlling self-regulated behavior. vmPFC dysfunction influences how inhibitory control mediates decision making, such as preparing to act (Brass and von Cramon, 2002), adaptive thinking to switch between response alternatives (Dove et al., 2000), and inhibiting inappropriate responses during strategy tasks (Shallice and Burgess, 1991). OFC dysfunction was considered by Fuster (1999) to undermine capacity for response inhibition, a process that normally helps maintain goal-directed behavior. Sohlberg and Mateer (2001) regard impulse control as an ability, mediated primarily by the OFC, to inhibit automatic response tendencies, which usually allows flexible goal-directed behavior. They proposed that an impairment of response inhibition may result in impulsive responding, stimulus-boundedness and perseveration that can have an adverse impact on various forms of decision making. In extreme forms, utilization behavior may be exhibited, which Shallice et al. (1989) argued arose when perceptual attributes of an object automatically trigger a behavioral response in the absence of supervisory attentional control. Fuster (1999) argued that impulsive responding to random environmental stimuli arose as a consequence of distractibility, thereby side-tracking planned, goal-directed behavior, and providing a link between attention control (a cold executive function) and poor impulse control. Consistent with this hypothesis, Horn et al. (2003), using fMRI, showed that on a response inhibition task (Go/No-Go) impulsive adults showed greater brain activation in paralimbic areas whereas less impulsive individuals showed higher levels of activation in cortical association regions.

# DISORDERS OF INHIBITORY CONTROL

Disorders of inhibitory control are a category of neurobehavioral deficit that includes impulsive behavior but also behavior which is not impulsive but is otherwise ill-judged or inappropriate in the broader social context in which it occurs. Thus, whilst poor inhibitory control underlies impulsive acts, many disinhibited behaviors are more accurately understood as a failure to appraise the action in context and recognize the normal social constraints on certain behaviors. The problem of disinhibition may be one of nature (the behavior is inappropriate to a specific context) or degree (the behavior is carried out to extreme levels which makes it unacceptable). Illustrations of the former include liberal use of swear words in a family setting that might be tolerated in some workplaces, and sexual innuendo that might be acceptable within an intimate relationship with a partner but would be intrusive or overly-personal in other social contexts. Research on the impact of TBI has identified social and sexual disinhibition as significant neurobehavioral factors affecting the quality of relationships (Anderson et al., 2002; Wood et al., 2005).

Physiologically, inhibition operates at different levels, the highest of which is voluntary inhibition which is centered on executive processes, although some consider inhibition to be a behavioral manifestation of several different executive processes (Bari and Robbins, 2013). Braver (2012) proposed a dual process model of inhibition whereby a proactive control mode reflects sustained and anticipatory maintenance of goal-relevant information and a reactive control mode responds to stimulus-driven influences. De Pisapia and Braver (2006) placed these systems within the anterior cingulate (AC) and prefrontal cortex respectively, though the lateral prefrontal cortex is also involved in maintaining top-down goal representations, whilst posterior and subcortical regions are engaged in task-specific processing. This allows attentional selection of goal-related information when faced with competing stimulus demands. Recent interest has focused on the role of the right inferior frontal gyrus (Chikazoe et al., 2007; Hampshire et al., 2010). Damage in this area is associated with disruption of inhibitory processes (Aron et al., 2003) whilst direct stimulation can reduce impulsive behavior (Jacobson et al., 2011). Aron et al. (2014) have argued that this area is a key part of a fronto-subcortical braking system that is normally under executive control and mediates contextually appropriate behavior. Evidence is accumulating that multiple brain regions are recruited in maintaining socially acceptable behavior, mediated largely by right hemisphere areas (Starkstein and Robinson, 1997; Garavan et al., 2006).

# IMPULSIVE AGGRESSION

Impulsive aggression is distinguished by a hair-trigger response following minimal provocation, usually out of proportion to the precipitating event (Barratt et al., 1997). In its verbal form it has been described by Dyer et al. (2006) as the principal aggressive trait after brain injury. It is clinically distinguishable from irritability that reflects a loosening of constraints on reactions to everyday trials and tribulations after frontal brain injury (so-called frontal irritability). Fontaine and Dodge (2006) argue that impulsive aggression arises due to automatic access to a habitual behavior, a minimal acceptability threshold and lack of executive controls. Denson et al. (2011) proposed that the threshold for eliciting anger reduces if people ruminate on a trigger event and that, for people with reduced executive control, resources deployed to interrupt rumination can lead to further depletion of self-control and potentially increase aggression risk. Greve et al. (2001) observed an association between post-TBI impulsive aggression and a premorbid history of irritability, impulsive or antisocial behavior. Amongst violent offenders, Barratt et al. (1997) found impulsiveness per se was not sufficient for impulsive aggression unless poor verbal skills, in the form of developmental dyslexia, were also present. When verbal deficits interacted with low arousability thresholds, impulsive aggression could be triggered in situations of conflict. This was supported by Baker and Ireland (2007), who demonstrated higher rates of dyslexia amongst offenders than non-offenders, with dyslexic traits correlating with executive difficulties and impulsiveness. In addition, dyslexic traits were linked to more violent offences. It is also consistent with recent research on brain injury in offenders showing a link between TBI and more violent offences (Pitman et al., 2015).

Impulsive aggression has been associated with poor inhibitory control in the face of social threat at the level of the orbitofrontal and medial prefrontal cortex, resulting in an inability to control emotions generated by limbic structures such as the amygdala (Coccaro et al., 2007). This is consistent with notions that aggressive urges may be caused by hyperexcitation in a corticolimbic arousal system that includes the amygdala, AC and ventral prefrontal cortex (Keele, 2005; Brown et al., 2006).

# DISORDERS OF ATTENTIONAL CONTROL

Attention control, also known as executive attention, refers to an individual's capacity to choose what they pay attention to and what they ignore (Mirsky et al., 1991; Posner, 1994). Attention control therefore helps maintain a focus on task-relevant information in the presence of internal and external distraction. Thus, attention control (similar to inhibitory control) is needed to direct purposeful behavior by inhibiting the influence of irrelevant representations from gaining access to WM, within which task-related goals are retained in the face of interference by extraneous stimuli, which otherwise could disrupt the ability to maintain focus on a task and work towards achieving a goal.

Attention control is primarily mediated by prefrontal areas (including the AC cortex) that activate, regulate, and monitor how information is received and processed. It is therefore thought to be closely related to other executive functions that mediate many aspects of social cognition and human interaction (Posner and Petersen, 1990; Astle and Scerif, 2011). Posner and Petersen (1990) proposed an interactive system comprising three functional networks: alertness (maintaining awareness), orientation (information from sensory input) and executive control (resolving conflict). After TBI, difficulties can arise due to an inability to sustain attention (Whyte et al., 1995) or to attend selectively (Park et al., 1999), or due to depletion of overall attention resources (Azouvi et al., 2004). A disturbance that specifically affects the flexible allocation of attention towards internal representations and external information could contribute to an impression of apathy and a lack of initiative, by making the person incapable of coordinating intentions in response to changing environmental stimuli. Inattention to environmental cues can cause misperception of situations; for example, failure to register nonverbal cues can undermine social interaction (McDonald and Flanagan, 2004). Low levels of attentional control are also thought to increase chances of developing anxiety because the ability to shift one's focus away from threat information is important in processing emotions. Attentional bias can cause a person to process emotionally negative information preferentially over emotionally positive information (Astle and Scerif, 2011).

The ability to interact in a flexible and creative way with the environment is essential for psychological health and community independence. A reduction in attention capacity and control means that people are restricted in the amount of information they can accommodate in WM. Therefore they are limited in the extent to which they can think or respond to alternatives or deal with changing environmental events. This often leads to a rigid style of thinking and behaving reflected by repetitive or stereotyped behavior. Such individuals often live according to a pre-planned schedule or timetable that guides their activities during the day. If events conspire to demand changes to their schedule, many individuals fail to adapt. They can therefore behave in a manner inappropriate to the situation and/or exhibit outbursts of temper because they cannot cope with the frustration and uncertainty generated by unpredictable changes to a planned schedule.

Burgess et al. (2007) hypothesized that a ''supervisory attentional gateway system'' flexibly allocates attention towards either internal stimuli, such as mental action plans to achieve goals or to deal with emotional states, or towards external information from the environment that demands flexibility to adjust to changing circumstances. This cognitive control mechanism, which relies mainly on the activity of the rostral prefrontal cortex (RPFC; Brodmann's area 10), may support a wide range of situations critical to competent human behavior in everyday life, such as multitasking or remembering to carry out intended actions after a delay (Burgess et al., 2007). When this mechanism fails, the result can be frustration, angry outbursts, feelings of inadequacy and despondency.

# COMPULSIVE BEHAVIOR

There is a lack of clarity about the emergence of de novo obsessive-compulsive behavior after TBI. Using a psychiatric frame of reference for obsessive compulsive disorder (OCD), van Reekum et al. (2000) estimated a prevalence of 6.4%, twice as common as in the general population. However, Berthier et al. (1996) state that OCD has rarely been described after TBI except in individual case studies (Drummond and Gravestock, 1988; Jenike and Brandon, 1988; Donovan and Barry, 1994; Max et al., 1995) or small series lacking control groups (McKeon et al., 1984; Kant et al., 1996). Whilst no formal estimates are available from large scale controlled studies, clinical experience suggests that OCD after TBI is less common than, and often different in character from, the emergence of compulsive or stereotyped behaviors that, whilst not meeting the DSM-5 criteria for OCD, nevertheless act as a constraint on adaptive social behavior. Indeed, the absence of anxiety in many cases led Wood (2001) to suggest that after TBI, novel patterns of compulsive behavior seem better classified as obsessive compulsive personality (DSM-5 301.4) than OCD (DSM-5 300.3).

In many respects, compulsive behavior after TBI appears to be an extension of a pre-accident personality characteristic, such that a person who was always methodical and organized exhibits a more concrete or rigid style of thinking leading to stereotyped behavior patterns. Unlike OCD in a psychiatric context, obsessive thoughts, urges or images which the individual tries to suppress, often associated with fears about contamination, are far less common, or intrusive, than compulsive tendencies to maintain order (what Bond, 1984, described as ''organic orderliness''). Hoarding behavior, sometimes referred to as abnormal ''collecting drives'', seems to be associated with an inability to decide what is useful and should be retained, as opposed to what amounts to ''junk''. Anderson et al. (2005) described compulsive collecting behavior as indiscriminate acquisition behavior and diminished discarding behavior that was blatant, repetitive and generally non-selective. Many individuals' disinclination to discard objects persisted even when the ''collections'' led to significant negative consequences.

Habitual checking behavior, which develops as a novel response after TBI, is often linked to failures of WM, reflecting a lack of confidence about whether or not an action (turning off the gas, electrics, etc.) has been carried out, leading to checking rituals which then develop as a habit response (see Zitterl et al., 2000). Radomsky et al. (2001) asserted that problems of attention control and information processing underpin some aspects of obsessive compulsive behavior. This view is supported by Savage et al. (2000), who suggested that memory impairment in the compulsive element of OCD is secondary to deficits of WM and executive function because patients focus on memorizing specific details but fail to have a conceptual overview, preventing details adding to a general understanding of the ''big picture''.

The neurobiological basis for obsessive behavior post-TBI is unclear. Saxena et al. (1998) proposed an orbitofrontosubcortical circuit as responsible for OCD which could easily be implicated in obsessive behaviors after TBI. The circuit involves projections from the OFC to the head of the caudate nucleus and ventral striatum, then to the mediodorsal thalamus via the internal pallidus, and finally returning from the thalamus to the OFC, a circuit which also includes connection with the basal ganglia, a system that mediates many aspects of cognition and executive function. Therefore, mechanisms of injury in TBI could have an impact on this circuit in a variety of ways. Figee et al. (2013) reviewed 37 case reports of patients with acquired OCD due to acquired brain injury and suggested that lesions in the cortico-striato-thalamic circuit, parietal and temporal cortex, cerebellum and brainstem may induce compulsivity. Post-traumatic hoarding behaviors have been associated with mesial prefrontal damage. Anderson et al. (2005) investigated the occurrence of abnormal collecting behavior resulting from focal brain damage and found it could result from damage to the right mesial prefrontal region, at the level of the AC and the frontal pole.

# DECISION MAKING

Decision-making reflects a process in which a choice is made after reflecting on the consequences of that choice. Fontaine and Dodge (2006) proposed that real-time decision making involves multiple stages of evaluation which they characterized as follows: ''an individual responds to a social stimulus by perceiving stimulus cues (step 1: encoding), making social inferences about the stimulus and social context (step 2: interpretation), clarifying his or her own personal interests (step 3: clarification of goals), generating alternative ways to respond to the stimulus (step 4: response access or construction), evaluating these alternatives, considering their possible consequences, selecting the preferred response for enactment (step 5: response decision), and carrying out the selected behavior in response to the stimulus (step 6: enactment)'' (p. 606). However, after TBI, many individuals exhibit poor judgment and pursue actions that lead to bad outcomes in such a manner that suggests an inability to learn from experience. Consequently, they repeat the same mistakes and lack the ability to anticipate the likely outcome of decisions. The failure to learn from repeated mistakes, against a background of normal intelligence, memory, speech, sensation, and movement, has been referred to as the frontal paradox (Walsh, 1985; Wood, 2001) and represents a dislocation between normal performance on measures of cognition compared to abnormal performance in the application of cognition in everyday life. Such individuals seem to exercise poor judgment when choosing friends and partners, or engage in activities that place them at some kind of risk.

The failure of decision-making that is so obvious in community activities is often not reflected by performance on standardized clinical tests. This was one factor that led Damasio (1996) to propose a theory of decision making influenced by emotional factors, referred to as the Somatic Marker Hypothesis (SMH). The central feature of this theory is that emotion-related signals (somatic markers) assist cognitive processes in implementing decisions, especially when the outcome is ambiguous. Some somatic markers can operate below a level of conscious awareness yet bias behavioral actions, a notion that influenced Bechara's development of the Iowa Gambling Task (Bechara et al., 2000). The Iowa Gambling Task examines decision-making by asking participants to make choices in circumstances that mimic real-life situations because of elements of uncertainty, reward, and punishment. This decision-making mechanism has parallels with personality traits represented by ''non-planning impulsivity'', i.e., living for the moment and disregard for the future (Patton et al., 1995) or a lack of ''premeditation'', the absence of thinking and reflecting on the consequences of an act before engaging in that act (Whiteside and Lynam, 2001).

Acting without thought of the consequences is considered by many to be a cardinal feature of altered personality after TBI. Impulsive decision making and poor social judgment are often accompanied by shallow affect and a lack of concern for social values, usually associated with right hemisphere prefrontal injury. The pattern of behavior after TBI has been referred to as pseudo-psychopathy (Blumer and Benson, 1975), or acquired sociopathy (Blair and Cipolotti, 2000). These terms describe the personalities of a subset of patients who lack adult tact and restraint, in association with poor social judgment and short-lived enthusiasm for ill-judged projects. Euphoric mood is sometimes accompanied by emotionally labile and erratic behavior, with low tolerance of frustration, leading to shallow irritability and impulsive aggression. Such individuals exhibit a jocular, often puerile sense of humor, making facetious comments or acting in a manner that reflects a lack of tact and restraint, usually in the form of social and/or sexual disinhibition. They exhibit a tendency to hold a favorable view of themselves that is at odds with how they are seen by others.

Disordered behavior and personality that reflects poor decision-making has been associated with injury to the vmPFC and medial orbitofrontal cortex (mOFC; Blair and Cipolotti, 2000; Bechara and Van Der Linden, 2005). Damage to medial prefrontal cortex is linked to a range of deficits in reward sensitivity, emotion-based learning and decision making (Young and Koenigs, 2007; Gläscher et al., 2009). Ventromedial damage also makes people less inclined to take into account future consequences (Bechara et al., 1994) and thus incentives to action are less effective. This may help explain the increased tendency to discount future rewards after TBI, favoring short term gains in decision making (McHugh and Wood, 2008; Wood and McHugh, 2013).

# EMOTIONAL DEFICITS

The perception of emotionally salient information involves a complex and diverse neural system which includes the ventral striatum, specific thalamic nuclei, the amygdala, the anterior insula and regions of the prefrontal cortex (Davidson et al., 2000). At a cortical level, the ventromedial and ventrolateral prefrontal regions appear to be of particular importance for the generation of emotional responses. The ventrolateral prefrontal cortex responds to emotional information, including the induction of sad mood and the recall of personal memories and emotional material (Drevets, 2000). Functional neuroimaging has also identified dorsal regions of the AC gyrus and dorsomedial and dorsolateral prefrontal cortices in selective attention, planning and motor responses to emotional stimuli (Ochsner et al., 2002; Phillips, 2003).

Given the extensive and complex prefrontal systems mediating emotion it is unsurprising that emotion and social conduct are intimately linked (Bibby and McDonald, 2005; Henry et al., 2006; Muller et al., 2010), though it is often difficult to establish the nature of this association. For example, Milders et al. (2008) failed to find any link between cognitive flexibility, emotion recognition, or ToM and proxy ratings of behavior problems 1 year post-TBI. Likewise, McDonald et al. (2014) did not find evidence of a specific ToM contribution to social communication. Aboulafia-Brakha et al. (2011) were also unable to identify a specific executive process common to ToM tasks. McDonald (2013) considered the distinction between ''hot'' and ''cold'' aspects of social cognition as a basis for evaluating and interpreting a social situation from the perspective of empathy. Hot social cognition was associated with emotional empathy (empathizing with the affective state of another person and actually experiencing the same emotions, but not necessarily to the same degree). Cold social cognition mediates cognitive empathy (which allows us to objectively recognize another person's emotional state without becoming emotionally involved ourselves). Both forms of empathy are important in social interaction. However, an awareness of the context in which emotion is exhibited is also important. This can influence how we should behave in different social or environmental settings. For example, shouting emotionally charged comments whilst standing on the terrace of one's local soccer club may be considered acceptable and even appropriate. However, the same behavior exhibited in the middle of one's local supermarket would probably result in being arrested.

Diminished ability to experience emotion usually translates into an inability to empathize. Empathy has been described as the ''binding force'' of social cognition, allowing individuals to share experiences and understand each other's perspectives (Eslinger, 1998). Unsurprisingly therefore, a lack of empathy can contribute to the fragility of relationships when a partner who was previously loving and affectionate becomes, after TBI, emotionally withdrawn and indifferent. Wood and Williams (2008) explored the capacity for emotional empathy in a cohort of 89 head-injured patients, 60.7% of whom recorded low levels of emotional empathy, compared to 31% of a demographically matched control group drawn from the general population. Whilst often maintaining a respectable social and emotional veneer, their behavior towards close friends and family was perceived as emotionally indifferent, with a self-indulgent attitude, often with an amoral disposition that was out of character with the individual's pre-accident behavior.

Williams and Wood (2010) found a link between a lack of emotional empathy after TBI and the presence of alexithymia, a multifaceted personality construct comprising difficulty identifying feelings; distinguishing between feelings and bodily sensations of emotional arousal; describing feelings to other people; constricted imaginal processes evidenced by paucity of fantasies; and a stimulus-bound, externally-oriented thinking style (Taylor et al., 1997). The impaired emotion processing and regulating capacities thought to underpin alexithymia have led to it being conceptualized as one of several possible post-TBI personality risk factors underpinning a lack of emotion recognition and responsiveness (Wood and Williams, 2007, 2008; Williams and Wood, 2010). **Figure 1** presents a schematic representation of key brain structures and circuits highlighted in this review.

# CONCLUSION

The mechanics of TBI render especially vulnerable the frontotemporal regions and associated subcortical structures such as the cingulate, amygdala, striatum and insula that are intimately connected to prefrontal cortex. Orbito-frontal and ventromedial areas particularly have been implicated in a wide range of emotional and behavioral sequelae of TBI arising from disruption to hot executive functions, whilst damage to dorsolateral regions is typically associated with disturbance of cold executive processes. This distinction provides a useful means of characterizing the diverse nature of executive deficits that underlie neurobehavioral disorders and explains why traditional tests of executive abilities are often inadequate to encapsulate the range of real life problems often experienced after TBI.

# AUTHOR CONTRIBUTIONS

The two authors are jointly responsible for the content of the article, and both contributed significantly to the submitted version. RW as first author wrote the first draft of the article; AW as second author reviewed and rewrote the article, adding significant new material which was subject to final amendment by RW and approval by AW.


**Conflict of Interest Statement**: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Wood and Worthington. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# **Metacognitive Aspects of Executive Function Are Highly Associated with Social Functioning on Parent-Rated Measures in Children with Autism Spectrum Disorder**

#### *Tonje Torske<sup>1</sup>\*, Terje Nærland<sup>2,3</sup>, Merete G. Øie<sup>4,5</sup>, Nina Stenberg<sup>6</sup> and Ole A. Andreassen<sup>3</sup>*

<sup>1</sup> Division of Mental Health and Addiction, Vestre Viken Hospital Trust, Drammen, Norway, <sup>2</sup> NevSom Department of Rare Disorders and Disabilities, Oslo University Hospital, Oslo, Norway, <sup>3</sup> NORMENT, KG Jebsen Centre for Psychosis Research, University of Oslo and Oslo University Hospital, Oslo, Norway, <sup>4</sup> Department of Psychology, University of Oslo, Oslo, Norway, <sup>5</sup> Research Department, Innlandet Hospital Trust, Lillehammer, Norway, <sup>6</sup> Division of Mental Health and Addiction, Oslo University Hospital, Oslo, Norway

#### *Edited by:*

Lynne Ann Barker, Sheffield Hallam University, United Kingdom

#### *Reviewed by:*

Christos Frantzidis, Aristotle University of Thessaloniki, Greece; Margot J. Taylor, Hospital for Sick Children, Canada

> *\*Correspondence:* Tonje Torske tonje.torske@vestreviken.no

*Received:* 15 March 2017 *Accepted:* 19 December 2017 *Published:* 10 January 2018

#### *Citation:*

Torske T, Nærland T, Øie MG, Stenberg N and Andreassen OA (2018) Metacognitive Aspects of Executive Function Are Highly Associated with Social Functioning on Parent-Rated Measures in Children with Autism Spectrum Disorder. Front. Behav. Neurosci. 11:258. doi: 10.3389/fnbeh.2017.00258

Autism Spectrum Disorder (ASD) is characterized by social dysfunction. Even though executive dysfunction has been recognized as important in understanding ASD, the findings are inconsistent. This might be due to different definitions of executive function (EF), which part of EF has been studied, structured vs. unstructured tasks, inclusion of different moderators (age, IQ, sex) and different diagnostic categories within the spectrum. The main finding is that people with ASD have more EF difficulties than normal controls and more difficulties on open-ended tasks than on structured cognitive tasks. Since some EF difficulties may not be observable in a laboratory setting, informant measures might have higher ecological validity than neuropsychological tests. Evidence suggests that executive dysfunctions are associated with social impairments, but few studies have investigated the details of this relationship, and it remains unclear what types of EF deficits are relevant for the social problems of individuals with ASD. Here we investigated which EF domains were associated with various domains of social function on parent-rated measures. A total of 86 children and adolescents with a diagnosis of ASD were included and tested for general cognitive abilities. Parents completed the Behavior Rating Inventory of Executive Function (BRIEF) and the Social Responsiveness Scale (SRS). Multiple regression analysis revealed significant associations between SRS scores and age, sex, total IQ and the BRIEF indexes. The Metacognition Index from the BRIEF added significantly to the prediction of the SRS total score and the subscales Social Communication, Social Motivation and Autistic Mannerisms.
The findings suggest that metacognitive aspects of EF are of particular importance for social abilities in children and adolescents with ASD. Earlier research has shown that typically developing (TD) children have a different relationship between EF and social function than children with ASD: in TD children, the EF domain related to behavioral regulation was found to be most important to social function. The results from the current study may have implications for understanding the cognitive components of the social problems that define ASD, and may be relevant in developing more targeted clinical EF interventions related to core ASD dysfunctions.

**Keywords: executive function, social function, autism spectrum disorder (ASD), behavior rating inventory of executive function, social responsiveness scale**

# **INTRODUCTION**

Autism Spectrum Disorder (ASD) is characterized by persistent deficits in social communication and social interaction across multiple contexts, and restricted, repetitive patterns of behavior, interests, or activities (American Psychiatric Association, 2013). Executive function (EF) deficits are common in children with ASD (Pennington and Ozonoff, 1996; Hill, 2004; Geurts et al., 2014a), but not part of the diagnostic criteria. Furthermore, EF correlates strongly with adaptive behavior (Gilotty et al., 2002) and influences Quality of Life (QoL) in children with ASD (de Vries and Geurts, 2015). EF is often defined as the processes of physical, cognitive, and emotional self-control and self-regulation that are necessary to maintain effective goal-directed behavior (Pennington and Ozonoff, 1996; Corbett et al., 2009; Diamond, 2013). EF comprises several components including inhibition, working memory, flexibility, emotional control, initiation, planning, organization, and self-control (Miyake et al., 2000; Hill, 2004). Even though executive dysfunction has been recognized as important in understanding ASD, the findings are inconsistent (Van Eylen et al., 2015). One explanation might be that EF is an umbrella term comprising several components, and researchers have focused on different subdomains. Meta-analyses and reviews have been written about domains like inhibition and interference control (Geurts et al., 2014b), cognitive flexibility (Leung and Zakzanis, 2014) and working memory in ASD (Barendse et al., 2013; Wang et al., 2017). All conclude that patients with ASD have EF deficits within the investigated areas, but not all found differences between ASD and normal controls on neuropsychological testing. Furthermore, the most consistent and striking difficulties are seen on tasks that are open-ended in structure, lack explicit instructions and involve arbitrary rules (White, 2013; Van Eylen et al., 2015).
Therefore, some of the inconsistency is suggested to be due to different types of measurements (parent-rated measures vs. neuropsychological testing). Individuals with ASD often display pronounced EF deficits in daily life, while performing adequately on highly structured neuropsychological tasks (Kenworthy et al., 2008). The presence of comorbidities like Attention Deficit Hyperactivity Disorder (ADHD) also increases the risk of EF difficulties (Craig et al., 2016). Most research has focused on data from neuropsychological assessments of EF and/or how EF impairment is related to a diagnosis of ASD (Hill, 2004; Leung et al., 2015). Since some EF difficulties may not be observable in a laboratory setting, informant measures might have higher ecological validity than neuropsychological tests (Kenworthy et al., 2008). For this reason, questionnaires have been developed to investigate EF deficits in everyday life settings (Gioia et al., 2000). A frequently used scale is the Behavior Rating Inventory of Executive Function (BRIEF) (Gioia et al., 2000). The BRIEF is divided into a Behavioral Regulation Index (BRI) and a Metacognition Index (MI), which together form a Global Executive Composite (GEC). The BRI comprises the child's ability to modulate both behavior and emotional control, and the ability to move flexibly from one activity to another. The MI is related to the child's ability for active problem solving, and to initiate, organize and monitor their own actions (Gioia et al., 2000). Deficits in metacognitive aspects of EF (MI) have been shown in earlier studies to be of particular importance to adaptive functioning in high functioning children with ASD (Gilotty et al., 2002).

While both social and EF deficits in ASD have been extensively studied separately, there has been limited research on the *relationship between* EF and social function. Some findings underpin the relationship between EF and key social concepts in ASD like *joint attention* and *Theory of Mind* (*ToM*). Dawson et al. (2002) argue that performance on ventromedial prefrontal EF tasks is strongly correlated to joint attention ability in young children. Joint attention is an important prerequisite for social functioning and often impaired in ASD (Dawson et al., 2002). Pellicano (2010) found that individual differences in EF in early life predicted change in children's ToM skills. This line of research has to a large extent been based on laboratory measures and neuropsychological test results. Leung et al. (2015) explored the role of EF in social impairment in ASD using informant-based measures. They reported that both BRI and MI from the BRIEF predicted social functioning in children with ASD, while only BRI predicted social functioning in the general population. In contrast, Kenworthy et al. (2009) found that EF, such as behavior regulation from BRIEF and semantic fluency and divided auditory attention, correlated with autistic symptoms. In an objective neuropsychological assessment of EF in ASD children, Landa and Goldberg (2005) found no relationship between EF and social skills.

Due to the heterogeneity of ASD (Lai et al., 2014), more insight may also be gained from studying the range of social difficulties beyond diagnostic categories. The Social Responsiveness Scale (SRS) (Constantino and Gruber, 2005) is a questionnaire designed to identify the presence and severity of social impairments associated with ASD. The SRS consists of five subscales which were developed to differentiate between the subcategories of social impairments for children with ASD (Constantino and Gruber, 2005). This continuous scale allows trait quantification and focuses on functions more closely linked to activities in everyday life that diagnostic tools might fail to capture (Achenbach, 2011; Nelson et al., 2016). Most findings support a one-factor model of SRS (Constantino et al., 2004; Bolte et al., 2008), while others have found evidence for several dimensions (Nelson et al., 2016). Although most research with the SRS has focused on the total score only, knowledge about how the different subscales are related to EF will provide new and more differentiated knowledge about the social deficiencies in ASD and may have important implications for interventions.

Furthermore, it remains unclear what types of EF deficits are most relevant for the social problems of children and adolescents with ASD. It is known that the intelligence quotient (IQ), age and sex are factors that might influence the EF in children with ASD, supported by findings of increasing EF deficits with age (Rosenthal et al., 2013), sex-dependent EF deficits (Bolte et al., 2011; Lemon et al., 2011; Lehnhardt et al., 2016) and a relationship between EF and intelligence (Diamond, 2013; Blijd-Hoogewys et al., 2014; Rommelse et al., 2015). Thus, the degree to which IQ, sex, and age influence the relationship between EF and social problems needs to be clarified.

There is an increased interest in possible sex differences in ASD, since there might be important biological and behavioral differences between girls and boys with ASD (Halladay et al., 2015; Lai et al., 2015). However, there are inconsistent findings regarding differences in the composition of EF difficulties in girls and boys with ASD. Some have found that girls have more EF difficulties with inhibition (Lemon et al., 2011), while others reported that girls outperform boys on EF tasks related to processing speed and verbal fluency (Bolte et al., 2011; Lehnhardt et al., 2016). White et al. (2017) showed that girls with ASD have more problems related to everyday EF than boys. To the best of our knowledge, there are no studies of possible sex differences in the relationship between everyday EF deficits and social function related to ASD symptoms. However, Bolte et al. (2011) found a correlation between EF difficulties on performance-based tasks and stereotypical behaviors in boys.

A review showed that EF continues to develop throughout childhood in typically developing children, reaches adult-like levels in mid-adolescence, and that the different EF components vary in their developmental trajectories (Best and Miller, 2010). Van den Bergh et al. (2014) found that inhibition problems in everyday life were more pronounced in young ASD children, and that planning problems were more evident in the oldest group. Contrary to their expectations, they did not find a relationship between ASD severity and EF (Van den Bergh et al., 2014). In a longitudinal study by Andersen et al. (2015a), maturation of inhibition and cognitive flexibility from childhood to adolescence was found in ASD, even though the ASD group was more impaired in EF than typically developing (TD) children. They concluded that there may be a delayed development of EF in ASD, and suggested a possible developmental arrest for working memory (Andersen et al., 2015b). Rosenthal et al. (2013), on the other hand, showed that there were widening divergences with age between children with ASD and TD children on EF tasks in everyday life, especially in metacognitive executive abilities. They also reported significant and quite stable problems with flexibility in ASD (Rosenthal et al., 2013).

The aim of the present study was to explore the relationship between social functioning measured with the SRS (total score and the subscales Social Awareness, Social Cognition, Social Communication, Social Motivation and Autistic Mannerisms) and everyday EF measured with the BRIEF (the indexes BRI and MI) in a clinical sample of children with ASD. We also investigated potential sex differences by comparing girls and boys, and possible age differences by splitting the sample at 12 years of age. We hypothesized that there is a significant positive association between parental reports of SRS scores and BRIEF scores in children with ASD, controlling for age, IQ and sex. Furthermore, we hypothesized that both BRI and MI from the BRIEF are significant predictors of the SRS total score.

# **METHODS**

# **Participants**

The study was part of the national BUPgen network, recruiting patients from Norwegian health services specializing in the assessment of ASD and other neurodevelopmental disorders. The current sample comprised 86 children with ASD, recruited between 2013 and 2016 and assessed at age 6–18 years. Thirteen of the children (15.1%) had childhood autism, one had atypical autism (1.2%), 41 (47.7%) had Asperger syndrome and 31 (36%) had unspecified pervasive developmental disorder (PDD-NOS) (**Table 1**).

The male:female ratio was 2.7:1. In total, 42 children were diagnosed with at least one comorbid disorder. Attention deficit/hyperactivity disorder (ADHD) was the most common comorbid diagnosis, and 28 participants (32.6%) had an ADHD diagnosis in combination with their ASD diagnosis. Furthermore, we divided our sample into girls and boys, and into two age groups split at 12 years (6–12 years and 13–18 years). Our sample included 23 girls and 63 boys. There were no significant differences between girls and boys on age, IQ or proportion of comorbid ADHD. More boys had a diagnosis of childhood autism, but there were no significant differences between the sexes in the distribution of the Asperger syndrome or PDD-NOS diagnoses. There were 36 children in the youngest age group and 50 adolescents in the oldest group. In these two age groups there were no significant differences in sex distribution, IQ, proportion of ADHD or ASD diagnoses (**Table 1**). All participants had an IQ within the normal range based on a standardized Wechsler test (Total IQ ≥ 70) and spoke Norwegian fluently. The participants' total IQ fell within 2 standard deviations of the average normal score when including confidence intervals. Exclusion criteria were significant sensory losses (vision and/or hearing).
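The sample composition reported above can be checked with simple arithmetic. The following is a minimal sketch that uses only the counts quoted in the text; the variable and function names are illustrative and not part of the study.

```python
# Counts taken from the Participants text; names are illustrative only.
TOTAL = 86
diagnoses = {
    "childhood autism": 13,
    "atypical autism": 1,
    "Asperger syndrome": 41,
    "PDD-NOS": 31,
}
girls, boys = 23, 63
adhd = 28


def pct(count, total=TOTAL):
    """Return count as a percentage of total, rounded to one decimal."""
    return round(100 * count / total, 1)


diagnosis_pct = {name: pct(n) for name, n in diagnoses.items()}
sex_ratio = round(boys / girls, 1)  # boys per girl
adhd_pct = pct(adhd)

print(diagnosis_pct)
print(f"male:female = {sex_ratio}:1, comorbid ADHD = {adhd_pct}%")
```

The diagnosis counts sum to the full sample of 86, and the rounded percentages agree with those given in the text.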

# **Clinical Assessment**

The children were assessed by a team of experienced clinicians (clinical psychologists and/or child psychiatrists). Diagnostic conclusions were best-estimate clinical diagnoses derived from tests, interview results and observations. All diagnoses were based on the International Statistical Classification of Diseases and Related Health Problems 10th Revision (ICD-10) (World Health Organization, 1992) criteria, and the autistic symptoms were evaluated using the Autism Diagnostic Observation Schedule (ADOS) (Lord et al., 2000) and/or Autism Diagnostic Interview-Revised (ADI-R) (Rutter et al., 2003b) and/or the Social Communication Questionnaire (SCQ) (Rutter et al., 2003a). In addition, the assessment included a full medical and developmental history, physical examination, and IQ assessment. The current study included children with ASD and comorbid ADHD. Results from recent studies indicate that neuropsychiatric disorders overlap with respect to both symptoms and causes (Moreno-De-Luca et al., 2013), and it has been recognized by both clinicians and researchers for some time that ASD and ADHD often co-occur (Yerys et al., 2009). In the DSM-5, other behavioral diagnoses may accompany a diagnosis of ASD, for example ADHD (American Psychiatric Association, 2013).

**TABLE 1 |** Child characteristics.

\*p < 0.01.

n/a, Not Applicable.

aChi-square.

bPDD-NOS, Pervasive developmental disorder unspecified.

cADHD, Attention deficit/hyperactivity disorder.

# **Measures**

# **Social Function**

The parent version of the SRS (developed for the age group 4–18 years) (Constantino and Gruber, 2005) was used to identify social communication difficulties. The SRS is used in screening and/or as an aid to a clinical diagnosis of ASD and comprises 65 questions, rated on a 4-point Likert scale. In addition to a total score, the SRS consists of five subscales: Social Awareness, Social Cognition, Social Communication, Social Motivation and Autistic Mannerisms. The SRS has been translated into Norwegian (Ørbeck, 2009). Internal consistency is high, both in population-based samples and clinical samples (Cronbach's alpha = 0.93–0.97) (Constantino and Gruber, 2005). The reliability and validity of the SRS have proven satisfactory in both population-based and clinical samples in Europe and in the USA (Bolte et al., 2008; Wigham et al., 2012), and SRS scores correlate well with Autism Diagnostic Interview-Revised (ADI-R) scores (Constantino et al., 2003). *T-scores* of ≥76 are strongly associated with a clinical diagnosis of Autistic Disorder, Asperger's Disorder, or more severe cases of pervasive developmental disorder not otherwise specified (PDD-NOS). *T-scores* of 60–75 represent mild to moderate deficits in reciprocal social behavior that are clinically significant, resulting in mild to moderate interference in everyday social interactions.

# **Executive Function (EF)**

To assess EF, the parents completed the parent version of the BRIEF (Gioia et al., 2000). The BRIEF for children and adolescents aged 5–18 years includes 86-item parent and teacher forms that allow professionals to assess everyday EFs in the home and school environments (Gioia et al., 2000). The BRIEF contains eight clinical scales that are grouped into a Behavioral Regulation Index (BRI): Inhibit, Shift and Emotional Control, and a Metacognition Index (MI): Initiate, Working Memory, Plan/Organize, Organization of Materials and Monitor. *T-scores* of ≥65 are considered to represent clinically significant areas. The Global Executive Composite (GEC) is a summary score that incorporates all eight clinical scales. The GEC has high reliability in both standardized and clinical samples (Cronbach's alpha = 0.80–0.98). The current study used the Norwegian version of the parent rating form, which has high internal consistency (Cronbach's alpha = 0.76–0.92) (Fallmyr and Egeland, 2011). Similar levels are reported for the English version (Cronbach's alpha = 0.80–0.98) (Gioia et al., 2000).
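The *T-score* cut-offs quoted for the SRS and the BRIEF can be written as two small helper functions. This is only an illustrative sketch: the function names and the label strings are assumptions, and the text gives no label for SRS *T-scores* below 60 or for values between 75 and 76, so the handling of those cases is also an assumption.

```python
def classify_srs(t_score):
    """Label an SRS T-score using the cut-offs quoted in the text."""
    if t_score >= 76:
        return "strongly associated with a clinical ASD diagnosis"
    if t_score >= 60:
        # Assumption: non-integer values between 75 and 76 fall in this band.
        return "mild to moderate deficits"
    # Assumption: the text does not describe scores below 60.
    return "below the ranges described"


def brief_clinically_significant(t_score):
    """Return True when a BRIEF T-score is at or above the cut-off of 65."""
    return t_score >= 65


# Hypothetical example values, not taken from the study data.
for score in (58, 63, 80):
    print(score, classify_srs(score), brief_clinically_significant(score))
```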

# **Intelligence Quotient (IQ)**

IQ was assessed using age-appropriate Wechsler tests of intelligence (Wechsler, 2002, 2003, 2008).

# **Statistical Analyses**

Data analyses were conducted using the statistical package IBM SPSS Statistics for Windows, version 23.0 (SPSS, Inc., Chicago, Illinois). Descriptive analyses and bivariate correlations were conducted. Independent-samples *t*-tests were used to compare means between girls and boys, and between the two age groups. Chi-square tests for crosstabs were used to investigate differences in the distribution of autism diagnoses and comorbid ADHD. The differences between the subgroups' correlation coefficients were calculated using http://vassarstats.net/rdiff.html (two-tailed). Because of small subgroup sizes and no significant differences between the correlation coefficients, the regression analyses were done on the total sample, and sex and age were incorporated as independent variables. The assumptions of linearity, multicollinearity, independence of errors, homoscedasticity, unusual points and normality of residuals were met. Separate multiple regression analyses were conducted to explain the variance in SRS scores (total and subscales) from age, sex, total IQ and BRIEF (BRI and MI indexes). Bonferroni correction was used to adjust the significance level for multiple comparisons in all the analyses in this paper. The *p* < 0.01 level (0.05/5 = 0.01) was used in all the regression analyses. The *p* < 0.003 level was used for the correlation analyses (0.05/18 ≈ 0.003). For all the other analyses (*t*-tests, Chi-square and difference between correlation coefficients) we used *p* < 0.01.
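The Bonferroni thresholds described above follow from dividing the family-wise alpha by the number of comparisons. A minimal sketch of that arithmetic is given below; the function name is illustrative, and the numbers of comparisons (5 and 18) are those stated in the text.

```python
def bonferroni_alpha(family_alpha, n_tests):
    """Per-test significance level under a Bonferroni correction."""
    return family_alpha / n_tests


regression_alpha = bonferroni_alpha(0.05, 5)    # regression analyses
correlation_alpha = bonferroni_alpha(0.05, 18)  # correlation analyses

print(f"regression threshold:  p < {regression_alpha:.3f}")
print(f"correlation threshold: p < {correlation_alpha:.4f} "
      f"(rounded to {correlation_alpha:.3f})")
```

Note that 0.05/18 is slightly below 0.003, so the reported correlation threshold is the rounded value.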

# **Ethical Considerations**

The study was approved by the Regional Ethical Committee and the Norwegian Data Inspectorate (REK #2012/1967), and was conducted in accordance with the Helsinki Declaration of the World Medical Association Assembly. The study is based on tests that are included in regular clinical assessment. The patient, the family and the professional network around them were offered information about the test results, the diagnostic process and recommended interventions. Written informed consent was obtained from all the individual participants included in the study.

# **RESULTS**

# **SRS and BRIEF Scores**

The mean SRS total score was in the severe range (*T-score* = 78.5). The highest mean score was on the Autistic Mannerisms subscale (*T-score* = 80.3). The lowest mean score was on the Social Awareness subscale (*T-score* = 65.3). Means and standard deviations for SRS total score and treatment subscales are shown in **Table 2**.

The results from the BRIEF were in the clinically significant range on the indexes BRI (*T-score* = 68.5) and GEC (*T-score* = 67.1). The mean MI score from the BRIEF was *T-score* = 64.4. The highest mean score was found on the subscale Shift (*T-score* = 72.2), and the lowest mean score was found on the subscale Organization of Materials (*T-score* = 54.7). All other subscales had *T-score* averages between 62 and 67 (**Table 2**).

There were no significant differences between girls and boys on BRIEF and SRS scores. However, there was a strong tendency for girls to have higher scores (more problems) on SRS total (*p* = 0.013) and Social Cognition (*p* = 0.013), and a tendency for higher scores on the subscales Social Communication (*p* = 0.032) and Plan/Organize (*p* = 0.035). There were no significant differences between the two age groups on BRIEF and SRS scores. However, there was a tendency for the youngest group to have higher scores (more problems) on the Social Awareness subscale from the SRS (*p* = 0.036).

# **The Relationship between the SRS and the BRIEF**

There was a statistically significant (*p* < 0.001–0.005) relationship between all the subscales on the SRS and the BRIEF indexes, except the SRS subscale Social Motivation and the BRIEF index BRI. The proportion of shared variance (*r*<sup>2</sup>) varied between 6 and 37%. The strongest positive correlation was found between the SRS total score and the BRIEF index scores GEC (*r* = 0.62, *p* < 0.001) and MI (*r* = 0.60, *p* < 0.001). There was also a moderate positive correlation between the SRS total score and BRI (*r* = 0.48, *p* < 0.001). For Pearson's correlation coefficients, see **Table 3A**.
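The proportion of shared variance is the square of the Pearson correlation coefficient. The sketch below applies this to the three coefficients quoted above; the function name is illustrative. Because the coefficients are reported to two decimals, values computed from them may differ by about one percentage point from the range given in the text.

```python
def shared_variance_pct(r):
    """Proportion of shared variance (r squared) expressed as a percentage."""
    return 100 * r ** 2


# Correlations between SRS total and the BRIEF indexes, as quoted in the text.
reported = [("GEC", 0.62), ("MI", 0.60), ("BRI", 0.48)]

for label, r in reported:
    print(f"SRS total vs {label}: r = {r:.2f}, "
          f"shared variance = {shared_variance_pct(r):.0f}%")
```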

Girls generally had a stronger relationship between the SRS total and the BRIEF index scores than boys (see **Table 3B**). However, the differences between the correlation coefficients for girls and boys were not significant, calculated using http://vassarstats.net/rdiff.html (two-tailed) (*p* = 0.197–0.390). For girls there was a strong and significant relationship between the SRS total and all the BRIEF indexes. The proportion of shared variance (*r*<sup>2</sup>) for the correlations for girls was 41–58%. For boys there was a moderate to strong relationship between the SRS total and the BRIEF index scores, and all the correlations were significant. The proportion of shared variance (*r*<sup>2</sup>) for the correlations for boys was 24–34%.

For the youngest age group (6–12 years) there was a strong relationship between the SRS total and the BRIEF index scores, and all the relationships were significant. For the oldest group (13–18 years) the relationship between BRI and SRS total was not significant. However, the MI and GEC showed significant relationships to the SRS total. The proportion of shared variance (*r*<sup>2</sup>) for the correlations was 45–61% for the youngest group and 12–23% for the oldest group (see **Table 3B**). The differences between the correlation coefficients for the youngest and the oldest group were not significant, but showed strong tendencies in the relationship between SRS total and GEC (*p* = 0.021), between SRS total and MI (*p* = 0.024) and between SRS total and BRI (*p* = 0.044).
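Differences between correlation coefficients from two independent groups are commonly tested with Fisher's r-to-z transformation. The sketch below assumes that the online calculator cited in the text implements this standard test; the function name is illustrative, the correlations used in the example are hypothetical, and only the group sizes (36 and 50) come from the text.

```python
import math


def fisher_r_to_z_test(r1, n1, r2, n2):
    """Two-tailed test for the difference between two independent correlations.

    Returns the z statistic and the two-tailed p value.
    """
    z1, z2 = math.atanh(r1), math.atanh(r2)       # Fisher transformation
    se = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))   # standard error of z1 - z2
    z = (z1 - z2) / se
    p = math.erfc(abs(z) / math.sqrt(2))          # two-tailed normal p value
    return z, p


# Hypothetical correlations; group sizes as reported for the two age groups.
z, p = fisher_r_to_z_test(0.70, 36, 0.40, 50)
print(f"z = {z:.2f}, two-tailed p = {p:.3f}")
```

With subgroups of this size the standard error is fairly large, so even sizeable differences between coefficients may not reach a corrected significance level.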

<sup>\*</sup>p < 0.01.

aIndependent t-tests were conducted for comparisons between girls and boys, and between the age groups 6–12 years and 13–18 years.

Elevated SRS T-scores indicate a high degree of impairment. T-scores of 76 or higher are strongly associated with a clinical diagnosis of ASD. T-scores of 60–75 indicate deficiencies in reciprocal social behavior that are clinically significant and are resulting in mild to moderate interference in everyday social interactions.

Elevated BRIEF T-scores indicate a higher degree of impairment, with T-scores of 65 and above considered to represent clinically significant areas.

**TABLE 3A |** Associations between social function (SRS) and executive function (BRIEF) assessed with questionnaires (N = 86).

\*Significance after correction for multiple testing is set to p < 0.003 (2-tailed).

BRIEF, Behavior Rating Inventory of Executive Function.

A multiple regression analysis was conducted to identify the relations between SRS total score and age, sex, total IQ, BRIEF BRI and BRIEF MI. This model explained a statistically significant proportion of the variance in SRS total, *F*(5, 80) = 12.57, *p* < 0.001, *R*<sup>2</sup> = 0.440. Only BRIEF MI had a significant independent contribution to the prediction, *p* = 0.001. This result remained when the children with comorbid ADHD were removed from the analysis. For the children with ASD without ADHD (*n* = 58), the regression model with age, sex, total IQ, BRIEF BRI, and BRIEF MI significantly explained SRS total, *F*(5, 52) = 10.31, *p* < 0.001, *R*<sup>2</sup> = 0.498. Only BRIEF MI had a significant independent contribution to the prediction, *p* = 0.003. Regression analyses were also done with each SRS subscale as the dependent variable in the total sample (*n* = 86), and all the regression models were significant (*p* < 0.001). For the subscales Social Communication, Social Motivation and Autistic Mannerisms, only MI from the BRIEF had a significant independent contribution to the predictions. For the subscale Social Awareness, only age had a significant independent contribution to the prediction (*p* = 0.001). None of the independent variables made an independent contribution on the subscale Social Cognition, but both sex and MI showed strong tendencies (*p* = 0.012 and 0.016). The details are described in **Table 4**.
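The reported *F* statistics for the two SRS total models can be cross-checked against the reported *R*<sup>2</sup> values, sample sizes and number of predictors. The sketch below uses the standard relation between *R*<sup>2</sup> and the overall *F* test in multiple regression; the function name is illustrative, and small discrepancies are expected because *R*<sup>2</sup> is reported to three decimals.

```python
def f_from_r2(r2, n, k):
    """Overall F statistic and degrees of freedom for a regression model.

    r2: coefficient of determination, n: sample size, k: number of predictors.
    """
    df1 = k
    df2 = n - k - 1
    f = (r2 / df1) / ((1 - r2) / df2)
    return f, df1, df2


# Values as reported in the text: total sample and the subsample without ADHD.
f_total, df1_total, df2_total = f_from_r2(0.440, 86, 5)
f_no_adhd, df1_no_adhd, df2_no_adhd = f_from_r2(0.498, 58, 5)

print(f"total sample: F({df1_total}, {df2_total}) = {f_total:.2f}")
print(f"without ADHD: F({df1_no_adhd}, {df2_no_adhd}) = {f_no_adhd:.2f}")
```

Both recomputed values agree with the reported statistics to within rounding of *R*<sup>2</sup>, and the degrees of freedom match those given in the text.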

# **DISCUSSION**

# **The Importance of the Metacognition Index**

The main finding of the present study was that the metacognitive component of EF (MI) was the most important factor in explaining social function in children with ASD. We hypothesized that both BRI and MI from the BRIEF would predict SRS scores in children with ASD. However, despite high BRI scores, the MI explained more of the social dysfunctions measured with SRS in children with ASD. This is an interesting finding, since other studies have shown that behavior regulation is closely linked to social function (Kenworthy et al., 2009), and that metacognitive competence is of more importance for school performance (Carretti et al., 2014). MI is composed of subdomains like initiating, working memory, organizing and monitoring. Difficulties in these areas are probably easier for parents, teachers and clinicians to overlook in everyday life than difficulties with subdomains within behavior regulation like inhibition, flexibility, and emotional control. Therefore, it is important to highlight that MI is of importance for social function. The current findings provide new knowledge about the relationship between EF and the various domains of social competence in children with ASD, using parent-rated measures.

**TABLE 3B |** Associations between social function (SRS) and executive function (BRIEF) assessed with questionnaires for the subgroups girls and boys, and the age groups 6–12 years and 13–18 years.

\*Significance after correction for multiple testing is set to p < 0.003 (2-tailed).

SRS, Social Responsiveness Scale.

BRIEF, Behavior Rating Inventory of Executive Function.

There are few studies regarding the relationship between EF in everyday life settings and social function in children with ASD. The current findings are in line with Leung et al. (2015), who showed that the MI from the BRIEF plays a role for social function in children with ASD. However, in contrast to the finding from Leung et al. (2015), our study did not find a statistically significant contribution of the BRI to social function. There is no obvious reason for these different findings, and the participants in the two studies share many of the same characteristics. However, some differences in the recruitment process or sample size may explain the different results, and further studies are needed to clarify this question. Our study contained a larger number of participants and had a more conservative significance level than the study by Leung et al. (2015), and a proportion of our participants also had comorbid ADHD; these factors may explain the different results. However, the results from both studies emphasize the importance of the relationships between MI and social function in children and adolescents with ASD. Kenworthy et al. (2009), on the other hand, found that EF such as BRI predicted only the communication symptoms and not the reciprocal social interaction in children with ASD. However, they used a composite score based on ADI and ADOS scores to measure social function, and not the SRS (Kenworthy et al., 2009). Landa and Goldberg (2005) found no relation between the neuropsychological assessments of EF and the social function in ASD children. The difference from our findings may be a result of biases in parent-rated approaches (e.g., an inclination to score "favorable" or "unfavorable" to items, regardless of the specific content), but it is also reasonable to assume that a parent-rated design may uncover some relevant aspects of everyday function not accessible to the controlled setting of standardized assessments.

We did not find any significant differences between girls and boys in our sample. However, girls tended to have higher scores, especially on SRS total and Social Cognition, which might imply that girls have more social problems than boys in our sample. Contrary to White et al. (2017), the girls in our sample did not have significantly higher scores on the BRIEF than the boys. Others have found that females with ASD have more EF impairments compared to males (Lemon et al., 2011), while Lehnhardt et al. (2016) and Bolte et al. (2011) found evidence for higher EF functioning for females with ASD than for males. Sex did not have an independent impact in our regression models, but showed strong tendencies toward significance on the SRS total score and the subscale Social Cognition, where girls had more problems than boys in our sample. None of the studies mentioned above investigated the relationship between EF and social function in everyday life settings, but these findings underline the importance of being aware of possible sex differences in ASD.

Even though we did not find any significant differences between the two age groups, children (6–12 years) and adolescents (13–18 years), there were some interesting tendencies. The BRIEF scores in our sample were quite similar in the two age groups, which is in contrast to Rosenthal et al. (2013), who found greater EF problems, especially in metacognition, for older children with ASD. Our result, that the relationship between EF difficulties and social dysfunction was strongest in the youngest group, might be due to young children having more generalized difficulties in both social function and EF than older children. However, we found that in the oldest group there was a significant relationship between social function and metacognition, but not between social function and behavior regulation. This might imply that metacognitive EF is of importance for social function in older age, and that behavior regulation does not have the same impact on social function. In our regression analyses we showed that age significantly predicted the ability to perceive social cues (Social Awareness), where the youngest group had more problems than the oldest group.

Intelligence (IQ level) did not have any significant impact on the relationship between EF and social function in our analysis. The relationship between measurements of EF and IQ, especially fluid IQ, has been the subject of a longstanding discussion. Several studies have found that much of the variance in EF performance can be explained by IQ level (Joseph and Tager-Flusberg, 2004; Friedman et al., 2006; Diamond, 2013). However, there are indications that the relationship between EF and IQ is different among individuals with ASD, compared with typically developing (TD) individuals and other patient groups. Merchan-Naranjo et al. (2016) found that EF was affected, but did not correlate with IQ in children and adolescents with ASD without intellectual disability. Rommelse et al. (2015) even found that participants with above average intelligence performed relatively more poorly on some EF tasks compared to IQ-matched controls. One possible explanation for the different relationship between EF and IQ in individuals with ASD might be the role of speed of information processing. This ability can be intact in ASD, and not correlated to IQ. However, speed of information processing is normally closely linked to the general factor (g-factor) of intelligence (Scheuffgen et al., 2000; Wallace et al., 2009). The g-factor was established by Spearman, who discovered that most cognitive tests are positively correlated with each other, regardless of which cognitive domain the individual test measures (Colom et al., 2006). A review by Sheppard and Vernon (2008) concluded that measures of mental speed and information processing are significantly correlated with measures of intelligence in nonclinical samples. This highlights the importance of incorporating IQ as a covariate in the relationship between EF and social function.

Earlier research has shown that TD children have a different relationship between EF and social function than children with ASD. In TD children, the BRI has an impact on their social function (Leung et al., 2015). Furthermore, there is evidence suggesting that there are mutual interactions between EF and social function for both TD and ASD groups (Moriguchi, 2014; Leung et al., 2015). From studies of TD children, we know that social interaction may facilitate the development of EF. In particular, maternal scaffolding, modeling and imitation have been shown to predict the development of EF (Moriguchi, 2014). It is also likely that EF facilitates the development of cognitive skills that are important for social interaction (van Lier and Deater-Deckard, 2016). Subcomponents of EF, such as inhibition, flexibility and monitoring, are thought to influence the ability to engage in positive social interactions (van Lier and Deater-Deckard, 2016). White (2013) offers a different interpretation of the difficulties that individuals with ASD experience, especially on unstructured EF tasks. She argues that poor results on these tasks are caused by difficulties in forming an implicit understanding of the experimenter's expectations for the task. In this view, implicit information is less available to individuals with ASD due to their mentalizing difficulties, and this leads to poor performance, more than difficulties with EF *per se*.

# **Potential Implication for Development of Clinical Interventions**

Our finding that subdomains of social function have a different relation to EF may be relevant for stratifying the treatment for children with ASD. Interventions that target metacognitive skills have, in earlier studies, been shown to improve the social abilities of children and adolescents with ASD (Kenworthy et al., 2014; Leung et al., 2015). However, as social ability is a broad concept comprising different abilities and motivational factors, more studies are needed to identify which parts of social function may profit from metacognitive interventions. Our study indicates that the ability to perceive social cues and social motivation possibly have a weaker relationship to EF than the other aspects of social functioning. This might imply that children with problems related to social communication, autistic mannerisms and social cognition might benefit more from intervention programs designed to enhance EF in general, and metacognitive function in particular. Individuals who primarily have problems with the ability to pick up on social cues and with social motivation might be less likely to benefit from such intervention programs. Taken together, it is possible to hypothesize that children with difficulties with understanding social cues (social awareness) might benefit more from classic social skills training programs (Chang et al., 2014; Otero et al., 2015). For children with problems related to social motivation it might be important to build on the child's areas of interest to enhance their motivation (Dunst et al., 2011). In all cases, we need to assess the individual child's cognitive and social profile and then tailor interventions to fit the child. Further studies are needed to examine the clinical implications of these findings.

More generally, the finding that there is an important relationship between EF and social function gives support to the hypothesis that an intervention that has an impact on cognitive abilities and EF is also likely to have an influence on the social skills of children and adolescents with ASD (McGovern and Sigman, 2005; Wang et al., 2017).

# **Strengths and Limitations of the Study**

The strengths of the study include a reasonable sample size of clinically well-defined children and adolescents with ASD, with a moderate sample of females. Both the SRS and the BRIEF are parental ratings, and this might bias the findings. However, we considered that parents observe their children when they are engaged in a range of everyday activities and that their observations add valuable information. Particularly since it is known that difficulties may not be observable in a laboratory setting or structured clinical settings, informant measures might have higher ecological validity (Kenworthy et al., 2008). At the same time, parent ratings are likely to be affected by other factors such as IQ, emotional/behavioral problems and comorbidities in the children (Aldridge et al., 2012; Cholemkery et al., 2014; Havdahl et al., 2016). However, diagnostic tools like the ADOS and the ADI-R are also influenced by these kinds of confounding factors (Havdahl et al., 2016). Some of the children in our study were medicated, and several had a comorbid disorder. It is known from the literature that comorbidity is common in ASD, and that as many as 70% have a comorbid disorder (Simonoff et al., 2008; Lai et al., 2014; Gjevik et al., 2015). The main finding, that metacognitive aspects of EF are significantly related to social function, remained significant after removing the children with comorbid ADHD from the analysis. It can be argued that it is important to study individuals with ASD and comorbid ADHD because this is a common comorbidity. A recent meta-analysis of EF in children and adolescents with high-functioning ASD was the first to take the confounding effect of ADHD comorbidity into account. It confirmed the presence of executive dysfunction in this group, and found that these deficits were not solely accounted for by the effect of comorbid ADHD or general cognitive abilities (Lai et al., 2016).

Furthermore, language deficits are important factors that affect social communication/function in a negative way in ASD. We did not perform a comprehensive assessment of different language functions in our participants, but some studies have found that EF is not related to language ability in verbal, school-aged children with ASD (Joseph et al., 2005). Since our participants were school-aged children within the normal IQ range, we assume that our finding is valid, even without a comprehensive language assessment. The current sample included a wide age range and we therefore controlled for this in the regression analyses. Despite this, the age range might be a methodological limitation due to the correlation with social awareness, and future studies might benefit from focusing on more restricted age groups.

Our study did not clarify in which direction EF and social function affect each other, but it suggests that there is a strong relation between the two. The current findings should be replicated in independent samples and combined with objective measurements, such as neuropsychological or neurophysiological examination. These were not available in the current sample. Thus, future research should combine laboratory and informant-based measures for a more in-depth investigation of the link between EF and social function, as the two measures are complementary to each other (Leung et al., 2015). To fully understand the relationship between EF and social functioning, studies with longitudinal designs are needed to provide more specific detail about functional developmental change at different ages (Taylor et al., 2015). Furthermore, most research within the ASD field is conducted on male samples, but some evidence suggests that females exhibit greater cognitive impairments than their male counterparts (Lemon et al., 2011; White et al., 2017). In our analysis, we found a tendency for girls to have higher SRS scores, and a stronger relationship between social function and EF than boys. For future research, it is important to investigate whether the relationship between EF and social function is moderated by sex.

# **CONCLUSION**

We report a relationship between parental reports of EF and social function in an everyday setting in children with ASD. We found that the metacognitive domain of EF has a significant association with many aspects of social function. These results may have implications for understanding the cognitive components in the social deficits that define ASD. Further studies are needed to clarify whether children with ASD will improve their social function through intervention programs designed to enhance EF in general and metacognitive function in particular.

# **AUTHOR CONTRIBUTIONS**

Conceived and designed the study: TT, TN, MØ, and OA. Performed the study: TT, TN, and NS. Analyzed the data: TT and TN. Interpreted the data, wrote the paper, and approved the version to be published: TT, TN, MØ, NS, and OA.

# **ACKNOWLEDGMENTS**

We are thankful to the BUPgen partners and to all the participants. The study is part of the BUPgen Study group and the research network NeuroDevelop. The project was supported by the National Research Council of Norway (Grant #213694) and the South-Eastern Norway Regional Health Authority funds the Regional Research Network NeuroDevelop (Grant #39763). The corresponding author has a research grant from Vestre Viken Hospital Trust (Grant #6903002).

**REFERENCES**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2018 Torske, Nærland, Øie, Stenberg and Andreassen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Automatic inhibition and habitual control: alternative views in neuroscience research on response inhibition and inhibitory control

# *Agnes J. Jasinska\**

*National Institute on Drug Abuse, Intramural Research Program, Neuroimaging Research Branch, Baltimore, MD, USA \*Correspondence: jasinskaaj@mail.nih.gov*

*Edited by:*

*Lynne A. Barker, Sheffield Hallam University, UK*

Decades of theory and experimental evidence underscore the critical importance of inhibitory functions to flexible, context-appropriate and goal-directed action (Diamond et al., 1963; Logan and Cowan, 1984; Aron et al., 2004; Friedman and Miyake, 2004; Ridderinkhof et al., 2004; Munakata et al., 2011). Deficits in inhibitory control and response inhibition have been implicated in substance use disorders, attention-deficit/hyperactivity disorder, and impulse control disorders (Jentsch and Taylor, 1999; Nigg, 2001; Li and Sinha, 2008; Groman et al., 2009), and may also play a role in depression and anxiety (Disner et al., 2011; Jovanovic and Norrholm, 2011). But despite exciting progress in this area of cognitive and clinical neuroscience, some fundamental questions remain unsolved, and may be hindering the translational efforts aimed at improving the treatment and prevention of disorders characterized by impairment or dysregulation of inhibitory functions.

The purpose of this Opinion is to highlight some alternative ideas and approaches in neuroscience research on response inhibition and inhibitory control that have emerged in the literature, in order to stimulate debate and suggest hypotheses for future research. In particular, building on recent neuroscience-based accounts of response inhibition (e.g., Mostofsky and Simmonds, 2008; van Gaal et al., 2008; Munakata et al., 2011), I argue *against* the long-standing and pervasive view that there is a fundamental distinction between inhibitory processes on the one hand and response selection and execution processes on the other hand. Although intuitive, this view does not appear to be supported by evidence. Instead, until there is evidence to the contrary, I propose a more parsimonious view of inhibitory control mechanisms in the brain, such that: (1) response inhibition can be either a control process (to override a prepotent response tendency) or it can itself be a prepotent response tendency (to be overridden); (2) response inhibition processes and response selection and execution processes are fundamentally the same; and (3) we learn to inhibit a response in fundamentally the same way that we learn to select and execute a response.

# **PROPOSITION 1: RESPONSE INHIBITION CAN BE EITHER A CONTROL PROCESS (TO OVERRIDE A PREPOTENT RESPONSE TENDENCY) OR A PREPOTENT RESPONSE TENDENCY (TO BE OVERRIDDEN)**

A long-standing assumption has been that response inhibition processes, and inhibitory control more generally, are fundamentally different from response selection and execution processes. This assumption in part derives from the centuries-old and highly influential tradition of dichotomizing between the top-down, voluntary and deliberate control processes, and the bottom-up, stimulus-driven and automatic response tendencies that often need to be overridden. In fact, a typical experimental paradigm used to examine inhibitory control (such as Go/NoGo and Stop-Signal tasks) requires a deliberate, effortful inhibition of a prepotent, automatic response, creating the illusion that inhibitory control is always deliberate and effortful whereas the response to be inhibited is always prepotent and automatic. But one can think of conditions and situations in which this relationship is reversed, e.g., a patient with social anxiety may have to intentionally and effortfully overcome his or her prepotent tendency to refrain from public speaking, and many of us would face a similar challenge if asked to get up on the stage and sing.

Therefore, I argue that response inhibition is not a control process *by default*; instead, it may function as either a control process (to override a prepotent response tendency) or as a prepotent response tendency itself (to be overridden), depending on the situational demands and/or the relative strengths of the two competing goal representations. I expand on the idea of inhibitory goal representation in Proposition 2, and I discuss the notion of automatic inhibition in Proposition 3.

# **PROPOSITION 2: THE BRAIN PROCESSES MEDIATING RESPONSE INHIBITION ARE FUNDAMENTALLY THE SAME AS THE BRAIN PROCESSES MEDIATING RESPONSE SELECTION AND EXECUTION**

Evidence from human functional MRI (fMRI) studies using Go/NoGo, Stop-Signal, and similar tasks suggests that inhibition of a motor response engages the same network (or networks) of brain regions that are engaged during selection and execution of this motor response. In particular, response inhibition processes engage a distributed network of both cortical and subcortical regions, including the inferior frontal gyrus (IFG; or inferior frontal cortex, IFC), insula, anterior cingulate cortex (ACC), pre-supplementary motor area (pre-SMA), dorsolateral prefrontal cortex (DLPFC), parietal regions, and basal ganglia [e.g., (Rubia et al., 2001; Garavan et al., 2002; Aron and Poldrack, 2006); for meta-analyses, see (Wager et al., 2005; Simmonds et al., 2008; Swick et al., 2011; Criaud and Boulinguez, 2013; Hart et al., 2013)]. For instance, there is compelling neuroimaging and lesion evidence of the critical role of the IFG in inhibitory control (Konishi et al., 1999; Aron et al., 2003; Rubia et al., 2003). Yet, as demonstrated by a recent fMRI study in over 1800 subjects (Whelan et al., 2012), a frontal network centered on the IFG is engaged both during successful inhibition trials and during failed inhibition trials (i.e., when subjects erroneously executed the response). Similarly compelling is the neuroimaging evidence for the importance of the pre-SMA in inhibitory control [e.g., (Rubia et al., 2001; Mostofsky et al., 2003; Garavan et al., 2006); for meta-analyses, see (Simmonds et al., 2008; Swick et al., 2011; Criaud and Boulinguez, 2013)]. But as reviewed by Mostofsky and Simmonds (2008), the pre-SMA also plays a critical role in both response preparation and response selection.
In fact, evidence from electrophysiological recordings in non-human primates suggests that some pre-SMA neurons participate both in the suppression of the incorrect response and in the facilitation of the correct response in a saccade Go/NoGo task (Isoda and Hikosaka, 2007), suggesting that response inhibition processes overlap with response selection processes not only at the level of large-scale brain networks involved, but also at the level of individual neurons.

Similarly, at the level of synaptic transmission, response inhibition processes may not be fundamentally different from response selection and execution processes. It is sometimes assumed that response inhibition must rely on inhibitory synaptic transmission to a larger degree than response selection and execution processes. But why should it be the case? Inhibitory synaptic transmission involving the neurotransmitter gamma-aminobutyric acid (GABA) is known to be as important as excitatory glutamatergic transmission both at cortical and subcortical levels (Kandel et al., 2000). Fast inhibitory synaptic transmission is mediated primarily by ionotropic GABA<sub>A</sub> receptors, which, when activated, hyperpolarize the cell and thus raise the threshold for firing an action potential. When GABA<sub>A</sub> receptors are localized to postsynaptic glutamatergic neurons, they serve to inhibit the activity of these neurons. But GABA<sub>A</sub> receptors can also be localized to postsynaptic GABAergic neurons, in which case they may serve to disinhibit (rather than inhibit) the activity of downstream glutamatergic neurons, leading to activation rather than inhibition at the local-circuit or larger-network level. Conversely, activation of glutamatergic neurons may be required to activate a group of GABAergic neurons and trigger inhibition. Thus, successful response inhibition likely relies on reciprocal interactions between inhibitory GABAergic neurons and excitatory glutamatergic neurons, rather than on inhibitory GABAergic transmission alone.

# **PROPOSITION 3: WE LEARN TO INHIBIT A RESPONSE IN FUNDAMENTALLY THE SAME WAY THAT WE LEARN TO SELECT AND EXECUTE A RESPONSE**

We have a working model of how the brain learns to select and execute a specific response: such learning is thought to involve the formation of a distributed neural goal or task representation in the prefrontal cortex (Miller and Cohen, 2001; Sakai, 2008), by which a specific pattern of sensory input becomes progressively associated with a specific pattern of motor output via long-term potentiation (LTP) and associated synaptic-plasticity processes at glutamatergic synapses (Kandel et al., 2000). In comparison, the processes underlying inhibitory goal representations and inhibitory learning remain less well understood, including synaptic plasticity at inhibitory synapses (for a recent review, see Castillo et al., 2011).

Nevertheless, if response inhibition processes are fundamentally the same as response selection and execution processes, then it follows that the neural representation of inhibitory goals (or Stop goals) should not fundamentally differ from the neural representations of response selection and execution goals (or Go goals), and the underlying learning processes should also be fundamentally the same. In the influential horse-race models (Logan and Cowan, 1984; Verbruggen and Logan, 2009b), response inhibition in a Stop-Signal task is conceptualized as a race between a Go process triggered by a Go stimulus (i.e., a Go goal) and a Stop process triggered by a Stop stimulus (i.e., a Stop goal). In this model, the competing Go and Stop goals are regarded as equivalent, and it is the relative timing of the Go and Stop processes that determines whether the response is successfully inhibited or not. In fact, Munakata and colleagues (2011) have argued that at least some inhibitory control processes can be understood in terms of such competition between goal representations in the prefrontal cortex, and it is the relative strength of these goal representations that determines whether a behavioral response is executed or inhibited by that individual in a given situation.
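The race logic described above can be illustrated with a short simulation. The following is a minimal sketch under stated assumptions, not the formal model of Logan and Cowan (1984): Go finishing times are drawn from a normal distribution, the Stop process is given a constant latency (`ssrt_ms`), and all function names and parameter values are hypothetical choices made only for illustration.

```python
import random


def simulate_stop_trial(ssd_ms, go_mean=500.0, go_sd=100.0, ssrt_ms=220.0, rng=random):
    """Return True if the response is inhibited on one simulated stop trial.

    All default values are illustrative assumptions, not estimates from data.
    """
    go_finish = rng.gauss(go_mean, go_sd)  # finishing time of the Go process
    stop_finish = ssd_ms + ssrt_ms         # Stop process starts at the stop-signal delay
    return stop_finish < go_finish         # the process that finishes first wins the race


def p_inhibit(ssd_ms, n_trials=10_000, seed=0):
    """Estimate the probability of successful inhibition at a given stop-signal delay."""
    rng = random.Random(seed)
    wins = sum(simulate_stop_trial(ssd_ms, rng=rng) for _ in range(n_trials))
    return wins / n_trials


if __name__ == "__main__":
    for ssd in (100, 300, 500):
        print(ssd, round(p_inhibit(ssd), 3))
```

In this sketch, lengthening the stop-signal delay gives the Go process a head start, so the estimated probability of inhibition falls. The Go and Stop processes are treated symmetrically: only their relative finishing times decide the outcome, which is the sense in which the two goals are regarded as equivalent.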

Furthermore, although counterintuitive, growing evidence suggests that response inhibition processes may be stimulus-driven to the same extent that response selection and execution processes are stimulus-driven. In fact, one and the same stimulus can activate both a goal representation to carry out a behavior and a goal representation to inhibit this behavior. For instance, food-related cues may activate both a goal to consume the food and a competing goal to stay on a diet (Fishbach et al., 2003; Hare et al., 2009; Kroese et al., 2011); smoking-related cues and smoking-cessation messages may activate both a goal to smoke a cigarette and a competing goal to abstain from smoking (Brody et al., 2007; Jasinska et al., 2012); and signals of threat may activate both aggressive and fear-related behaviors (Beaver et al., 2008; Passamonti et al., 2008).

Finally, following the same logic, there is no reason why response inhibition should not become automatic, and inhibitory control habitual, with appropriate and sufficient practice. Specifically, if response inhibition can be triggered by a Stop stimulus in the same fashion that response selection and execution is triggered by a Go stimulus, then a consistent mapping between a specific Stop stimulus and the inhibition of a specific response should result in practice-related improvements and eventually automaticity (Verbruggen and Logan, 2009a; Lenartowicz et al., 2011), even if Stop stimuli are not consciously perceived [(van Gaal et al., 2008, 2009, 2010); see also (Eimer and Schlaghecken, 2003)]. Indeed, such practice-related improvements have been demonstrated in the Go/NoGo task, which relied on consistent stimulus-inhibition associations, but not in the Stop-Signal task, in which no such associations were learned [(Verbruggen and Logan, 2008); see also (Verbruggen and Logan, 2009b)]. Converging evidence of such learned stimulus-inhibition association (or *automatic inhibition*) following training in a Go/NoGo task was also demonstrated by reduced corticospinal excitability on Go trials preceded by NoGo trials, relative to Go trials preceded by Go trials (Chiu et al., 2012). Interestingly, despite a lack of such consistent stimulus-inhibition mapping, direct activation of the right IFG with transcranial direct current stimulation (tDCS) improved response inhibition in the Stop-Signal task relative to the sham condition (Jacobson et al., 2011). These findings support the view that inhibitory control can become habitual. However, if response inhibition and response execution are learned in the same manner, it follows that learning of stimulus-inhibition associations should follow the same principles of initial specificity and subsequent generalization.
These principles may determine the extent of generalization—or conversely, the limits of transfer—in training-based interventions for inhibitory control deficits.

### **CONCLUSIONS**

In this Opinion article, drawing on recent neuroscience-based accounts of inhibitory control mechanisms in the human brain (e.g., Mostofsky and Simmonds, 2008; van Gaal et al., 2008; Munakata et al., 2011), I argued that: (1) response inhibition can be either a control process (to override a prepotent response tendency) or it can itself be a prepotent response tendency (to be overridden); (2) response inhibition processes and response selection and execution processes are fundamentally the same; and (3) we learn to inhibit a response in fundamentally the same way that we learn to select and execute a response. These propositions have implications for both basic and translational neuroscience research on inhibitory control, and the goal is to stimulate debate and inspire novel hypotheses for future research, ultimately aimed at treatment and prevention of inhibitory control deficits.

### **ACKNOWLEDGMENTS**

Dr. Jasinska is supported by the NIDA-NIH Intramural Research Program.

#### **REFERENCES**


in the dynamic control of behavior: inhibition, error detection, and correction. *Neuroimage* 17, 1820–1829.


(2003). fMRI evidence that the neural basis of response inhibition is task-dependent. *Brain Res. Cogn. Brain Res.* 17, 419–430.


response inhibition while mesial prefrontal cortex is responsible for error detection. *Neuroimage* 20, 351–358.


*Received: 01 March 2013; accepted: 18 March 2013; published online: 04 April 2013.*

*Citation: Jasinska AJ (2013) Automatic inhibition and habitual control: alternative views in neuroscience research on response inhibition and inhibitory control. Front. Behav. Neurosci. 7:25. doi: 10.3389/fnbeh.2013.00025*

*Copyright © 2013 Jasinska. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.*

# Cognitive Impairment in Patients with Chronic Neuropathic or Radicular Pain: An Interaction of Pain and Age

Orla Moriarty<sup>1,2</sup>, Nancy Ruane<sup>2,3</sup>, David O'Gorman<sup>2,3</sup>, Chris H. Maharaj<sup>2,3</sup>, Caroline Mitchell<sup>2,3</sup>, Kiran M. Sarma<sup>2,4</sup>, David P. Finn<sup>1,2</sup> and Brian E. McGuire<sup>2,3,4</sup>\*

*<sup>1</sup> Pharmacology and Therapeutics, School of Medicine, National University of Ireland, Galway, Ireland, <sup>2</sup> Centre for Pain Research, National Centre for Biomedical Engineering Science, National University of Ireland, Galway, Ireland, <sup>3</sup> Division of Pain Medicine, Galway University Hospital, Galway, Ireland, <sup>4</sup> School of Psychology, National University of Ireland, Galway, Ireland*

A growing body of empirical research has confirmed an association between chronic pain and cognitive dysfunction. The aim of the present study was to determine whether cognitive function is affected in patients with a diagnosis of chronic neuropathic or radicular pain relative to healthy control participants matched by age, gender, and years of education. We also examined the interaction of pain with age in terms of cognitive performance. Some limitations of previous clinical research investigating the effects of chronic pain on cognitive function include differences in the pain and cognitive scale materials used, and the heterogeneity of patient participants, both in terms of their demographics and pathological conditions. To address these potential confounds, we have used a relatively homogenous patient group and included both experimental and statistical controls. We have also specifically investigated the interaction effect of pain and age on cognitive performance. Patients (*n* = 38) and controls (*n* = 38) were administered a battery of cognitive tests measuring IQ, spatial and verbal memory, attention, and executive function. Educational level, depressive symptoms, and state anxiety were assessed as were medication usage, caffeine, and nicotine consumption to control for possible confounding effects. Both the level of depressive symptoms and the state anxiety score were higher in chronic pain patients than in matched control participants. Chronic pain patients had a lower estimated IQ than controls, and showed impairments on measures of spatial and verbal memory. Attentional responding was altered in the patient group, possibly indicative of impaired inhibitory control. There were significant interactions between chronic pain condition and age on a number of cognitive outcome variables, such that older patients with chronic pain were more impaired than both age-matched controls and younger patients with chronic pain. 
Chronic pain did not appear to predict performance on the Wisconsin Card Sorting Task, which was used as a measure of executive function. This study supports and extends previous research indicating that chronic pain is associated with impaired memory and attention.

Perspective: Compared to healthy control participants, patients with chronic neuropathic or radicular pain showed cognitive deficits which were most pronounced in older pain patients.

Keywords: neuropathic pain, radicular pain, age, cognition, attention, memory, executive function

#### Edited by:

*Nicholas Morton, Doncaster Rotherham and South Humber NHS Foundation Trust, United Kingdom*

#### Reviewed by:

*Eleni Konsolaki, American College of Greece, Greece David Playfoot, Sheffield Hallam University, United Kingdom*

\*Correspondence: *Brian E. McGuire, brian.mcguire@nuigalway.ie*

Received: *15 July 2016* Accepted: *11 May 2017* Published: *13 June 2017*

#### Citation:

*Moriarty O, Ruane N, O'Gorman D, Maharaj CH, Mitchell C, Sarma KM, Finn DP and McGuire BE (2017) Cognitive Impairment in Patients with Chronic Neuropathic or Radicular Pain: An Interaction of Pain and Age. Front. Behav. Neurosci. 11:100. doi: 10.3389/fnbeh.2017.00100*

# INTRODUCTION

Chronic pain is a debilitating condition associated with biopsychosocial consequences. Subjective reports by chronic pain patients and objective empirical research have demonstrated that chronic pain is associated with cognitive deficits in various domains of functioning, including attention, working memory, and executive function (Moriarty et al., 2011; Berryman et al., 2013; Moriarty and Finn, 2014). However, the problem of pain-related cognitive impairment remains under-researched due to various methodological barriers (McGuire, 2013).

Research suggests that cognitive deficits occur across a range of pain conditions [e.g., migraine (Meyer et al., 2000; Calandre et al., 2002; Mongini et al., 2005), fibromyalgia (Grace et al., 1999; Park et al., 2001; Luerding et al., 2008), or diabetic neuropathy (Ryan et al., 1992, 1993)], but less emphasis has been placed on examining specific pain types (e.g., neuropathic, inflammatory), irrespective of their etiology. One study comparing different types of pain found that attention was impaired to a similar extent in rheumatoid arthritis, musculoskeletal pain, and fibromyalgia patients compared with healthy controls (Dick et al., 2002). Conversely, there is evidence that emotional decision making was impaired in lumbar spinal or radicular pain of the lower back, but not in Complex Regional Pain Syndrome (CRPS; Apkarian et al., 2004a), and general cognitive functioning was worse in neuropathic pain patients than in patients with a diagnosis of mixed neuropathic and nociceptive pain (Povedano et al., 2007). Although previous investigations of cognition in chronic pain have included neuropathic pain patients as part of a wider sample, few studies have examined performance specifically in neuropathic/radicular pain. In one of these studies, Povedano et al. (2007) reported cognitive impairment in a neuropathic pain cohort compared with the normative sample for the Mini Mental State Exam (MMSE). Two limitations of the study were the absence of a matched comparison group, and the reliance on the MMSE, which may not detect subtle deficits in particular cognitive domains.

Increasing age is consistently associated with cognitive decline (Salthouse, 1996; Salthouse et al., 1998), and there is evidence to suggest that age may moderate the impact of pain on cognitive performance in human and animal models (Leite-Almeida et al., 2009; Oosterman et al., 2013). Based on the hypothesis that pain competes for available attentional resources (Eccleston, 1994; Eccleston and Crombez, 1999), it could be predicted that the negative effect of pain on cognitive function would be exacerbated as cognitive function declines with age. However, a positive relationship between reported pain ratings and executive function has also been demonstrated in elderly populations with Alzheimer's disease or arthritis/arthrosis (Scherder et al., 2008; Oosterman et al., 2009). This may simply suggest that pain report is less reliable in the case of more severe cognitive decline, but demonstrates that age is an important determinant of the relationship between pain and cognitive function, and thus requires further investigation.

The aim of the present study was to investigate the effects of pain on cognitive functioning in a way that addresses, where possible, the limitations associated with previous research. Therefore, the study was designed to investigate cognitive functioning specifically in patients with chronic neuropathic or radicular pain, to probe performance in a variety of cognitive domains by exposing participants to a comprehensive battery of well-validated cognitive tests, and to minimize potential confounds. These include differences in the pain and cognitive scale materials used, and the heterogeneity of patient participants both in terms of their demographics and pathological conditions.

Our specific hypotheses were that: (1) chronic neuropathic/radicular pain patients would demonstrate impairments in cognitive performance compared with healthy participants even after controlling for potentially confounding factors, for example, levels of education, affective state, and nicotine, alcohol, and caffeine consumption; (2) in models predicting the effect of pain on cognitive performance, a pain × age interaction would emerge whereby older individuals with chronic pain would display the greatest levels of impairment; and (3) in pain patients, cognitive performance would be predicted by the severity of their pain.
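For illustration, the second hypothesis can be written as a generic moderated regression. This is a sketch under assumptions rather than the model fitted in this study: $\text{Pain}_i$ is taken as a group indicator (1 = patient, 0 = control), $\mathbf{c}_i$ is a hypothetical vector of covariates (e.g., education, affective state), and higher cognitive scores are assumed to indicate better performance.

$$
\text{Cognition}_i = \beta_0 + \beta_1\,\text{Pain}_i + \beta_2\,\text{Age}_i + \beta_3\,(\text{Pain}_i \times \text{Age}_i) + \boldsymbol{\gamma}^{\top}\mathbf{c}_i + \varepsilon_i
$$

Under these assumptions, the difference between patients and controls at a given age is $\beta_1 + \beta_3\,\text{Age}_i$, so the predicted pattern (greatest impairment in older patients) corresponds to $\beta_3 < 0$.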

# MATERIALS AND METHODS

# Participants

A total of 38 chronic pain patients and 38 control participants took part in the study. Patients with chronic neuropathic pain or radiculopathy (minimum of 3 months, diagnosed by a specialist pain physician) were identified from the database of patients attending a tertiary pain management clinic at Galway University Hospital, Galway, Ireland. Recruitment of control participants was achieved through placement of advertisements in public places and in the local and national print media. Exclusion criteria were: age < 18 years; self-reported diagnosis of pre-existing cognitive impairment or major psychiatric illness (including major depressive disorder or generalized anxiety disorder); self-reported history of substance abuse, diabetes, epilepsy, seizures, or traumatic brain injury; and in the case of control participants, self-reported history of chronic pain. Patient and control groups were matched by gender, age, and education (**Table 1**). All participants gave informed written consent, in person, prior to the test session, and all testing procedures were carried out at University Hospital Galway or National University of Ireland, Galway. The study received full institutional approval from the Research Ethics Committee of the National University of Ireland, Galway and from the Galway University Hospitals Research Ethics Committee.

# Procedure

Patient participants were sent (by post) an information sheet and an invitation to participate in the research study, at least 1 week in advance of the assessment. To reduce patient burden, chronic pain patients were invited to complete the assessment on the day of a routinely scheduled appointment at the pain clinic. The testing was completed in advance of the clinic appointment to avoid the potentially confounding effects of interventional analgesic treatments. Where patients could not attend on the day of their appointment, but consented to participate, the assessment was scheduled for an alternative date. Medication status at the time of the assessment was not evaluated. For control participants, an outline of the study was provided in the public advertisements, and each participant was provided with a more detailed information sheet by the examiner prior to consenting to take part in the study.

#### TABLE 1 | Demographic information.

\*\**p* < *0.01.*

# Demographics

Demographic information, including age, gender, and number of years of education, was collected for all participants using a standard form. In addition, participants were asked to estimate, if applicable, the length of time since they had last consumed nicotine and/or caffeine.

# Pain Assessment

Participants in the patient group were asked to complete the Chronic Pain Grade (CPG) Questionnaire (Von Korff et al., 1992). This 7-item scale provides measures of current pain and pain over the previous 3 months, as well as a measure of pain-related disability and an overall chronic pain grade classification, with responses recorded on a 10-point Likert scale. The CPG has been validated as a self-completion measure and is widely used in chronic pain research (Elliott et al., 1999; Dunn et al., 2008; Raftery et al., 2011). In addition, pain patient participants were asked to indicate painful areas on a manikin diagram and to estimate the number of months since the diagnosis of their pain. The total number and drug classification of patients' analgesic medications were also recorded.

# Perceived Impact of Pain

Patients' subjective ratings of the perceived impact of pain on cognitive function (concentration, memory, problem solving, and decision making) were recorded on a 10-point Likert scale where 0 was "no interference" and 10 was "extreme interference." This single-item scale was developed by the research team and was phrased in a manner similar to other CPG "interference" items ("In the past 3 months, how much has pain interfered with your concentration, memory, problem solving or decision making?").

# Depressive Symptoms and State Anxiety

Depressive symptoms were assessed using the 9-item Patient Health Questionnaire (PHQ-9) depression scale (Kroenke et al., 2001). This questionnaire is based on the Diagnostic and Statistical Manual of Mental Disorders-IV criteria, and has been widely used in research (Kroenke et al., 2001; Kroenke and Spitzer, 2002; Lowe et al., 2004). Scores were computed to give a tentative diagnosis of depression and a measure of symptom severity. The questionnaire responses also gave an indication of symptom-related functional impairment (ability to "work, take care of things at home or get along with other people"). The level of anxiety at the time of cognitive testing was measured using the "state" portion of the State-Trait Anxiety Inventory (STAI), Form Y (Spielberger et al., 1983). The STAI-S (20 items) provides a measure of "state" or current anxiety at the time of completing the questionnaire.
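As an illustration of how a PHQ-9 symptom-severity score is typically derived, the sketch below sums the nine items (each rated 0 to 3) and maps the total onto the severity bands published with the instrument (Kroenke et al., 2001). The cut-points come from that publication rather than from the present article, the function name is hypothetical, and the exact scoring procedure used in this study (including the tentative diagnosis) is not reproduced here.

```python
def phq9_severity(item_scores):
    """Sum nine PHQ-9 items (each 0-3) and map the total to a severity band."""
    if len(item_scores) != 9 or any(s not in (0, 1, 2, 3) for s in item_scores):
        raise ValueError("expected nine item scores, each 0-3")
    total = sum(item_scores)
    # Severity bands as published for the PHQ-9 (assumed here, not stated in this article).
    bands = [
        (20, "severe"),
        (15, "moderately severe"),
        (10, "moderate"),
        (5, "mild"),
        (0, "minimal"),
    ]
    label = next(name for cutoff, name in bands if total >= cutoff)
    return total, label


if __name__ == "__main__":
    print(phq9_severity([1, 1, 2, 0, 1, 0, 2, 1, 0]))
```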

# Neuropsychological Tests

# General Intellect

An estimate of participants' intelligence quotient (IQ) was obtained using a two-test short form of the Wechsler Adult Intelligence Scale-III (WAIS-III, Wechsler, 1997a), comprising the Digit-Symbol Coding and Information subtests. Estimated full-scale IQ obtained using this dyadic short form of the WAIS-III has been shown previously to correlate (*r*<sup>2</sup> = 0.82) with IQ values obtained using the full 12-subtest scale (Sattler and Ryan, 2001), and this combination of subtests was chosen based on its relatively short administration time. In the Digit-Symbol Coding subtest, participants were presented with nine digit-symbol pairs followed by a list of digits only. Participants were required to fill in the symbol corresponding to each digit as quickly as possible on a standard form. For the Information subtest, the participant was required to answer a series of factual general knowledge questions about common events, objects, places, and people. Raw scores for both subtests were converted to age-adjusted scaled score equivalents using the WAIS-III Administration and Scoring Manual (Wechsler, 1997a). Full-scale IQ was then estimated from the sum of the scaled scores using the extrapolation tables taken from Sattler and Ryan (2001).

# Verbal Memory

The Logical Memory subtests I and II of the Wechsler Memory Scale-III (WMS-III, Wechsler, 1997b) were used to assess short- and long-term verbal memory and recognition memory. Two different stories (A and B) were read to the subject, and immediately afterwards the subject was asked to retell the story from memory.

After an interval of ∼25 min, the participant was again asked to recall as many details as possible from both stories A and B. For recognition memory, the participant was required to give "Yes" or "No" answers to a set of 30 questions relating to stories A and B, which includes a mixture of correct and incorrect statements about the story content. Participants were scored on the accuracy of the story recall ("story" and "theme" units) and number of correct responses to recognition questions. Raw scores were converted to age-adjusted scaled score equivalents using the WMS-III Administration and Scoring Manual (Wechsler, 1997b).

# Spatial Memory

The spatial span subtest of the WMS-III was used to measure short-term spatial memory capacity. The test was administered using the spatial span board, which consists of 10 cubes with the numbers 1–10 printed on the sides of the cubes facing the examiner. For spatial span forward, the examiner tapped the cubes in a specified sequence pattern and then asked the participant to tap the same sequence. For spatial span backward (reverse), the participant was asked to tap the sequence in reverse order. The sequence length increased until the participant could no longer replicate the sequence correctly. Raw scores were converted to age-adjusted scaled score equivalents using the WMS-III Administration and Scoring Manual (Wechsler, 1997b).

# Attention/Vigilance and Psychomotor Speed

The Continuous Performance Test-Identical Pairs (CPT-IP), adapted from the MATRICS (Measurement and Treatment Research to Improve Cognition in Schizophrenia) Consensus Cognitive Battery (Nuechterlein and Green, 2004), was used to measure attention and vigilance. This computerized test required the participant to monitor a series of 4-digit numbers as they appeared briefly on the computer screen. The participant responded to the sequential appearance of identical pairs of numbers by clicking a mouse as quickly as possible. The number of correct responses, the number of incorrect responses and the reaction time were recorded automatically. Incorrect responses were categorized as "false alarms" (i.e., responses to similar, but not identical, numbers presented in sequence) or random responses. D-prime (range 0–4.2), a sensitivity index reflecting the rate of hits relative to false alarms, was also calculated as an index of attention. The average reaction times to both hits and false alarms during the CPT-IP trial were used as tentative measures of psychomotor speed.
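The text does not state how the CPT-IP software computes D-prime. As an illustration only, the sketch below uses the standard signal-detection definition, d' = z(hit rate) − z(false-alarm rate). The function name, the argument names, and the log-linear correction for extreme rates are assumptions made for this example and are not taken from the study.

```python
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate).

    Hypothetical helper: the counts are the four cells of a
    signal-detection table for one participant.
    """
    # Log-linear correction (assumption): add 0.5 to each count so that
    # rates of exactly 0 or 1 never reach the inverse normal CDF,
    # which is undefined at those values.
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)


# A responder who separates targets from foils scores above zero;
# one who responds at the same rate to both scores zero.
good = d_prime(hits=27, misses=3, false_alarms=2, correct_rejections=28)
chance = d_prime(hits=15, misses=15, false_alarms=10, correct_rejections=10)
print(good, chance)
```

Higher values indicate better discrimination of identical pairs from similar foils. A value near zero indicates responding that does not distinguish the two.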

# Executive Function

Executive function was assessed using the Wisconsin Card Sorting Test-Computerized Version 4 (WCST-CV4). This test measured the respondent's ability to adapt to changing schedules of reinforcement (i.e., the "rules" about the task) and thus assessed cognitive flexibility, a key component of executive functioning (Berg, 1948). The computerized version used in this study presented the participant with four key cards and a stimulus card. The cards were matched according to one of three categories: color, shape or quantity of items on the card. Matching was achieved by placing the cursor on the selected key card and left-clicking the mouse. The participants were not told how to match the cards, but they were given verbal and visual feedback on whether each match was "right" or "wrong." The category by which the cards were to be matched changed without warning after presentation of every ten stimulus cards; the subject was required to recognize the changed "rules" and identify the new presentation pattern. Raw scores and demographically (age and education) corrected standard scores were computed for five outcome measures: percentage errors, percentage perseverative (repetitive) responses, percentage perseverative errors, percentage non-perseverative errors, and percentage conceptual level responses. Raw scores were also computed for the number of categories completed, the number of trials to complete the first category, the number of failures to maintain set and a learning-to-learn score (indicative of conceptual efficiency across consecutive categories).

# Statistical Design and Analysis

Data were analyzed using the Statistical Package for the Social Sciences (SPSS, version 18, IBM Corp., USA) for Windows. The statistical analyses were performed in two distinct phases; the first included both patient and control participant data, while the second included patient data only.

# Analysis by Pain Condition (Phase I)

In this phase of analysis, hypotheses (1) and (2), as stated in the Introduction, were tested. The first level of analysis was the computation of descriptive and inferential statistics. This was followed by correlation analyses of cognitive outcome measures and participant characteristics aimed at identifying potential covariates. The variables included in the correlation analyses were: age, gender, number of years of education, smoking status and time since last nicotine consumption, time since last caffeine consumption, and depression and anxiety scores. These variables can predict cognitive performance (see, for example, the review by Moore et al., 2009, and reports on the effects of alcohol [Rohrbaugh et al., 1988], nicotine [Foulds et al., 1996; Mancuso et al., 1999], and caffeine [Brice and Smith, 2001; Kelemen and Creeley, 2001]). We intended to control for correlates of the outcomes when testing our hypotheses. This would allow us to examine the extent to which the hypothesized effects emerged with important correlates of each outcome held constant.

Hierarchical multiple regression analyses were used to test the hypotheses; groups were coded as −1 (patient group) and +1 (control group), age was mean-centered, and an interaction term [group (−1/+1) × age (mean-centered)] was calculated. The purpose of centering was to minimize multicollinearity (arising, in this case, from entering both the variables and their interaction into the regression equation). Participant-characteristic variables that were correlated with the cognitive outcomes were entered into the first block of the regression model. Values for significance levels and β coefficients quoted in the text are for the overall model. Interaction plots for estimated marginal means (2 × 2 ANCOVA with patient vs. control and age median-split at 45 years) were used for visualization of the directionality of effects and for determining the nature of the interaction between variables (data not shown).
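The block structure described above can be sketched in a few lines. This is a minimal illustration on synthetic data, not the authors' SPSS analysis. The helper `ols_r2`, the single covariate (years of education), the sample size, and all coefficients used to generate the outcome are assumptions chosen for the example.

```python
import random


def ols_r2(columns, y):
    """R^2 of an ordinary least-squares fit of y on the columns plus an intercept."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    # Normal equations (X'X) b = X'y, stored as an augmented matrix.
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(k)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for p in range(k):
        pivot = max(range(p, k), key=lambda r: abs(A[r][p]))
        A[p], A[pivot] = A[pivot], A[p]
        for r in range(k):
            if r != p:
                f = A[r][p] / A[p][p]
                A[r] = [a - f * b for a, b in zip(A[r], A[p])]
    beta = [A[r][k] / A[r][r] for r in range(k)]
    fitted = [sum(b * x for b, x in zip(beta, row)) for row in X]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


random.seed(0)
n = 80  # synthetic sample size (assumption)
group = [-1.0 if i < n // 2 else 1.0 for i in range(n)]  # -1 patient, +1 control
age = [random.uniform(20, 70) for _ in range(n)]
education = [random.uniform(8, 20) for _ in range(n)]    # example block-1 covariate
mean_age = sum(age) / n
age_c = [a - mean_age for a in age]                      # mean-centered age
interaction = [g * a for g, a in zip(group, age_c)]      # group x centered age
# Synthetic outcome with arbitrary coefficients, for illustration only.
score = [50 + 1.5 * e + 3 * g - 0.2 * a + 0.15 * ga + random.gauss(0, 5)
         for e, g, a, ga in zip(education, group, age_c, interaction)]

r2_block1 = ols_r2([education], score)                             # covariates
r2_block2 = ols_r2([education, group, age_c], score)               # + main effects
r2_block3 = ols_r2([education, group, age_c, interaction], score)  # + interaction
print(r2_block1, r2_block2 - r2_block1, r2_block3 - r2_block2)
```

The increase in R² from one block to the next shows how much additional variance the newly entered predictors explain once the earlier blocks are held constant. This is the quantity a hierarchical regression evaluates at each step.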

# Analysis by Pain Variable (Phase II)

The second phase of the study tested hypothesis (3), as stated in the Introduction. As in Phase I, an interaction effect between pain variables and age was also hypothesized, and this was tested in the analyses. The pain variables investigated were present pain intensity, pain-related disability score, number of painful areas, and pain chronicity (number of months since diagnosis of pain). Additional patient characteristics investigated as potential covariates were the presence or absence of medication (and different medication sub-groups), total number of medications, percentage pain relief from medications, and self-assessed impairment of cognitive function. Descriptive and inferential statistics relating to pain information (for patient participants only, n = 38) were calculated. Correlation matrices, using the patient group data only, were constructed to identify potential covariates. Each of the pain variables was centered, and interaction terms with mean-centered age were calculated. Hierarchical regression analyses were again used to test the hypothesis.

# RESULTS

# Participant Characteristics

Demographic and psychological descriptors of pain patient and control groups are presented in **Tables 1, 2**. Assumptions of the relevant tests were checked using a range of graphical (e.g., box plots, normality plots) and inferential tests of normality. Basic between-group comparisons were used to test the extent to which the matching procedure was successful. There were no significant differences between patient and control groups in age, gender, years of education, or duration since last consumption of caffeine or nicotine. There were, however, significantly more smokers in the patient group than in the control group, and patients exhibited higher depressive-symptom scores, increased depression-related functional impairment and greater levels of state anxiety than controls (see **Tables 1**, **2**).

# Cognitive Variables

# Analysis by Pain Condition (Phase I)

Prior to testing the hypothesized effects, group differences (patient vs. control) were explored. Results indicated consistently lower outcome scores in the patient group, or higher scores on reverse scales (number of false alarms and number of random responses in the CPT-IP, and raw scores for % errors, % perseverative responses, % perseverative errors, % non-perseverative errors, number of trials to complete the first category and failure to maintain set in the WCST, see **Table 3**). Significant differences between groups, where observed, are indicated in **Table 3**.

A correlation matrix of participant characteristic and cognitive outcome variables was used to identify potential covariates (see Table S1). Gender and smoker/non-smoker classifications were correlated with a number of cognitive measures (gender correlated with immediate verbal memory and reaction time; cigarette smoking correlated with immediate verbal memory), while, as expected, a large number of the outcomes were positively correlated with years of education and negatively correlated with measures of depression and state anxiety. Interestingly, the duration since last consumption of caffeine was positively correlated with scores measuring attentional performance. Thus, an increase in the time since participants had last consumed caffeine was associated with higher scores, an effect inconsistent with the recognized acute effect of caffeine as a CNS stimulant (Brunye et al., 2010).

The hypotheses predicted an effect of participant group (patient or control) and an interaction effect of group and age, and were tested through a series of individual hierarchical regressions for each of the cognitive outcomes. Significant covariates of the dependent variable (identified by correlation analysis) were entered in the first block, group (patient vs. control, coded −1/+1) and age (mean-centered) were entered in block 2, and the interaction term (pain × age) was entered in block 3. Statistics for cognitive variables predicted by group or by the interaction term (at levels close to or below the p = 0.05 level of statistical significance) are presented in **Table 4**.

In the regression model predicting estimated full scale IQ (FSIQ), years of education, PHQ-9 severity score and STAI-S total score were entered as controls in block 1. PHQ severity score and STAI-S total score were found to be correlated; however, the observed variance inflation factors (VIFs) were <10, suggesting that the assumption of minimal multicollinearity was not violated and that scores can be assumed to reflect independent constructs. Years of education significantly contributed to the FSIQ model (β = 0.37, p < 0.001), as did the presence of pain (β = 0.33, p = 0.04). Interaction plots showed that FSIQ scores were decreased in the presence of pain. Neither age nor the interaction term made any significant contribution to the model (see **Table 4**).
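For reference, the VIF criterion mentioned above can be related to the variance shared among predictors. The expression below is the standard textbook definition, not a formula given by the authors; $R_j^2$ denotes the proportion of variance in predictor $j$ explained by the remaining predictors in the model.

$$
\mathrm{VIF}_j = \frac{1}{1 - R_j^2}, \qquad \mathrm{VIF}_j < 10 \iff R_j^2 < 0.9
$$

Under this definition, a VIF below 10 only rules out cases in which 90% or more of a predictor's variance is shared with the other predictors, so moderately correlated predictors such as depression and anxiety scores can still pass the check.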

Immediate verbal memory was measured as story-unit and theme-unit recall scores and a learning-slope score. Presence of a chronic pain diagnosis, age, and their interaction term made no contribution to the story-unit or theme-unit regression models. There were no significant predictor variables of the theme-unit model, and only gender significantly contributed to the story-unit recall model (β = 0.24, p = 0.04). In the case of learning slope, the contribution of STAI-S total score (entered in the first block) approached significance (β = −0.27, p = 0.06). There was a significant contribution of age (β = −0.29, p = 0.02); the interaction term of pain and age was also associated with an effect close to statistical significance (β = 0.20, p = 0.08, **Table 4**), with interaction plots suggesting that older pain patients performed worse than controls of a similar age and worse than younger pain patients.

#### TABLE 2 | Psychological variables.

*†Clinically relevant depressive symptoms were considered if the participant's answers fell within the highlighted section of the 9-item Patient Health Questionnaire (PHQ-9) on four or more items, one of which corresponded to items 1 or 2.*

*††The putative type of depressive disorder was classified as "major depressive disorder" depending on the number of items answered in the highlighted section of the PHQ-9.*

*†††Depression-related functional impairment was assessed by item 10 of the PHQ-9, and deemed present if the item was endorsed as "somewhat difficult" or greater.*

\*\*\**p* < *0.001.*

Delayed verbal memory was also measured with story-unit and theme-unit recall scores, as well as with a recognition-memory index and with percentage retention. As with immediate verbal memory, neither pain condition, age, nor their interaction significantly contributed to either story-unit or theme-unit models, and none of the additional predictor variables significantly affected these models. Furthermore, no significant predictors of percentage retention were identified (though the effect of age was close to significance: β = −0.21, p = 0.08). Recognition memory, however, was significantly predicted by both age (β = −0.27, p = 0.01) and the pain-age interaction term (β = 0.30, p = 0.006, **Table 4**), with poor performance particularly evident in older pain patients.

Spatial memory was assessed in forward and reverse trials of the spatial span task. Total spatial span and forward trial scores were not predicted by any of the three main independent variables of interest (pain, age, pain × age interaction), or by any of the control measures entered into the model. Effects close to the level of statistical significance were associated with the presence of pain (β = 0.21, p = 0.07) and years of education (β = 0.23, p = 0.06) in the reverse scores model (**Table 4**).

Attention was measured using the CPT, the outcome variables of which were the numbers of hits, false alarms and random responses. The number of hits in the CPT was significantly predicted by age (β = −0.36, p = 0.002) and there was also a trend for a contribution of chronic pain (β = −0.36, p = 0.06, **Table 4**). The control variable of depressive-symptom score (PHQ severity) was also a significant contributor to the model (β = −0.59, p = 0.002, **Table 4**). The slopes of the interaction plot indicated a reduced number of hits in the control group compared with the patient group; however, the group means in fact indicate an increased number of hits in the control participants compared with the patient group. Further analysis revealed a strong negative association between PHQ severity score and pain condition (r = −0.83, p < 0.001). Therefore, despite tolerance and VIF values being within accepted limits, the relationship between PHQ severity score and pain condition may have affected the overall predictive capability of the model. Age (β = 0.28, p = 0.02) and the age × pain interaction term (β = −0.25, p = 0.02) made significant contributions to the regression model for number of false alarms (**Table 4**). The number of random responses was influenced by age (β = 0.36, p = 0.001) and there was a near-significant effect for the age × pain interaction term (β = 0.20, p = 0.06, **Table 4**). In the case of the model fitted for D-prime scores, age (β = −0.34, p = 0.003) and PHQ severity scores (β = −0.33, p = 0.07) emerged as predictors. There were no significant effects of pain or pain-age interaction.

The reaction times to both hits and false alarms in the CPT task were considered measures of psychomotor speed. There were no main effects of age or pain. Though the individual contributions of age and presence of pain were non-significant, the pain-age interaction term significantly contributed to the model for "hit" reaction time (β = 0.24, p = 0.03, **Table 4**). This interaction effect was approaching significance in the case of "false alarm" reaction time (β = 0.21, p = 0.08), though the overall model was not significant. The interaction plots suggest that reaction time increased with the presence of pain in the younger participants but was reduced in the presence of pain in the older group.

The independent variables of pain, age, and age × pain interaction did not predict any of the WCST standard scores or the secondary outcomes (number of categories completed, number of trials to first category, failure to maintain set and learning-to-learn score). The only significant predictor was age (number of categories completed: β = −0.28, p = 0.02), whereby older people achieved lower scores than younger participants.

# Analysis by Pain Variable (Phase II)

**Table 5** shows descriptive statistics for variables unique to the pain patient group and **Table 6** provides an overview of patients' medication status. Seventy-nine percent of patients were receiving long-term medication for the treatment of pain. Opioids were the most commonly prescribed class of medication, followed by anticonvulsants. The "other" category included classes of analgesics taken by a small number of participants within the sample. These treatments included: antispasmodics, benzodiazepines and other sedative hypnotics, local anesthetics, and transient receptor potential vanilloid 1 (TRPV1) ligands (capsaicin), as well as adjunctive therapies such as acupuncture and spinal-cord stimulation. In cases where a drug had more than one indication (for example, benzodiazepines used as a muscle relaxant, as an anxiolytic, or as treatment for insomnia), the precise reason for which it was prescribed was not queried. More than half (55%) of the patient participants were taking three or more different drugs to manage their pain. The average subjectively reported relief provided by pharmacological treatments was 37%.

#### TABLE 3 | Group comparisons of cognitive outcomes.

\**p* < *0.05,* \*\**p* < *0.01,* \*\*\**p* < *0.001.*

A correlation matrix of cognitive outcome variables and general participant characteristics was again constructed to identify significant correlations to be entered as control variables in the regression models (patient data only). As in Phase I, years of education correlated with a large proportion of the outcome measures (see Table S2). The only other significant correlations were between: duration since last consumption of nicotine and the spatial span forward trial; depression score and number of hits, false-alarm reaction time and WCST failures to maintain set; and STAI-S total score and false-alarm reaction time. Correlations between cognitive outcomes and patient-specific variables, including medications and patients' subjective assessment of the effect of their pain on cognitive function, were also investigated (see Table S3). Anticonvulsant treatment was found to correlate significantly with learning-slope performance (rpb = 0.37, p < 0.05), suggesting lower scores in those receiving anticonvulsant medications. On the other hand, antidepressant treatment was associated with higher scores on the attention task (D-prime: rpb = −0.38, p < 0.05) and improved WCST performance as indicated by a decrease in the number of trials required to complete the first category (rpb = 0.34, p < 0.05). Self-assessment of the effect of pain on cognitive function was positively correlated with the failure to maintain set in the WCST (rpb = 0.45, p < 0.01). All significant correlations were entered as control variables in the hierarchical regression analyses.

The hypothesis stated that pain variables, and the interaction of these variables with age, would predict differences in cognitive outcomes, and this was again tested through a series of hierarchical regressions. Significant covariates of the dependent variable were entered in the first block. Each pain variable (present pain intensity, average pain intensity, pain-related disability score, pain chronicity and the number of painful areas) was tested individually. Both the pain variable and age were centered and entered in block 2, and the interaction term was entered in block 3.
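The medication correlations above are reported as point-biserial coefficients (rpb), which relate a two-level variable such as treatment status to a continuous score. The sketch below shows the usual formula. The function name and the 0/1 coding are assumptions for this example; the text does not state which level was coded 1, and the sign of rpb depends on that choice.

```python
from math import sqrt


def point_biserial(binary, values):
    """Point-biserial correlation between a 0/1 variable and a continuous one.

    Equivalent to Pearson's r computed with the 0/1 codes as one variable.
    """
    ones = [v for b, v in zip(binary, values) if b == 1]
    zeros = [v for b, v in zip(binary, values) if b == 0]
    n1, n0, n = len(ones), len(zeros), len(values)
    mean1, mean0 = sum(ones) / n1, sum(zeros) / n0
    mean_all = sum(values) / n
    # Population standard deviation of all scores.
    sd = sqrt(sum((v - mean_all) ** 2 for v in values) / n)
    return (mean1 - mean0) / sd * sqrt(n1 * n0 / n ** 2)


# Made-up scores: 1 = receiving the treatment, 0 = not (hypothetical coding).
treated = [0, 0, 1, 1]
scores = [1.0, 2.0, 3.0, 4.0]
print(point_biserial(treated, scores))
# Reversing the coding flips the sign but not the magnitude.
print(point_biserial([1 - t for t in treated], scores))
```

Because the sign reverses with the coding, the direction of each reported association has to be read from the accompanying description (higher or lower scores in the treated group) rather than from the sign of rpb alone.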

Pain chronicity was found to contribute significantly to the models of FSIQ (β = 0.31, p = 0.05), immediate verbal memory (β = 0.47, p = 0.003) and delayed verbal memory (β = 0.40, p = 0.02). The association was positive in each of these cases, suggesting that performance improved with an increasing duration of pain. The contribution of chronicity was also close to significance in the hit response reaction time model (β = 0.36, p = 0.07), which would suggest an increase in reaction time with increased pain chronicity.

The number of painful areas was found to be an effective predictor of immediate verbal memory (story-unit recall: β = −0.34, p = 0.05; theme-unit recall: β = −0.37, p = 0.04) and delayed verbal memory (β = −0.35, p = 0.04), whereby "more painful sites" was associated with poorer performance. The interaction of the number of painful areas and age was a significant contributor to the model of the spatial span reverse trial (β = −0.36, p = 0.03, ANOVA p = 0.06) and the number of random responses on the CPT (β = 0.30, p = 0.03).


*FSIQ, Full Scale Intelligence Quotient; CPT, Continuous Performance Task; PHQ, Patient Health Questionnaire; SAI, State Anxiety Inventory.* \*\*\**p* ≤ *0.001;* \*\**p* ≤ *0.01;* \**p* ≤ *0.05; † p* = *0.06;* #*p* = *0.07;* <sup>+</sup>*p* = *0.08.*



#### TABLE 6 | Breakdown of patient medications.


Chronic pain grade significantly predicted D-prime scores of attention (β = −0.43, p = 0.002), as did disability score (β = −0.28, p = 0.03). The interaction term of disability score by age also made a significant contribution to the memory retention model (β = −0.47, p = 0.006), while the interaction of present pain intensity and age had an effect close to significant on the number of hits in the CPT (β = −0.31, p = 0.07). The negative association indicates that worse pain was associated with a lower performance level on this task. In general, therefore, aspects of verbal memory and attention appear to be related to a variety of pain symptom and disability variables. When the total number of medications, and individual medication classifications, were entered as variables of interest, no main effects were observed on any of the cognitive outcome measures.

# DISCUSSION

This study adds to the evidence base that chronic pain patients are impaired on memory and attention tasks; in this case, in patients with neuropathic or radicular pain. Although matched for age and education, IQ was significantly lower in neuropathic/radicular pain patients than in controls and IQ score was significantly predicted by pain in the regression model. There was a trend for pain-related deficits in reversal of spatial memory in the spatial-span task, and the pain-age interaction negatively predicted verbal memory performance. A pattern of abnormal responding on the attention task was observed in pain patients, particularly in the older group. Performance on the Wisconsin Card Sorting Task was not affected by the presence of pain or by pain-age interaction. Individual pain variables (number of painful areas, pain intensity, pain-related disability) were inversely correlated with cognitive performance, in particular those related to memory. Conversely, there was a positive relationship between pain duration and measures of IQ and verbal memory. The findings of the study are considered below.

# Participant Characteristics

Comparison of patient and control groups revealed that the proportion of smokers was greater in pain patients than in matched healthy controls, consistent with previous findings (Ekholm et al., 2009; Zvolensky et al., 2009). However, smoking status and the time since last nicotine consumption were correlated with very few cognitive outcomes.

There were no between-group differences in the time since last caffeine consumption; however, the duration since last consumption was positively correlated with measures of attention, suggesting improved performance with increased duration since last caffeine intake. This appears to contradict the established acute effect of caffeine as a CNS stimulant. It is possible that negative effects of caffeine withdrawal on attention diminished with time since last consumption, or that tolerance to the effects of caffeine develops in heavy users (Fredholm et al., 1999). Regularity and average quantity of caffeine consumption were not measured in this study but should be considered in future studies.

Depressive-symptom and anxiety scores were significantly elevated in pain patients, consistent with the well-documented relationship between chronic pain and affective disorders (McCracken et al., 1996; Von Korff and Simon, 1996; Fishbain et al., 1997; Gureje et al., 1998; Raftery et al., 2011). Notably, 50% of patient participants met the PHQ diagnostic criteria for depression (compared with 0% of control participants) and over 70% of patients scored higher than the standard normative scores, compared with 18% of the controls. While state anxiety score is not necessarily indicative of an anxiety disorder, these results suggest that pain patients were more susceptible than controls to situational anxiety associated with the assessment or with their clinic visit.

Recording medication use is important in studies examining the influence of pain on cognitive performance (McGuire, 2013). Within our patient group, five medication classifications were identified. Although anticonvulsant and antidepressant treatments were found to correlate with several cognitive measures, neither treatment significantly contributed to the overall regression models for cognitive variables. This is perhaps surprising given that analgesic agents, in particular opioids, tricyclic antidepressants, and anticonvulsants, have been linked with cognitive dysfunction and somnolence (Kerr et al., 1991; Spring et al., 1992; Raja et al., 2002; Salinsky et al., 2010). However, in the presence of chronic pain, negative effects of these treatments may be diminished, or cognitive performance may improve (Jamison et al., 2003; Kendall et al., 2010) as pain and its associated cognitive deficits are alleviated.

# Cognitive Measures

In regression analysis, a significant effect of pain condition was found on general intelligence as measured by the WAIS dyad, with patients having lower IQs than controls, despite matching, and statistically accounting, for years of education. This may be further evidence of cognitive impairment in pain patients, as the WAIS subtests draw on cognitive resources such as attention and memory (Groth-Marnat, 2009).

The fact that no association was found between pain and performance on the spatial span forward task is surprising, given evidence from previous clinical (Dick and Rashiq, 2007; Luerding et al., 2008) and preclinical (Leite-Almeida et al., 2009; Hu et al., 2010; Ren et al., 2011) studies showing a negative association between neuropathic pain and spatial memory. A trend for an effect of group was observed in the reversal of spatial memory assessed in the spatial span task, with patients scoring lower than controls. The reverse spatial span task shows some conceptual similarities to the rodent Morris water maze reversal task, and deficits in this task have previously been demonstrated in a rat model of neuropathic pain (Leite-Almeida et al., 2009; Moriarty et al., 2016). A significant group-age interaction effect was observed in the assessment of verbal recognition memory, with the data indicating a pain-related effect specifically in older participants. Deficits in recognition memory have been observed previously in chronic pain (fibromyalgia) patients, independent of an interaction with age (Park et al., 2001). Recognition memory deficits have also been shown in rodent models of chronic pain (Cain et al., 1997; Lindner et al., 1999; Millecamps et al., 2004; Kodama et al., 2011). Thus, pain, or the interaction of pain with age, contributed to decreased IQ and deficits on specific subtests of memory.

There was no effect of pain condition or pain-age interaction on the D-prime measure of attention in the CPT, contrary to previous findings of impaired attention in chronic pain (Eccleston, 1994; Grisart and Plaghki, 1999; Dick et al., 2002; Veldhuijzen et al., 2006; Dick and Rashiq, 2007; Oosterman et al., 2011) and pain-related attentional deficits in rodent models of chronic pain (Cain et al., 1997; Lindner et al., 1999; Millecamps et al., 2004; Boyette-Davis et al., 2008; Pais-Vieira et al., 2009; Kodama et al., 2011). A pain-age interaction effect was observed in the model for the number of false alarms (and for number of random responses, though this was just below the level of statistical significance). There was also a significant effect of the interaction term in the model for reaction time. Younger controls had shorter reaction times than younger pain patients, but older patients were quicker to respond than older controls. Thus, older patients had increased incorrect responses and also tended to respond faster, possibly indicative of impaired inhibitory control. Impaired inhibitory control could be associated with underlying dysfunction of the prefrontal cortex (PFC), and it is known that there are morphological, neuroplastic and neurochemical changes in the PFC related to chronic pain (Moriarty et al., 2011).

The absence of any effects of pain or pain-age interaction on the WCST was unexpected given the strong body of literature to suggest that executive functions (and analogous cognitive flexibility in rodents) are impaired in chronic pain (Grisart and Van der Linden, 2001; Apkarian et al., 2004a; Mongini et al., 2005; Karp et al., 2006; Leite-Almeida et al., 2009; Verdejo-Garcia et al., 2009; Walteros et al., 2011). However, a number of previous studies failed to show an effect of pain on executive function (Suhr, 2003; Scherder et al., 2008). There is therefore a need for further research on the effect of pain and the pain-age interaction on executive functioning, preferably utilizing a wide range of executive-function measures. We hypothesize that pain-related cognitive impairments may be due to structural or molecular changes that occur in the brains of chronic pain patients and, given the overlap of pain- and cognition-associated processing regions, that these changes may have functional consequences that impact on cognitive performance. Specifically related to executive functioning, structural and functional changes have been observed in both chronic pain patients and in animal models of chronic pain (Apkarian et al., 2004b; Luerding et al., 2008; Metz et al., 2009; Seminowicz et al., 2009). It is possible that these changes are insufficient to produce impairments on a task such as the WCST in a robust and reliable manner. Chronic pain patients also show deficits on tasks that require rapid attentional switching (Ryan et al., 1993; Eccleston, 1994; Bosma and Kessels, 2002; Karp et al., 2006; Jongsma et al., 2011; Miro et al., 2011) and tasks involving emotional decision-making and emotional self-regulation (Apkarian et al., 2004a; Solberg Nes et al., 2009; Verdejo-Garcia et al., 2009). Performance on different tasks measuring executive-type functions may be mediated by different neuroanatomical subregions, and using a single psychometric measure of performance may not be appropriate. We also noted that the overall performance of our sample (both patients and controls) on the WCST was relatively poor (scores of 89–93 compared with standard score average of 100). Thus, the present sample may have some characteristics that limit the extent to which we can generalize from these findings.

The number of painful areas reported was a predictor of verbal memory, with scores decreasing as pain diffuseness increased. Pain chronicity was found to be a positive predictor of IQ and verbal memory. Some studies have shown that pain chronicity was not associated with differences in cognitive function (Eccleston, 1994; Alanoglu et al., 2005), whereas others have shown that duration of a painful illness was inversely related to cognitive performance (Calandre et al., 2002; Apkarian et al., 2004a; Ryan, 2005; Verdejo-Garcia et al., 2009). A positive correlation, as observed herein, has not been demonstrated previously, but may represent a habituation or compensation mechanism induced in prolonged pain states. Further investigation of this theory is warranted. Previous reports of an inverse correlation included chronic migraine and diabetic patients (Calandre et al., 2002; Ryan, 2005). In the context of such illnesses, it is important to note that there may be a cumulative effect of disease duration and that the relationship between pain chronicity and cognitive function may be confounded by additional factors. A study by Seminowicz et al. (2011) showed, using functional imaging, that the dorsolateral PFC in chronic pain patients was thinner and abnormally activated during a cognitive task. However, following effective analgesic treatment, this region showed increased thickness and task-related activity patterns were normalized. These results suggest that pain-related cognitive impairment may be reversible with treatment over time. Thus, there may be a highly complex relationship between duration of pain, analgesic treatment and cognitive-task performance, an investigation of which was beyond the scope of the present study.
There is also emerging evidence of positive adaptive plasticity that may derive from psychological interventions for chronic pain or use of specific coping strategies (see for example, Braden et al., 2016; Lazaridou et al., 2017; Malfliet et al., 2017) so that improvements in these psychological processes over time may explain a decreasing effect of pain despite increasing chronicity.

Pain intensity was not generally associated with alterations in cognitive function, which suggests that other aspects of the pain experience may contribute to changes in cognition. Chronic pain grade and pain-related disability negatively affected aspects of attention, suggesting that this cognitive domain may be more susceptible to the affective or disabling dimensions of pain. Interactions between pain variables and age also made contributions to memory and attention outcomes, again indicating an important interplay between these variables.

# Limitations

A sample size of 76 could be considered relatively small; however, sample sizes of this order have been used previously in pain-cognition studies (Eccleston, 1994; Grace et al., 1999; Grisart and Plaghki, 1999; Park et al., 2001; Dick et al., 2002; Apkarian et al., 2004a; Harman and Ruyak, 2005; Mongini et al., 2005; Karp et al., 2006; Dick and Rashiq, 2007; Rodriguez-Andreu et al., 2009; Verdejo-Garcia et al., 2009; Walteros et al., 2011). The pain variable measures and the records of current medication use relate only to the patient subset of participants, so sample size is further reduced for these measures (n = 38). Therefore, effects of medication on cognitive outcomes cannot be ruled out completely, notwithstanding the application of appropriate statistical controls. The apparent clinical under-recognition of depression within the present sample of pain patients means that the effects of depression could not be controlled for experimentally. Depressive symptoms were strongly correlated with presence of pain and were also significantly correlated with a number of cognitive outcomes. In spite of this, significant contributions of pain and pain-age interaction to some cognitive scores were found, over and above the contribution of depression. It is unlikely that the study is compromised by the presence of undiagnosed depression, since an increased prevalence of depression is known to be associated with chronic pain (Von Korff and Simon, 1996; Raftery et al., 2011). Finally, it is not known whether this series of tasks has been used in combination previously, and it is possible that the order in which the tests were administered may have affected task performance. Motivation may affect performance of attentional tasks in chronic pain (Keogh et al., 2013) and was not measured here, but no obvious decrease in participant effort or motivation was observed during the assessment.

# CONCLUSIONS

This study provides further support for the theory that pain affects cognition and that the relationship is influenced by age. The cognitive outcomes affected were mainly within the domains of memory and attention, with IQ and psychomotor speed also affected. Further research is required to determine whether the present set of outcomes represents a specific signature of cognitive performance in neuropathic and radicular pain.

# ETHICS STATEMENT

This study was carried out in accordance with the recommendations of The Research Ethics Committees of Galway University Hospital and the National University of Ireland Galway with written informed consent from all subjects. All subjects gave written informed consent in accordance with the Declaration of Helsinki. The protocol was approved by The Research Ethics Committees of Galway University Hospital and the National University of Ireland Galway.

# AUTHOR CONTRIBUTIONS

All authors listed have made substantial, direct and intellectual contribution to the work, and approved it for publication.

# ACKNOWLEDGMENTS

This research was conducted with the financial support of the Irish Higher Education Authority under the Programme for Research in Third Level Institutions, Cycle 4. The technical and administrative assistance of Ms. Karen Walsh, Dr. Daniel Kerr, and Dr. Kate McDonnell-Dowling is gratefully acknowledged.

# SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fnbeh.2017.00100/full#supplementary-material


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Moriarty, Ruane, O'Gorman, Maharaj, Mitchell, Sarma, Finn and McGuire. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# The cooking task: making a meal of executive functions

#### **T. A. Doherty<sup>1</sup> , L. A. Barker <sup>1</sup>\*, R. Denniss <sup>1</sup> , A. Jalil <sup>2</sup> and M. D. Beer <sup>2</sup>**

<sup>1</sup> Brain Behaviour and Cognition Group, Department of Psychology, Sociology and Politics, Sheffield Hallam University, Sheffield, UK <sup>2</sup> Communication and Computing Research Centre (CCRC), Sheffield Hallam University, Sheffield, UK

#### **Edited by:**

Oliver T. Wolf, Ruhr University Bochum, Germany

#### **Reviewed by:**

Johanna Maria Kissler, University of Bielefeld, Germany Matthias Brand, University Duisburg-Essen, Germany

#### **\*Correspondence:**

L. A. Barker, Brain Behaviour and Cognition Group, Department of Psychology, Sociology and Politics, Sheffield Hallam University, Heart of Campus Building, Collegiate Crescent, Sheffield S10 2BP, UK e-mail: l.barker@shu.ac.uk

Current standardized neuropsychological tests may fail to accurately capture real-world executive deficits. We developed a computer-based Cooking Task (CT) assessment of executive functions and trialed the measure with a normative group before use with a head-injured population. Forty-six participants completed the computerized CT and subtests from standardized neuropsychological tasks, including the Tower and Sorting Tests of executive function from the Delis-Kaplan Executive Function System (D-KEFS) and the Cambridge prospective memory test (CAMPROMPT), in order to examine whether standardized executive function tasks predicted performance on measurement indices from the CT. Findings showed that verbal comprehension, rule detection and prospective memory contributed to measures of prospective planning accuracy and strategy implementation of the CT. Results also showed that functions necessary for cooking efficacy differ as an effect of task demands (difficulty levels). Performance on rule detection, strategy implementation and flexible thinking executive function measures contributed to accuracy on the CT. These findings raise questions about the functions captured by present standardized tasks, particularly at varying levels of difficulty and during dual-task performance. Our preliminary findings also indicate that CT measures can effectively distinguish between executive function and Full Scale IQ abilities. Results of the present study indicate that the CT shows promise as an ecologically valid measure of executive function for future use with a head-injured population and indexes selective executive functions captured by standardized tests.

**Keywords: executive function, head injury, ecological validity, cooking task, neuropsychological assessment**

# **INTRODUCTION**

Executive functions are higher-order cognitive processes associated with frontal brain networks essential for goal-directed behavior and include planning, temporal sequencing, and goal-attainment functions (Shallice and Burgess, 1991; Miyake et al., 2000; Royall et al., 2002; Barker et al., 2010; Morton and Barker, 2010). Individuals with frontal pathology often show diminished planning, self-correction, goal attainment and decision-making abilities thought to be important for "real world" activities of daily living (ADLs; Grafman et al., 1993; Godbout and Doyon, 1995; Godbout et al., 2005; Burgess et al., 2006). Consequently, executive function deficits may result in difficulty performing everyday tasks including shopping (Shallice et al., 1989; Shallice and Burgess, 1991), cooking a meal (Godbout et al., 2005), and simple tasks such as teeth brushing (Schwartz et al., 1998). However, research suggests that current executive function tasks have limited ability to predict ADLs (Eslinger and Damasio, 1985; Burgess et al., 2006; Chan et al., 2008). Similarly, there are several reported cases with frontal pathology and normal scores on executive function tests, but diminished capacity to engage in ADLs, suggesting that standard tests do not reliably capture "real world" problems (Shallice and Burgess, 1991; Chevignard et al., 2000; Andrés and Van der Linden, 2002; Barker et al., 2004).

The act of cooking a meal requires several executive functions, including the capacity to multitask, plan, use prospective memory, and maintain and complete both sub-goals and overall goals within a strict timeframe (Craik and Bialystok, 2006). Although there is limited research, previous findings suggest that cooking tasks (CT) may be more sensitive to patient deficits than traditional neuropsychological measures (Chevignard et al., 2000, 2008; Fortin et al., 2003; Craik and Bialystok, 2006; Tanguay et al., 2014). Fortin et al. (2003) found no difference between a head-injured group and controls on standardized assessment, although the patient group showed diminished ability to cook a meal. The authors concluded that impaired planning and prospective memory functions contributed to diminished ability to cook a meal in the patient group and that these deficits were not captured by standardized tests. Chevignard et al. (2008) compared performance of brain-injured participants and controls on a semi-structured CT conducted in the occupational therapy kitchen and on standardized measures of executive function. Patients made numerous errors, including context neglect, purposeless action and environmental adherence, indicating abnormal responses to contextual and environmental cues. Cooking performance variables, including number of errors, cooking duration, goal achievement, and dangerous behaviors, were all predicted by the Six Elements Task, a standardized version of the Multiple Errands Test (Shallice and Burgess, 1991; Wilson et al., 1997), indicating that their CT and an ecologically derived executive function measure indexed similar functions, in contrast to the findings of Fortin et al. (2003). Kerr (unpublished) and Craik and Bialystok (2006) developed a CT to investigate planning ability in an elderly population and found that the task was sensitive to aging effects. However, task indices only weakly correlated with scores on standardized measures. They concluded that CT are potentially useful laboratory-based methods of assessing planning that correspond well to real-world ADLs.

Previous findings indicate that cooking can provide a sensitive and reliable measure of executive function ability in a "real-world" context. However, "real" CT require elaborate setup, are time consuming and require ongoing monitoring of the individual's progress that is not easily standardized for later follow-up or across-group comparisons. Hence, when developing an ecologically valid task for clinical assessment and guiding rehabilitation programs, a compromise must be made between unrealistic conditions of lab-based assessment and "real-world, real-time" tasks that are time costly and difficult to replicate. The core components of the real-life task should be captured by the ecologically valid version and be sufficiently standardized that performance can be compared across time points at follow up and across neuropathological groups. Additionally, mixed findings of previous research render it difficult to establish whether "real-world" cooking ability corresponds to executive functions indexed by standardized clinical measures.

With this aim in mind, the current study employed a computer-based simulation of cooking a meal based on the CT developed by Kerr (unpublished) and Craik and Bialystok (2006). The present CT shares some similarities with the original, including a comparative user interface and a secondary distracter task of table setting, but also incorporates numerous modifications. In the current task, the ability to pause an item whilst it was cooking was seen as a necessity; in real-world settings individuals can stop items cooking if they believe they have initiated cooking at the wrong time. This mid-plan adjustment seemed necessary to document as it increased the sensitivity of the measure beyond whether the end goal was completed or not. Additionally, the original task by Craik and Bialystok (2006) did not vary the number of items to cook, only the number of screens on which these items were presented. In the interests of maintaining ecological validity, this screen-switching was dropped in favor of different difficulty levels defined by the number of items that required cooking within a set time frame and by whether or not setting the table was necessary. The current task also provided more detailed measures, which were calculated by the program itself. The present study compared indices of our newly developed computerized CT with standardized neuropsychological tasks thought to relate to cooking a meal, including measures of planning, prospective memory, and temporal sequencing, in a normal population.

# **METHOD**

# **PARTICIPANTS**

**Table 1** shows the broad age range (early adulthood through to older adults) and the variability in Full Scale IQ, ranging from Low Average to Superior, of the forty-six participants, which is indicative of the diverse nature of the normative sample in the present study.

**Table 1 | Demographic data of participants (n = 46)**.

All participants gave their informed consent and the faculty research ethics board approved the research. We sampled participants from a broad demography to test the sensitivity of the task within a diverse sample. Participants completed three standardized executive function tests, a measure of IQ and the computerized CT in a laboratory setting. One of the main reasons for the present study was to test the computational viability of our new CT: specifically, whether a new shortened version accurately indexed executive functions measured by selected tasks from our battery of tests, and whether these functions were sensitively captured by the task with a small group of non-neuropathological controls, before trialing the task with a brain-injured cohort. Tests were administered in one session in counterbalanced order with participant-determined rest breaks. We selected executive function tests thought to contribute to real-world cooking ability (Chevignard et al., 2008), or previously shown to be associated with ability on our earlier version of the CT (McFarquhar and Barker, 2012).

# **THE COOKING TASK**

The present task shares some similarities with an earlier task developed by Craik and Bialystok (2006), including a comparative user interface and a secondary distracter task of table setting. In real-world settings individuals can stop items cooking if they believe they have initiated cooking at the wrong time. This mid-plan adjustment was important to measure because it increased task sensitivity and provided an index of prospective plan accuracy. Various adaptations were made to the original version of the CT in the present study to account for data collection with a non-neuropathological group and to shorten testing time, which was originally 3 h in duration (McFarquhar and Barker, 2012). The computational design of the CT was a lengthy process because we wanted to generate a measure that was as similar as possible to "real-world" behavior whilst maintaining clear measurement indices and a relatively interactive user interface. The development of the CT program will not be discussed further here except in relation to the measurement variables generated by the task, participant instructions and the appearance of the task.

The CT was programmed using MIT App Inventor version 1.34 (M.I.T., 2014) and built for devices running the Android operating system. The task was administered on a 16 GB, 1.5 GHz quad-core Asus Google Nexus 7 tablet computer running Android OS 4.4 (KitKat). At start-up the task displayed a welcome screen and a keypress button for the "Instructions" screen. Participants were required to read the instructions carefully before returning to the welcome screen and proceeding to the first level of the task. The task comprised four levels (two tasks per level) and participants had to successfully complete each level before moving on to the next level. A task fail occurred if some items were not cooked within the given time frame, or if any of the items were left to "go cold" (left cooked for 30 s) whilst other items were cooking. Participants were permitted a second attempt if they failed the first task on a level; only the food items changed, with no change in the task parameters from the first task to the second task at the same level. The first screen of each level of the CT informed participants of the time available for task completion. The next screen was the planning screen, loaded by pressing the "start planning" button, which presented information on how many items to cook at that level and a keypress button to re-load the Instruction screen if needed. The planning screen presented the relevant cooking times for each item as well as a brief reminder of the relevant rules for successfully completing the task.
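The two fail conditions described above can be expressed as a simple check. The following is a minimal sketch only, not the authors' App Inventor implementation: the `ItemLog` record, its field names, the example food names and the choice to treat exactly 30 s as "gone cold" are all assumptions made for illustration.

```python
from dataclasses import dataclass

# From the text: an item left cooked for 30 s counts as having "gone cold".
COLD_LIMIT_S = 30.0


@dataclass
class ItemLog:
    """Hypothetical per-item record; the field names are illustrative only."""
    name: str
    cooked_in_time: bool  # True if the item finished cooking within the time limit
    cold_time_s: float    # seconds the item sat cooked while other items cooked


def task_failed(items: list[ItemLog]) -> bool:
    """Return True if the task counts as a fail under the two stated rules."""
    any_uncooked = any(not item.cooked_in_time for item in items)
    # Assumption: reaching exactly 30 s already counts as "gone cold".
    any_cold = any(item.cold_time_s >= COLD_LIMIT_S for item in items)
    return any_uncooked or any_cold
```

For example, a level in which every item is cooked in time and none sits cooked for 30 s returns `False`, whereas a single item left cooked for 31 s returns `True`.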

Food items were represented by an image with *"cook"* and *"pause"* buttons below the image. Once pressed, the *"cook"* button changed to *"stop"*, which remained inactive until the item was fully cooked because an item could only be stopped once it had finished cooking; the *"pause"* button changed to *"resume"* once pressed. Whilst the item cooked, a timer bar (which reduced at a rate proportional to the length of time the item cooked for) was green, with text stating "*\*item\* is cooking*". When the item cooked over the allotted time, the timer bar disappeared and red text stated "*\*item\* is burning*". Finally, when cooking of the item was stopped, a blue text message stated "*\*item\* is going cold*". Sound files for each item loaded during cooking time (recorded from real cooking of these items) in order to simulate a real-world analog auditory prompt for the participant and improve the ecological validity of the task (McGuire, 2014). The cooking time for each item was presented at the bottom of the screen along with information outlining the basic parameters of the task and a "real time" clock present on each task screen (see **Figure 1**).

The data recorded during the task were cooking time for each item, burning time for each item (time left cooking over the suggested time), pause time for each item, the time each item was left cold and the remaining amount of allotted time for each task.

Level one of the CT required participants to cook two items within 2 min (Easy level), level two consisted of four items to cook within 4 min of cooking time (Moderate level), level three consisted of six items to be cooked within 5 min (Difficult level), and level four required participants to cook six items within 5 min and included a separate distracter task in which the participant had to lay a virtual table concurrently with the CT (Dual-task level). During the Dual-task level, the screen additionally included an image of a dining table and crockery items. To lay the table, participants were required to drag and drop items (a fork, knife, spoon and plate) to each of four empty table settings. Participants could switch between this task and the primary CT at any point, but had to complete both within the overall 5 min duration (see **Figure 2**). Performance on the secondary task was scored on a pass-fail basis. The time to complete the CT ranged between 17 and 35 min, but the task mostly took under 20 min to complete in this cohort of healthy controls.

Upon successful completion of each task a "congratulations" screen appeared with a keypress button that loaded the next level of the task. However, if the task was failed the program loaded a screen detailing why the task was failed and a reminder of rules transgressed during task completion. This screen also displayed a keypress button that loaded the second task of that level, or if a task on that level was failed twice a goodbye screen was loaded and the CT was completed.

#### **OUTCOME MEASURES GENERATED BY THE COOKING TASK**

During performance on the task the computer program recorded times for each level and burn time, pause time, cold time and remaining time for all items as well as an overall accuracy ratio. These raw data were then transformed into three specific variables based on scores originally used by Craik and Bialystok (2006) and used in our earlier computerized CT: Range, Discrepancy and Adjustment scores (McFarquhar and Barker, 2012).

#### **RANGE SCORE: MEASURE OF TIME-BASED STRATEGY USE**

**FIGURE 2 | Table setting with the bank of items (plates; knives; forks and spoons).**

The Range score calculated the difference between the time the first item was stopped and the time the last item was stopped. This provided a measure of prospective time-based strategy implementation, and the value should therefore be close to zero. Much like a real-world task, it is impossible to stop all items at the same time; however, this score accurately measures whether a participant has forgotten a specific item. It is therefore calculated on an item-by-item basis and the highest scoring item is taken as the range score for that level.
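As a worked illustration of the Range score, the sketch below takes each item's stop time and returns the gap between the earliest and latest stop. This is one reading of the description above, not the authors' code: measuring every item against the first stop and keeping the largest gap gives the same number as the latest stop minus the earliest stop. The function name and the use of seconds from task start are assumptions.

```python
def range_score(stop_times_s: list[float]) -> float:
    """Gap in seconds between the first and last item being stopped.

    Assumption: stop times are seconds from task start, one per item.
    Each item's gap from the earliest stop is computed and the largest
    gap is kept, which equals max(stop) - min(stop).
    """
    if not stop_times_s:
        raise ValueError("at least one stop time is required")
    first_stop = min(stop_times_s)
    per_item_gaps = [t - first_stop for t in stop_times_s]
    return max(per_item_gaps)


# Illustrative values only: three items stopped within 3.5 s of each other.
print(range_score([118.0, 120.0, 121.5]))  # 3.5
```

A forgotten item that is stopped much later than the others dominates the score, which matches the stated purpose of the measure.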

### **DISCREPANCY SCORE: MEASURE OF PROSPECTIVE PLAN IMPLEMENTATION**

The Discrepancy score calculated the difference between the actual amount of time an item was cooked for, including the burn time, and the prescribed cooking time. An average of all item scores was calculated to give a score for each level. This value provided a measure of prospective plan accuracy: the ability to plan, implement and remember to start and stop all items at the correct time. This value should therefore be close to zero. Previous studies indicated this provides a measure of prospective memory, as participants need to remember to start and stop the items at the correct time.

# **THE ADJUSTMENT SCORE: MEASURE OF PLAN ACCURACY**

The Adjustment score calculated the amount of time all the items were paused for. This provided a measure of plan accuracy as an accurate and effective plan should require no mid-task adjustments. This value should also therefore approach zero. Again an average of all item scores was calculated to provide a score for each level.
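The Discrepancy and Adjustment scores are both per-level averages over items, which the following sketch makes concrete. It is a hedged illustration rather than the task's actual code: the function names are hypothetical, and the use of the absolute difference for Discrepancy (so that under- and over-cooking do not cancel out) is an assumption, since the text does not state whether the difference is signed.

```python
from statistics import fmean


def discrepancy_score(actual_s: list[float], prescribed_s: list[float]) -> float:
    """Mean difference between actual and prescribed cooking time per item.

    `actual_s` is the time each item was cooked for, including burn time.
    Assumption: the absolute difference is used for each item.
    """
    if not actual_s or len(actual_s) != len(prescribed_s):
        raise ValueError("need one prescribed time per cooked item")
    return fmean([abs(a - p) for a, p in zip(actual_s, prescribed_s)])


def adjustment_score(pause_s: list[float]) -> float:
    """Mean time in seconds for which items were paused on a level."""
    if not pause_s:
        raise ValueError("at least one item is required")
    return fmean(pause_s)


# Illustrative values only: two items, each 2 s away from its prescribed time.
print(discrepancy_score([62.0, 118.0], [60.0, 120.0]))  # 2.0
print(adjustment_score([0.0, 4.0, 2.0]))                # 2.0
```

Both values approach zero for a participant who follows an accurate plan and makes no mid-task adjustments.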

The CT program also generated two further variables in order to measure a participant's comprehensive performance, both per level and as a task overall, on completed levels of the CT.

### **THE ACHIEVEMENT SCORE**

A fourth measure, which was designed to be an achievement measure per level, was taken from the remaining time left from each level. This was calculated by dividing the number of items on the level by the time remaining for that level.
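The Achievement score as described is a single division, shown below as a small sketch. The function name and the handling of a level that ends with no time remaining (returning `None`) are assumptions; the text does not say how the program treated that case.

```python
from typing import Optional


def achievement_score(n_items: int, time_remaining_s: float) -> Optional[float]:
    """Number of items on the level divided by the time remaining.

    Assumption: returns None when no time remains, because the division
    is undefined and the original handling is not described.
    """
    if time_remaining_s <= 0:
        return None
    return n_items / time_remaining_s


# Illustrative values only: four items with 20 s left on the clock.
print(achievement_score(4, 20.0))  # 0.2
```

Under this definition, more time remaining yields a smaller value for the same number of items.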

# **THE ACCURACY RATIO**

The final measure was an overall measure of accuracy, termed the accuracy ratio and was calculated by measuring the number of tasks attempted by a participant. We computed this measure in order to give an accuracy measurement of the number of levels/failed trials undertaken by a participant over the course of the entire task.

# **STANDARDIZED IQ AND EXECUTIVE FUNCTION MEASURES**

# **WASI (Wechsler abbreviated scale of intelligence—Wechsler, 1999)**

The WASI (Wechsler, 1999) was used to provide a measure of overall Full Scale and Verbal and Performance IQ scores and account for any potential individual differences that might affect scores on executive function and CT tests. We hypothesized that IQ subtests would predict strategy implementation (Range score) and prospective plan implementation (Discrepancy scores) of the CT, due to our previous findings from work with a lengthier and more time consuming version of the CT (McFarquhar and Barker, 2012). We also anticipated that Full Scale IQ would predict overall performance across all levels of the CT. Scores from the WASI have been found to produce reliability coefficients between *r* = 0.97 and *r* = 0.98.

### **D-KEFS—tower test (Delis-Kaplan Executive Function System—Delis et al., 2001)**

We selected measures from the Delis-Kaplan Executive Function System (D-KEFS) executive function battery for the present study as these measures are widely used in clinical and academic work with neuropathological groups (Baldo et al., 2003; Martin et al., 2003) and have good levels of reliability and sensitivity. The Tower Test indexes planning accuracy and rule detection ability (Crawford et al., 2011) and also generates several composite scores that we expected to contribute to performance on the CT. Tower Task time per move ratio provides a measure of the average time an examinee takes to make each move throughout the task. According to the manual normative samples show consistency in time spent "pausing and studying moves". We expected this variable to predict the CT measure of planning accuracy (Adjustment score). Tower Task rule violation per item ratio represents the number of rule breaks made over the course of all items. Thus this score provides a measure of rule detection ability; again we expected scores on this measure to predict CT planning accuracy scores (Adjustment scores), strategy implementation scores (Range scores) and prospective plan implementation score (Discrepancy scores) on the CT. The reliability co-efficient for total achievement score on the Tower Test is *r* = 0.44.

### **D-KEFS—sorting test (Delis et al., 2001)**

The Sorting Test measures flexible thinking ability, concept formation (verbal and non-verbal) and strategy initiation. These functions are thought to play a role in the capacity to cook a meal (Chevignard et al., 2008). This test has been shown to require strategy initiation and capacity to inhibit pre-potent responses and is sensitive to performance differences between neuropathological groups and controls (Parmenter et al., 2007; Heled et al., 2012). The Sorting Test also generates several composite scores that we expected to contribute to performance on the CT; composite scaled score provides a measure of accuracy in sorting rules, or concepts across free sort and sort recognition conditions, thus it combines performance scores across both conditions of the Sorting Test. According to the manual high scores represent effective use of high-level executive function concept-formation/strategy generation rules. We anticipated that the Sorting Test composite score would predict strategy implementation (Range score) of the CT. Sorting Test contrast scaled score provides a calculation of the difference between an individual's abilities to develop a sorting concept and describe that sorting concept providing an index of concept formation flexibility. We hypothesized that ability on these measures would predict planning accuracy (Adjustment scores) on the CT. Scores from the D-KEFS Sorting Test have been found to produce a reliability coefficient of *r* = 0.46 depending upon subtests used.

### **CAMPROMPT (Cambridge prospective memory test—Wilson et al., 2005)**

We selected a standardized prospective memory task because previous research found a relationship between performance on an earlier version of the present CT and prospective memory scores on a nonstandardized task (McFarquhar and Barker, 2012). In the present study we wanted to investigate whether a relationship between CT measures and prospective memory scores remained when the standardized Cambridge prospective memory test (CAMPROMPT) task used in clinical settings was used. Scores from the CAMPROMPT have been found to produce a reliability coefficient of *r* = 0.64. We expected CAMPROMPT subtests to predict prospective strategy implementation (Range scores) on the CT.

# **RESULTS**

All raw data were standardized using Z transformation to control for outliers and compare scores across neuropsychological and CT variables. Any outliers that exceeded 3.29 after transformation were excluded in line with recommendations for treatment of outliers in transformed datasets (Ratcliffe, 1993; Field, 2009). This included one case across each level of the CT mid-plan Adjustment variable. We also computed Pearson's correlation analyses for our selected variables for each regression analysis to thoroughly explore the data and check for multicollinearity.
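The standardization and exclusion step can be sketched as follows. This is an illustration under stated assumptions, not the authors' analysis script: it uses the sample standard deviation (n − 1 denominator), applies the 3.29 cut-off to the absolute z value, and the function names and example data are made up.

```python
from statistics import mean, stdev

Z_CUTOFF = 3.29  # cut-off stated in the text


def z_scores(values: list[float]) -> list[float]:
    """Standardize values to mean 0 and (sample) standard deviation 1."""
    m = mean(values)
    s = stdev(values)  # assumption: sample SD with n - 1 in the denominator
    if s == 0:
        raise ValueError("all values are identical; z-scores are undefined")
    return [(v - m) / s for v in values]


def exclude_outliers(values: list[float], cutoff: float = Z_CUTOFF) -> list[float]:
    """Drop raw values whose absolute z-score exceeds the cut-off."""
    zs = z_scores(values)
    return [v for v, z in zip(values, zs) if abs(z) <= cutoff]


# Made-up data: nineteen typical scores and one extreme score.
scores = [10.0] * 19 + [100.0]
print(len(exclude_outliers(scores)))  # 19
```

In the made-up example, the extreme score has a z value of about 4.25 and is removed, leaving the nineteen typical scores.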



**Table 2** presents descriptive data for the neuropsychological tests used in the present study.

**Table 3** shows the CT variables.

Results of One-Way ANOVA for Range scores across different levels of the CT showed that performance was significantly different for this measure of time-based strategy implementation, *F*(3,183) = 21.9, *p* = 0.00. Similarly, performance was significantly different for Discrepancy scores (measure of prospective plan implementation) across levels of task difficulty, *F*(3,183) = 15.2, *p* = 0.00. Scores were also significantly different for the Adjustment variable (measure of plan accuracy) across different task levels, *F*(3,178) = 4.74, *p* = 0.00. **Table 4** shows results of Tukey HSD *post hoc* analyses for comparison between each difficulty level for each CT measure. For the Range variable (measure of time-based strategy implementation), performance differed between all pairs of levels except levels 3 and 4. For the Discrepancy score (measure of prospective plan implementation), levels 1 and 3, 1 and 4, 2 and 3, and 2 and 4 differed, whereas levels 1 and 2 and levels 3 and 4 did not. For the Adjustment score (measure of plan accuracy), levels 1 and 4 and levels 3 and 4 differed (see **Table 4**).
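For readers unfamiliar with the statistic, the sketch below shows how a one-way ANOVA *F* value is formed from between-level and within-level variability. It is a generic textbook computation on made-up numbers, not a reproduction of the reported analysis: it treats the levels as independent groups, which may not match how the authors ran their models, and it does not compute *p* values or the Tukey HSD comparisons.

```python
from statistics import fmean


def one_way_anova_f(groups: list[list[float]]) -> tuple[float, int, int]:
    """Return (F, df_between, df_within) for a one-way ANOVA.

    Assumption: each inner list holds the scores for one level, and the
    levels are treated as independent groups.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = fmean([x for g in groups for x in g])
    group_means = [fmean(g) for g in groups]

    # Variability of the level means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variability of individual scores around their own level mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    df_between = k - 1
    df_within = n_total - k
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Made-up scores for three levels with three observations each.
f_value, df_b, df_w = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(round(f_value, 2), df_b, df_w)  # 13.0 2 6
```

A larger *F* indicates that the level means differ more than would be expected from the spread of scores within each level.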

We developed predictor models on the basis of functions purportedly tapped by neuropsychological and corresponding CT variables as outlined previously. All reported significance levels are one-tailed due to our *a priori* hypotheses. We analyzed each CT level (levels 1–4) separately to establish whether the pattern of relationships between variables differed as an effect of level difficulty.

# **RANGE SCORE: A MEASURE OF TIME-BASED STRATEGY IMPLEMENTATION**

We entered Event Based Scores from the CAMPROMPT (episodic prospective memory; *r* = 0.14, *p* = 0.35), Perceptual Reasoning Index of the WASI (performance IQ; *r* = −0.28, *p* = 0.053), Tower Test Rule Violation Per Item Ratio (rule detection; *r* = −0.05, *p* = 0.69) and Sorting Test Confirmed Correct Sorts (concept formation; *r* = −0.25, *p* = 0.08). We expected performance on these measures to contribute to effective time-based strategy implementation. Results of Pearson's correlation showed a weak negative relationship between performance IQ, concept formation and Range 1 scores. For the easy level (Range 1), the model was not significant, *F*(5,45) = 1.13, *p* > 0.05, and the only marginally significant predictor was performance IQ of the WASI (β = −0.25, *p* = 0.06). Results of Pearson's correlations for the moderate difficulty level (Range 2) showed only a moderate relationship between episodic prospective memory (*r* = −0.30, *p* = 0.40) and Range 2 scores (performance IQ; *r* = 0.03, *p* = 0.80, rule detection; *r* = 0.04, *p* = 0.80 and concept formation; *r* = 0.02, *p* = 0.90). The model was not significant, *F*(5,45) = 0.90, *p* > 0.1. However, episodic prospective memory was a significant predictor of Range 2 scores (β = −0.31, *p* = 0.02). For the difficult level, results of Pearson's correlations showed a weak negative relationship between rule detection (*r* = −0.28, *p* = 0.052) and Range 3 scores (episodic prospective memory; *r* = 0.05, *p* = 0.72, performance IQ; *r* = −0.02, *p* = 0.91 and concept formation; *r* = 0.06, *p* = 0.68). Again the model was not significant, *F*(5,45) = 1.31, *p* > 0.05, and rule detection was the only significant predictor of Range 3 scores (β = −2.31, *p* = 0.01). Finally, at the dual-task level, results of Pearson's correlations showed only a weak negative correlation (*r* = −0.26, *p* = 0.07) between concept formation and Range 4 scores (rule detection; *r* = −0.05, *p* = 0.73, episodic prospective memory; *r* = −0.11, *p* = 0.44, and performance IQ; *r* = −0.05, *p* = 0.73). The model was not significant, *F*(5,45) = 1.20, *p* > 0.05, although concept formation (β = −2.13, *p* = 0.01) and episodic prospective memory (β = −1.43, *p* = 0.05) predicted Range scores at the dual-task level.

**Table 4 | Results of Tukey HSD post hoc analyses across all difficulty levels (1–4), for range, discrepancy and adjustment measures of the cooking task**.

#### **DISCREPANCY SCORES: PROSPECTIVE PLAN IMPLEMENTATION**

Discrepancy scores on the CT represent ability to implement and follow a plan for accurate start and stop times for all CT stimulus items for each difficulty level. We anticipated that verbal working memory, and prospective memory indexed by the CAMPROMPT, might contribute to plan generation and implementation. We therefore entered the Vocabulary Comprehension Index (VCI; Verbal IQ) of the WASI and Event Based scores of the CAMPROMPT as predictors in the model with discrepancy scores as the criterion variable for each difficulty level. Results of Pearson's correlations were not significant for Verbal IQ (*r* = −0.13, *p* = 0.40) and prospective memory (*r* = −0.17, *p* = 0.26) and Discrepancy 1 scores. The model was not significant at the easy level (Discrepancy 1), *F*(2,45) = 1.01, *p* > 0.05, and prospective memory scores marginally predicted discrepancy scores (β = −1.23, *p* = 0.07). At the medium difficulty level, results of Pearson's correlations showed a weak positive relationship (*r* = 0.33, *p* = 0.02) between prospective memory and Discrepancy 2 scores, but not for Verbal IQ and Discrepancy 2 scores (*r* = 0.02, *p* = 0.91). The model was significant at this level, *F*(2,45) = 4.10, *p* < 0.01, and prospective memory (β = 0.34, *p* < 0.01) and Verbal IQ scores (β = 0.22, *p* = 0.05) predicted Discrepancy 2 scores. Results of Pearson's correlations at the difficult level showed a weak negative correlation between Verbal IQ and Discrepancy 3 scores (*r* = −0.29, *p* = 0.054) and a weak relationship between prospective memory and Discrepancy 3 scores (*r* = 0.21, *p* = 0.16). At this level the model was not significant, *F*(2,45) = 1.96, *p* = 0.06, and prospective memory (β = 0.21, *p* = 0.06) and Verbal IQ (β = −0.19, *p* > 0.05) scores only marginally predicted the criterion variable. At the dual-task level, results of Pearson's correlations showed a weak negative relationship between Verbal IQ and Discrepancy 4 scores (*r* = −0.22, *p* = 0.14), not shown for prospective memory and Discrepancy 4 scores (*r* = 0.04, *p* = 0.78). The regression model was significant at the dual-task level, *F*(2,45) = 2.63, *p* < 0.05, and Verbal IQ score was the unique predictor of Discrepancy score (β = −0.32, *p* < 0.05) at this level.

#### **ADJUSTMENT SCORES: A MEASURE OF PLAN ACCURACY**

Adjustment scores of the CT arguably measure the ability to generate an accurate plan. We hypothesized that performance on the Tower and Sorting tests would predict Adjustment scores because these indices capture components of planning likely to contribute to prospective plan generation for synchronous cooking of CT stimulus (food) items. We entered Tower Test Time Per Move Ratio (time-based plan accuracy and implementation), Tower Test Rule Violations (rule detection) and Sorting Test Contrast Score (flexible thinking). Results of Pearson's correlations for time based plan accuracy (*r* = 0.15, *p* = 0.92), rule detection (*r* = −0.30, *p* = 0.05) and flexible thinking (*r* = 0.16, *p* = 0.28) showed only a weak relationship between rule detection and adjustment score at the easy level. The regression model was not significant for the easy level *F*(3,44) = 1.87, *p* > 0.05 and rule detection score was the only significant predictor of Adjustment 1 scores (β = 0.30, *p* < 0.05). At the moderate level of the task, results of Pearson's correlation showed a very weak relationship between plan accuracy (*r* = 0.04, *p* = 0.78), rule detection (*r* = −0.01, *p* = 0.99) and flexible thinking (*r* = 0.08, *p* = 0.61) and Adjustment 2 scores. The regression model was not significant for the moderate level of the task *F*(3,44) = 0.15, *p* > 0.05 and none of the variables predicted performance on Adjustment 2 scores. At the difficult level, results of Pearson's correlation showed a weak relationship between flexible thinking (*r* = 0.32, *p* = 0.03) and Adjustment 3 scores, but plan accuracy (*r* = 0.21, *p* = 0.17) and rule detection scores did not significantly correlate with Adjustment 3 scores. The regression model was not significant *F*(3,44) = 1.59, *p* > 0.05 although flexible thinking was a significant predictor (β = −0.29, *p* < 0.05) of Adjustment 3 scores.

Finally, at the dual-task level there was a significant moderate negative relationship between rule detection (*r* = −0.40, *p* = 0.01) and Adjustment 4 scores, not present for plan accuracy (*r* = −0.16, *p* = 0.29) or flexible thinking (*r* = 0.17, *p* = 0.25) and Adjustment scores. The regression model was significant at this level, *F*(4,45) = 2.90, *p* > 0.01, and rule detection score was the only significant predictor of plan accuracy (Adjustment 4) at this level.

#### **RESIDUAL TIME: A MEASURE OF TASK ACCURACY**

We entered Sorting Test Recognition Description Score (verbal concept formation), Sorting Test Composite Score (strategy initiation) and Tower Test Rule Violation Per Item (rule detection) as predictors.

Results of Pearson's correlations for verbal concept formation (*r* = 0.04, *p* = 0.79), strategy initiation (*r* = 0.20, *p* = 0.17) and rule detection (*r* = 0.15, *p* = 0.30) showed a moderate relationship between rule detection and residual time at the easy level. The regression model was significant at this level *F*(3,45) = 3.10, *p* < 0.05, and rule detection was the only significant predictor (β = 0.38, *p* < 0.01) of the criterion variable. At the moderate task difficulty level, results of Pearson's correlations for verbal concept formation (*r* = −0.07, *p* = 0.64), strategy initiation (*r* = 0.05, *p* = 0.74) and rule detection (*r* = 0.46, *p* = 0.00) again showed a moderate relationship between rule detection and residual time. The model was also significant at the moderate level *F*(3,45) = 6.10, *p* < 0.01, and verbal concept formation (β = −0.80, *p* < 0.001), strategy initiation (β = 0.47, *p* < 0.001) and rule detection (β = 0.64, *p* < 0.05) were significant predictors of residual time at this level. Results of Pearson's correlations for verbal concept formation (*r* = 0.18, *p* = 0.22), strategy initiation (*r* = 0.23, *p* = 0.12) and rule detection (*r* = 0.22, *p* = 0.14) showed only a weak relationship between EF variables and residual time left at the difficult level. The model was not significant at the difficult level of the CT *F*(3,45) = 1.35, *p* > 0.05 and strategy initiation was the only significant predictor (β = 0.40, *p* = 0.05) of the criterion at this level. At the dual-task level, results of Pearson's correlations for verbal concept formation (*r* = 0.04, *p* = 0.79), strategy initiation (*r* = −0.05, *p* = 0.75) and rule detection (*r* = 0.13, *p* = 0.39) showed only a very weak relationship between EF variables and residual time left. 
Similarly, at the dual-task level the model was not significant, *F*(3,45) = 1.41, *p* > 0.05, although strategy initiation (β = −0.77, *p* < 0.05) and verbal concept formation (β = 0.71, *p* < 0.05) scores were significant predictors of the criterion variable at this level.

#### **ACCURACY RATIO: OVERALL TASK PERFORMANCE**

We hypothesized that overall IQ might predict overall task accuracy and completion rates and entered Full Scale IQ scores of the WASI into the regression model. The model was significant *F*(1,45) = 9.11, *p* > 0.001, β = 0.41, *p* = 0.001 (*r* = 0.41, *p* = 0.01) indicating the important contribution of general intelligence to overall task completion accuracy. In addition, on the basis of correlation data, findings showed that Discrepancy (prospective plan implementation) and Overall task accuracy ratio scores constitute CT variables that show the most consistent relationship with EF and IQ variables across difficulty levels, although the conventional caveats should be borne in mind when interpreting correlation data. Except for the easy level, there was a consistent association between prospective memory, Verbal IQ and Discrepancy score although the direction of this relationship, and contribution of predictors to the criterion variable was different as task difficulty increased across levels, arguably suggesting shared processing resource costs across these variables as a consequence of increased task difficulty. Overall accuracy ratio scores showed a moderate relationship with Full Scale IQ on the basis of correlation data.
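The single-predictor model reported above can be cross-checked from the correlation alone. With one predictor, the standardized β equals Pearson's *r* (both are reported as 0.41), and the *F* statistic follows from *r* and the residual degrees of freedom as F = r² / (1 − r²) × df. The sketch below assumes the residual degrees of freedom are 45, read directly from the reported *F*(1,45); the function name is hypothetical.

```python
def f_from_r(r, df_resid):
    """F statistic of a one-predictor regression, from Pearson r and residual df."""
    r2 = r * r
    return r2 / (1.0 - r2) * df_resid


r = 0.41        # reported correlation; equals the standardized beta with one predictor
df_resid = 45   # assumption: residual df taken from the reported F(1,45)
f_value = f_from_r(r, df_resid)
print(round(f_value, 2))
```

The result is about 9.1, close to the reported *F* of 9.11. The small gap is what would be expected if *r* was rounded to two decimals before reporting.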

# **DISCUSSION**

The present study investigated whether a newly developed interactive computerized CT functioned as an ecological measure of executive processes and captured similar functions as current off-the-shelf standardized tests in a normal group before trials with TBI cohorts. The CT had four difficulty levels (easy/moderate/difficult/dual-task) and findings indicated that each level had different processing demands likely due to the additional cognitive load required for difficult and dual-task levels. We expected CT variables to be associated with specific subtests of standardized tasks rather than global or overall scores because we designed CT variables with the intention of tapping into very specific processes likely recruited during real world CT. An expected finding was that executive function subtest measures predicted CT variables; however, an unexpected finding was that the relationship between executive function predictor and CT criterion variables differed as an effect of difficulty level based on regression analyses.

Thus, rather than difficulty level making increased demands on the same processes, findings of regression analyses indicate that the moderate and difficult levels of our task actually recruited different cognitive resources. This finding is a useful cautionary note because it indicates that unless test designers carefully evaluate the processes contributing to varying levels of task difficulty, as we have done here, it might be wrongly assumed that standardized tasks measure the same cognitive processes to a greater or lesser degree, when task demands actually initiate the implementation of different processes as a function of task difficulty.

Several variables predicted the CT strategy implementation measure (Range score) including scores on measures of prospective memory, performance-based IQ and executive function measures of concept formation and rule detection comprising verbal and performance-spatial based processes. The contribution of these variables to the CT variable differed for each difficulty level. Findings suggest that at the easy level strategy implementation on the CT was driven by non-verbal performance-based reasoning, but as task difficulty increased strategy implementation depended more on prospective verbal-based planning and application of performance-based planning strategies. At a broader level these findings challenge current conceptualizations of executive function because rather than overarching "executive" functions governing task-based activity on "real-world" tasks, findings suggest fluidly organized processes incorporating verbal and non-verbal working memory processes and strategy-based executive functions that correspond well to the notion of a fractionated and *malleable* executive function system (Roca et al., 2014), but also indicate a central role of working memory and general intelligence to performance on tasks thought to depend primarily on executive functions (Royall and Palmer, 2014).

Verbal IQ and event-based prospective memory predicted the prospective plan implementation (Discrepancy score) measure of the CT. Prospective memory contributed to performance on the easy level, but as task demands increased verbal IQ was the unique predictor of prospective plan implementation. This finding indicates that at the dual-task level of our task, capacity to draw on effective verbal reasoning is the key component for switching between the two tasks and implementing an effective plan. Whilst switching is typically defined as an executive function, our findings again indicate a key role of verbal IQ to prospective planning on the CT.

For the CT, planning accuracy was quantified as time spent making mid-plan adjustments to items in order to achieve the end goal within the given time frame. Previous research using real-life CT has shown normal individuals to make significantly fewer errors (often zero) compared to those with frontal pathology (Godbout et al., 2005; Chevignard et al., 2008) and due to our non-pathological cohort, some ceiling effects on this measure were found. However, the adjustment scores also showed promise by indexing standardized cognitive measures, with our analyses showing that executive function measures of flexible thinking, accurate planning implementation and rule detection predicted cross-level planning accuracy. Again, the relationship between predictors and the criterion variable differed as an effect of increased task difficulty based on regression results. Flexible thinking was an important mediator of plan accuracy at the easier levels and rule detection capacity predicted plan accuracy at the dual-task level. Overall, findings indicated that at all levels plan accuracy on the CT made similar demands on planning functions indexed by standardized executive function tests.

This version of the CT incorporated modifications from an earlier design to capture residual time (the amount of time remaining from the task's prescribed time limit) and a measure of overall task accuracy. Rule detection and verbal concept formation were found to be significant predictors of residual time score for easier CT levels, although strategy initiation was a significant predictor for three of the four levels including the dual-task level. Again, as with other CT measures, results indicated that at the more difficult levels general verbal-based strategy processes contributed to task efficiency.

Finally, percentage accuracy ratio measured performance ability across levels of the CT based upon number of failures occurring at overall CT task and/or level. Full Scale IQ scores significantly predicted this criterion, which arguably suggests that the CT distinguished between executive function and intelligence contributions to performance by showing selective executive function contributions to certain task components, but a key contribution of IQ to overall task accuracy. The notion that performance on executive function and IQ measures depends upon shared processes is a key debate in the literature (Royall and Palmer, 2014) and our preliminary findings indicate that CT measures can effectively distinguish between executive functions and FSIQ abilities. However, one limitation of our present findings is that regression analyses showed only moderate contribution of standardized neuropsychological predictor variables to CT criterion variables. The strong relationship between overall IQ score and overall accuracy on the CT task arguably suggests that overall intelligence is a crucial factor in task accuracy on this cooking measure. It is likely that both performance and verbal IQ contribute to effective task completion because sequential ordering of start and stop times for cooking items is likely mediated by a verbal plan of execution, with performance IQ contributing to spatial and non-verbal components of cooking ability. This area represents a further line of enquiry with real-world executive function analog tasks, with future research comprehensively distinguishing between non-verbal performance and verbal IQ contributions to successful task completion.

Overall, findings indicated that several standardized IQ, memory and executive function subtests predicted performance on the CT indicating that our real world simulation of an everyday activity reliably captured these functions in a normal cohort. The pattern of relationships between variables differed as a consequence of task difficulty and the use of a secondary task. Of note, in the present study all but one of the participants passed the secondary task (table setting was programed for pass/fail outcome only) suggesting the possibility of an accuracy tradeoff across the primary cooking task activities and the secondary task.

Although at an early stage of development, the relationships found between CT indices and standardized measures hold great promise for the use of the CT as an ecologically valid measure of executive function. An updated version of the task, applicable to a greater number of platforms, is currently under development and we hope to utilize this version in a TBI population to better understand the functions contributing to real world abilities, improve the predictive utility of clinical assessment and inform strategic rehabilitative approaches.

### **ACKNOWLEDGMENTS**

The authors would like to thank participants for their involvement in the study and Martyn McFarquhar for his work on the original version of the CT.

### **REFERENCES**


using execution of a cooking task. *Neuropsychol. Rehabil.* 18, 461–485. doi: 10.1080/09602010701643472


**Conflict of Interest Statement**: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 07 April 2014; accepted: 20 January 2015; published online: 11 February 2015*.

*Citation: Doherty TA, Barker LA, Denniss R, Jalil A and Beer MD (2015) The cooking task: making a meal of executive functions. Front. Behav. Neurosci. 9:22. doi: 10.3389/fnbeh.2015.00022*

*This article was submitted to the journal Frontiers in Behavioral Neuroscience*.

*Copyright © 2015 Doherty, Barker, Denniss, Jalil and Beer. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution and reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms*.

# Cooking breakfast after a brain injury

# *Annick N. Tanguay<sup>1</sup>, Patrick S. R. Davidson<sup>1,2,3</sup>\*, Karla V. Guerrero Nuñez<sup>1</sup> and Mark B. Ferland<sup>1,4,5</sup>*

*<sup>1</sup> School of Psychology, University of Ottawa, Ottawa, ON, Canada*

*<sup>3</sup> Canadian Partnership for Stroke Recovery, Heart and Stroke Foundation of Canada, Ottawa, ON, Canada*

*<sup>5</sup> Ottawa Hospital Research Institute, Ottawa, ON, Canada*

#### *Edited by:*

*Lynne Ann Barker, Sheffield Hallam University, UK*

*Reviewed by:*

*Mamiko Koshiba, Saitama Medical University, Japan Brian McGuire, National University of Ireland, Galway, Ireland*

#### *\*Correspondence:*

*Patrick S. R. Davidson, School of Psychology, University of Ottawa, 136 Jean-Jacques Lussier Priv., Ottawa, ON K1N 6N5, Canada e-mail: patrick.davidson@uottawa.ca*

Acquired brain injury (ABI) often compromises the ability to carry out instrumental activities of daily living such as cooking. ABI patients' difficulties with executive functions and memory result in less independent and efficient meal preparation. Accurately assessing safety and proficiency in cooking is essential for successful community reintegration following ABI, but *in vivo* assessment of cooking by clinicians is time-consuming, costly, and difficult to standardize. Accordingly, we examined the usefulness of a computerized meal preparation task (the Breakfast Task; Craik and Bialystok, 2006) as an indicator of real life meal preparation skills. Twenty-two ABI patients and 22 age-matched controls completed the Breakfast Task. Patients also completed the Rehabilitation Activities of Daily Living Survey (RADLS; Salmon, 2003) and prepared actual meals that were rated by members of the clinical team. As expected, the ABI patients had significant difficulty on all aspects of the Breakfast Task (failing to have all their foods ready at the same time, over- and under-cooking foods, setting fewer places at the table, and so on) relative to controls. Surprisingly, however, patients' Breakfast Task performance was not correlated with their *in vivo* meal preparation. These results indicate caution when endeavoring to replace traditional evaluation methods with computerized tasks for the sake of expediency.

**Keywords: cooking, acquired brain injury, independent activities of daily living, executive functions, simulated/computerized cooking, ecological validity, rehabilitation**

# **INTRODUCTION**

The executive functions are a family of processes that support goal-setting, planning, organizing, monitoring, and the flexible control of cognition and behavior. Although executive dysfunction is one of the most common and clinically significant consequences of brain injury, there remains much controversy on exactly how to assess it (Spooner and Pachana, 2006; Lowenstein and Acevedo, 2010). For decades, the dominant strategy has been to employ a handful of brief, non-natural tasks, for example, the Wisconsin Card Sorting Test (WCST). This approach has many advantages: Tests can be standardized in their administration, scoring, and interpretation; can (or, at least, can strive to) isolate one or more putative executive processes from others (e.g., Stuss et al., 2000; Barceló and Knight, 2002; Specht et al., 2009); and can be based on increasingly sophisticated neurocognitive models, allowing for patient data to be compared against human neuroimaging and animal physiological and lesion findings (e.g., Nyhus and Barceló, 2009). This approach is not without its difficulties, however, including the often surprisingly poor generalizability to behavior in the real world: Low scores on classical measures of executive function such as the WCST do not necessarily imply poor executive behavior in everyday life, and, conversely, good performance on classical executive measures can be accompanied by severely dysexecutive comportment in everyday life (e.g., Eslinger and Damasio, 1985; Chevignard et al., 2000; Andrés and Van der Linden, 2002; Fortin et al., 2003; Barker et al., 2004; Manchester et al., 2004).

Recently, an alternative approach has taken root. It entails the use of more complex tasks that incorporate multiple executive functions to carry out a scenario from the real world, such as running errands (Shallice and Burgess, 1991; Knight et al., 2002) or managing the front desk of a hotel (Manly et al., 2002; see also Lamberts et al., 2010; for a review see Poulin et al., 2013). The goal of using these more representative (i.e., corresponding more closely in form and context to situations outside the clinic/lab) scenarios is to yield results that are more generalizable (i.e., enabling better prediction of performance outside the clinic/lab; Burgess et al., 2006) than classical measures of executive function such as the WCST (Chaytor and Schmitter-Edgecombe, 2003). Our goal here was to examine brain-injured patients' performance on one such scenario: Cooking a meal.

Cooking is a good example of a real world task that often draws heavily on executive functioning. The classic illustration of this comes from neurosurgeon Wilder Penfield (Penfield and Evans, 1935), who performed an extensive right frontal lobe resection on his sister. Her inability to orchestrate a small dinner a year later was seen as emblematic of her general problems with executive functioning: "She had planned to get a simple supper for one guest and four members of her own family...When the appointed hour arrived the food was all there, one or two things [were] on the stove, but the salad was not ready, the meat had not been started and she was distressed and confused by her long continued effort alone." Myriad subsequent research has demonstrated that brain injuries impair cooking (Dawson and Chipman, 1995; Chevignard et al., 2000, 2008; Fortin et al., 2003; Corrigan et al., 2004; Godbout et al., 2005; Baguena et al., 2006; Lillie et al., 2010; Frisch et al., 2012) but cooking is not convincingly correlated with performance on traditional tests of executive functions (Semkovska et al., 2004; Baum et al., 2008; Chevignard et al., 2008, 2010; Yantz et al., 2010; Provencher et al., 2012).

*<sup>2</sup> Bruyère Research Institute, University of Ottawa, Ottawa, ON, Canada*

*<sup>4</sup> The Robin Easey Centre, Ottawa Hospital Rehabilitation Centre, Ottawa, ON, Canada*

To strike a balance between the advantages of standardization and experimental control inherent to traditional tests of executive functions and the intricate and varying demands placed on executive functions by real-world scenarios, Craik and Bialystok (2006) developed the Breakfast Task (from Kerr, 1991). It is a computerized simulation in which participants use a touch screen to virtually "cook" five foods (each requiring a different cooking time) and ensure that they are all ready at the same time, while simultaneously setting the places at a virtual table. The task has three levels of difficulty, each thought to place heavier demands on executive functioning than the previous level (a *1-screen version*, in which all five foods and the table are shown on the same screen; a *2-screen version*, in which the five foods are shown on one screen and the table on a separate one, requiring participants to switch between the two screens; and a *6-screen version*, in which each of the five foods and the table are shown on a separate screen, requiring participants to switch among the six screens; **Figure 1**). Successful performance on the Breakfast Task (especially on the 6-screen version) requires participants to plan, multitask, hold different elements of one's plan and one's activities in mind while operating, monitor performance, and at times inhibit one behavior and switch to another. These are the hallmarks of executive functioning.

Healthy older adults, who on average have mild difficulties with executive functioning, performed more poorly on most aspects of the Breakfast Task than did young people, especially on the most executively-demanding 6-screen version (Craik and Bialystok, 2006). The only published study of neurological patients on the Breakfast Task of which we are aware involved Parkinson's disease (Bialystok et al., 2008). Although typically Parkinson's patients are thought of as having mild-to-moderate difficulties with planning and executive control, they performed as well as or better than older controls on most Breakfast Task measures. Though this result might at first glance seem surprising, the patients' good performance on breakfast-making came at a cost: They by and large neglected to set places at the virtual table. The authors hypothesized that this may have been a compensatory strategy initiated by the patients: Because they quickly realized they would have difficulty doing everything asked of them, they deliberately neglected the secondary task (i.e., place-setting) to ensure good performance on the main task (i.e., breakfast-making).

What does the Breakfast Task tell us about cooking in everyday life? The fact that it is representative of cooking, relatively easy to comprehend, and reported by participants to be enjoyable-yet-challenging (Craik and Bialystok, 2006), bodes well for its generalizability (Chaytor and Schmitter-Edgecombe, 2003; Burgess et al., 2006; Chan et al., 2008). As additional support, Craik and Bialystok cited data linking Breakfast Task performance with strategy use among older adults during real life meal preparation (Edwards and Ryan, 2004). Yet, although the Breakfast Task looks promising, at the moment there are still too few neuropsychological studies to know exactly what to make of it. To this end, here we examined the potential generalizability of the Breakfast Task to real cooking in people with acquired brain injuries (ABI). Accurately assessing safety and proficiency in cooking is essential for successful community reintegration following ABI, but *in vivo* assessment of cooking by clinicians is time-consuming, costly, and difficult to standardize. Accordingly, we examined the usefulness of the Breakfast Task as an indicator of real life meal preparation skills:


# **MATERIALS AND METHODS**

#### **PARTICIPANTS**

Twenty-two people with ABI participated in the study [7 females; mean age = 45.55 years (*SD* = 12.44), mean education = 14.14 years (*SD* = 2.87)]. Three additional ABI patients started but then declined to complete the full Breakfast Task, and were therefore excluded from all analyses. All ABI participants were English-speaking, and were free of acute psychiatric symptoms, hemiparesis in the dominant hand, and visual field cut/hemineglect. All had received services within the last 36 months from the Robin Easey Centre, a Transitional Living Rehabilitation Program staffed with a full team of health care providers for adults having sustained an ABI. The Program focuses on reducing levels of disability via training and use of compensatory strategies. In general, of the clients obtaining services at the Centre, approximately 40% have sustained a traumatic brain injury, with the remainder having suffered acquired brain insults of varying etiologies (e.g., encephalopathy, ruptured aneurysm, tumor). The average duration of stay within the residential program is 161 days (*SD* = 87). Typically, clients are past the most acute stages of recovery (i.e., more than 6 months post-insult) by the time they are admitted into the residential program. Most clients have undergone a stay in an acute care facility followed by an admission to an in-hospital rehabilitation program before being referred to services at the Robin Easey Centre. Others are already several years post-insult when admitted, and seeking to acquire the skills needed for greater living independence.

Twenty-two healthy controls also participated (14 females; mean age = 39.86 years, *SD* = 17.96, mean education = 16.61 years, *SD* = 2.35). Four additional control participants were excluded: two for Montreal Cognitive Assessment scores below the cut-off of 26/30 (Nasreddine et al., 2005), one for difficulties understanding and/or following the instructions and using the touch screen, and one (a younger woman, randomly selected from 4) to improve the match in age and sex ratio with the ABI group. The control and patient groups were similar in age, *t*(42) = 1.220, *p* = 0.229, although they differed in sex distribution, *χ*<sup>2</sup>(1) = 4.464, *p* = 0.035, φ = 0.319, and years of education, *t*(42) = −3.137, *p* = 0.003. [Note that these demographic differences appeared not to be the drivers of the very large differences in performance between groups on the Breakfast Task: When we re-ran our analyses using a smaller control group (*n* = 15) more closely matched to the patients, we found essentially the same results on the Breakfast Task (ANCOVAs using the larger control group were precluded because assumptions of homogeneity of regression were not met). We have included the entire control sample in the current version, to give a fuller picture of normal performance.]
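The sex-distribution statistic above can be reproduced from the reported counts (7 of 22 ABI participants and 14 of 22 controls were female). The following is a minimal sketch, assuming a Pearson chi-square without continuity correction and φ = √(χ²/N); the text does not state which variant was used, and the function name is illustrative only.

```python
import math

# Counts taken from the text: females and males in each group of 22.
abi_f, abi_m = 7, 22 - 7
ctl_f, ctl_m = 14, 22 - 14
n_total = abi_f + abi_m + ctl_f + ctl_m


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    total = a + b + c + d
    numerator = total * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


chi2 = chi_square_2x2(abi_f, abi_m, ctl_f, ctl_m)
phi = math.sqrt(chi2 / n_total)  # effect size for a 2 x 2 table

print(round(chi2, 3), round(phi, 3))  # 4.464 0.319
```

Under these assumptions the output agrees with the values reported in the paragraph above.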

#### **MATERIALS AND INSTRUMENTATION**

All participants were assessed on the Breakfast Task, and in addition the ABI patients were assessed on a self-report measure and an *in vivo* cooking task. The Research Ethics Boards of the Ottawa Hospital Research Institute and the University of Ottawa committee approved the study.

#### *The Breakfast Task*

The Craik and Bialystok (2006) and Bialystok et al. (2008) computerized meal simulation task was completed using a touchscreen monitor. The main objectives were to cook the five breakfast food items [in the following order: eggs (ideal cooking time = 5.5 min/330 s), coffee 4 min/240 s, sausage 3.5 min/210 s, pancakes 3 min/180 s, and toast 2 min/120 s] thoroughly and to have them ready at the same time, while simultaneously setting places at a virtual table.

To start cooking a food item, the participants pressed its associated "Start" icon. This highlighted the name of the food item in green and made a blue bar appear on a timer, a vertical column. This blue bar dropped toward the zero mark, reflecting the remaining time, in real time, before the food would be ready. When the bar reached the zero mark, the participants pressed the "Stop" icon to stop the food cooking. No further indications were provided once the blue bar had reached the zero mark; consequently, cooking the food items required constant monitoring. When the "Stop" icon was pressed, the name of the food item was highlighted in red. The "Start" and "Stop" buttons could only be pressed once, thus participants could not stop the cooking of a food item if it had been started early.

The secondary objective was to set as many table settings as possible, repeatedly setting a table arranged to accommodate 4 guests. The location of the plate and utensils followed typical etiquette.

In the practice trials, the participants cooked two breakfast food items, whereas in the test trials, the participants cooked five breakfast foods. For both the demonstration and test trials, the participants completed three versions of the task. These versions of the task differed in their number of screens: 1-screen, 2-screen (one table setting screen and one cooking screen), or 6-screen (one table setting screen and a screen for each food item). The level of difficulty across the Breakfast Task is postulated to increase because of the greater executive/working memory demands. Like Craik and Bialystok (2006) and Bialystok et al. (2008) we presented the practice and test trials in a fixed order going from the 1-screen to the 2-screen, and ending with the 6-screen.

#### *Self-report measure: Rehabilitation Activities of Daily Living Survey (RADLS)*

Within a few days of patients' discharge from the residential program, trained life skills counselors administered the Rehabilitation Activities of Daily Living Survey (RADLS; Salmon, 2003). The RADLS is a survey that measures patients' perceived cognitive, emotional and physical impairments. It assesses daily living tasks such as bathing, climbing stairs, relating to friends and family, banking, etc. Participants report their percent of limitation related to each activity before and after onset of the injury/illness. Responses were reversed and averaged to reflect abilities from 0% = full assistance/cannot do at all to 100% = fully independent. Items are divided into 10 composite categories, two of which are of particular interest: Meals, and Cognitive Activities (e.g., paying bills, running errands).

#### *In vivo cooking assessment*

During the first 4 weeks of stay within the treatment facility, participants underwent a comprehensive assessment. Actual meal preparations were observed and evaluated by an occupational therapist and/or a life skills counselor. Patients were asked to prepare one meal a week that would feed themselves and other clients at the Centre (i.e., 3–5 persons in all). Clients were allowed to choose their own menu, but sometimes suggestions were made and consideration was given to maintaining preparation time within 1–1.5 h. The staff kept the context stable from 1 week to the next, and free from interruptions by other residents. The evaluators typically limited their involvement to observations and ratings, except if an obvious safety risk arose.

On the basis of 4 meal preparations, the occupational therapist and life skills counselor summarized their impressions of a patient's meal preparation skills. Also noted was the spontaneous use of strategies by the client and recommendations were made about additional training in meal preparation and suitable compensatory strategies. Such strategies might include separating planning from execution, strategies to better organize space, use of timers and alarms, modifying and simplifying recipes, use of adapted tools for the kitchen, keeping track of the passage of time, limiting the level of multi-tasking and so forth.

Subsequently, clients continued to prepare a similar group meal once a week for the remainder of their stay. These meal preparations were also supervised and evaluated by the occupational therapist or a life skills counselor. The meal preparations were used to teach clients new compensatory strategies. A discharge report was also prepared, including information about meal preparation skills, and any gains made with the use of compensatory strategies.

The neuropsychologist and a senior life skills counselor reviewed the admission and discharge reports for sections in relation to meal preparation. The rating guidelines were developed by the neuropsychologist to examine three dimensions of cooking: efficiency of execution, successful use of client-generated or trained compensatory strategies, and overall independence with meal preparation. Clients were rated relative to the range of abilities demonstrated across clients seen within the Residential Program. Both the team neuropsychologist and one senior life skills counselor familiarized themselves with the guidelines and both practiced using the method before rating participants. Subsequently, they used a 4-point Likert scale to rate each subject along the 3 dimensions (i.e., not effective to very effective strategy use, very slow to mildly/not slow, not at all/virtually not at all independent to very independent with meal preparation). The two raters obtained a good agreement; the intra-class correlation reached 0.875 (with 95% confidence intervals from 0.658 to 0.952).

#### **PROCEDURE**

The intake coordinator reviewed the list of clients who had received services from the Robin Easey Centre in the last 36 months and who met the inclusionary criteria for participation. The intake coordinator contacted the patients to describe the study, and invited them to participate. Patients did not receive compensation.

The control participants were recruited in order to match the ABI patients as a group on age, sex, and education. One author (Karla V. Guerrero Nuñez) tested the patients and trained another author (Annick N. Tanguay), who recruited and tested the controls. Testing conditions were otherwise similar for both groups (identical procedure, instructions, reminders, etc.). Controls were recruited from the community and an undergraduate psychology students' pool, and received, respectively, \$10 or partial course credit. Some controls were tested at home for their convenience (note that the Robin Easey Centre acts as a temporary home for ABI patients).

The first step of each testing session involved a thorough description of the study and informed consent. Participants were reminded that they could withdraw from the study at any point during the experiment and that their participation was voluntary. We then obtained demographic information and for the patients corroborated it with their health care records when necessary, with their agreement.

The experimenter gave verbal instructions to participants and demonstrated the Breakfast Task. When the patients felt comfortable with the instructions, they moved on to completing the demonstration trials during which they could continue to ask questions. Before beginning each trial, the experimenter briefly reminded the participants of the main goals (i.e., having all food cooked and ready at the same time) and the secondary objective (i.e., setting the table as often as possible) of the task as well as other key points, such as taking note of the different cooking times and that the "Start" and "Stop" icons could only be pressed once. The experimenter sat at the back of the room to observe and take notes of the participants' performance. During the test trials, the experimenter no longer provided clarifications. The same fixed order was used for practice and test trials: 1-screen, 2-screen, and 6-screen. It took approximately 45 min to complete the testing session, from obtaining consent to completing all levels of the Breakfast Task.

#### **STATISTICAL ANALYSES**

We tested for Group differences (ABI patients, controls) and within-subject differences across the 3 Breakfast Task Versions (1-screen, 2-screen, 6-screen) using 2 × 3 mixed ANOVAs, following up with *post-hoc t*-tests where necessary. Because the majority of the scores are based on reaction times (with inherently positively skewed distributions and some outliers, especially among the patients), we show raw scores in the Figures but performed log<sub>10</sub> transformations of the data before conducting the ANOVAs and *post-hocs*. For ease of interpretation and to provide a better indication of the Breakfast Task's potential clinical usefulness, we used raw scores when examining the relationship (Spearman's rho) between the Breakfast Task and real world indices of cooking ability.

# **RESULTS**

#### **COMPARISONS BETWEEN ABI AND CONTROLS**

#### *Total task time*

The Breakfast Task consists of 5 food items, with the eggs always taking the longest to cook (5.5 min = 330 s). Thus, on each version of the Breakfast Task the optimal total task time is 5.5 min. Taking less than 5.5 min would render the eggs under-cooked, whereas taking longer than 5.5 min would indicate a lack of efficiency/organization, with at least one breakfast item likely ending up cold or burned. Overall, the patients took only slightly longer than the controls to complete the task, *F*(1, 42) = 2.161, MSE = 0.009, *p* = 0.149, η<sup>2</sup> = 0.049 (see Supplementary Table 1 and **Figure 2**). The 3 versions of the Breakfast Task took different times to complete, *F*(2, 84) = 4.595, MSE = 0.002, *p* = 0.013, η<sup>2</sup> = 0.099. The 6-screen version (*M* = 2.559, *SD* = 0.058) took more time than the 1-screen version (*M* = 2.529, *SD* = 0.049), *t*(43) = −3.165, *p* = 0.003. The 2-screen version (*M* = 2.55, *SD* = 0.091) did not differ significantly from either the 1-screen, *t*(43) = −1.927, *p* = 0.061, or 6-screen version, *t*(43) = −0.821, *p* = 0.416.

The interaction between Group and Version tended toward but did not reach significance, *F*(2, 84) = 2.277, MSE = 0.002, *p* = 0.109, η<sup>2</sup> = 0.051.
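The effect sizes reported for total task time appear to be consistent with partial eta squared, which can be recovered from an *F* ratio and its degrees of freedom. The sketch below rests on that assumption (the text does not say which η<sup>2</sup> variant was computed), and the function name is illustrative only.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F ratios and degrees of freedom reported for total task time.
group = partial_eta_squared(2.161, 1, 42)        # main effect of Group
version = partial_eta_squared(4.595, 2, 84)      # main effect of Version
interaction = partial_eta_squared(2.277, 2, 84)  # Group x Version

print(round(group, 3), round(version, 3), round(interaction, 3))  # 0.049 0.099 0.051
```

The same function can be applied to the other ANOVA results in this section as a quick consistency check.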

#### *Average discrepancy in cooking time*

As in real life, each of the Breakfast Task's foods has an ideal cooking time, which is computed and displayed for participants (e.g., the eggs take 5.5 min = 330 s). Any deviation from the ideal cooking time will lead to an over- or under-cooked item. We obtained the average discrepancy in cooking time by computing the difference between the actual cooking time of each food and its ideal cooking time and then averaging the absolute scores across each of the 5 foods. The ABI patients showed a greater discrepancy than controls, *F*(1, 42) = 21.403, MSE = 0.295, *p* < 0.001, η<sup>2</sup> = 0.338 (see Supplementary Table 1 and **Figure 3**). We also found a main effect of Breakfast Task Version, *F*(2, 84) = 18.237, MSE = 0.082, *p* < 0.001, η<sup>2</sup> = 0.303, but no interaction between Group and Version, *F*(2, 84) = 1.230, MSE = 0.082, *p* = 0.297, η<sup>2</sup> = 0.028. The 1-screen (*M* = 0.75, *SD* = 0.533) and 2-screen (*M* = 0.744, *SD* = 0.422) versions did not differ from one another, *t*(43) = 0.094, *p* = 0.926, but both involved lower average discrepancy scores than the 6-screen version (*M* = 1.066, *SD* = 0.371), *t*(43) ≤ −4.734, *p* < 0.001 and *t*(43) = −5.991, *p* = 0.001, respectively.
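The average discrepancy score described above can be sketched as follows. The ideal cooking times come from the task description; the trial data are invented for illustration, and the score is computed in raw seconds (the ANOVAs themselves were run on log<sub>10</sub>-transformed values, which this sketch does not reproduce).

```python
# Ideal cooking times in seconds, as given in the task description.
IDEAL_COOK_S = {"eggs": 330, "coffee": 240, "sausage": 210, "pancakes": 180, "toast": 120}


def average_discrepancy(actual_cook_s):
    """Mean absolute difference between actual and ideal cooking time over the 5 foods."""
    diffs = [abs(actual_cook_s[food] - ideal) for food, ideal in IDEAL_COOK_S.items()]
    return sum(diffs) / len(diffs)


# Hypothetical trial: actual cooking durations in seconds (not data from the study).
example_trial = {"eggs": 335, "coffee": 250, "sausage": 200, "pancakes": 180, "toast": 150}

print(average_discrepancy(example_trial))  # 11.0
```

Because absolute values are averaged, the score reflects how far cooking times missed the ideal but not the direction of the miss.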

#### *Early stopping vs. late stopping*

The "average discrepancy in cooking time" scores (reported immediately above) indicated that the ABI patients stopped cooking their foods at the wrong times. We then looked more closely at these data to find out whether people were stopping too soon or too late. To obtain the average discrepancy in cooking time, each of the food items' ideal cooking times had been subtracted from their actual cooking times. Negative discrepancies result from undercooking (i.e., early stopping) and positive discrepancies from overcooking (i.e., late stopping). Negative and positive discrepancies of the 5 food items were averaged separately to obtain a measure representing early and late stopping times, respectively.
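A minimal sketch of the early versus late split follows. It assumes that each average is taken only over the foods with a discrepancy of that sign, and that a trial with no such foods scores zero; the text does not specify either detail. The trial data are invented for illustration.

```python
# Ideal cooking times in seconds, as given in the task description.
IDEAL_COOK_S = {"eggs": 330, "coffee": 240, "sausage": 210, "pancakes": 180, "toast": 120}


def mean_or_zero(values):
    """Average of a list, or 0.0 when the list is empty (an assumption)."""
    return sum(values) / len(values) if values else 0.0


def early_late_stopping(actual_cook_s):
    """Return (early, late) stopping scores in seconds as positive magnitudes."""
    signed = [actual_cook_s[food] - ideal for food, ideal in IDEAL_COOK_S.items()]
    early = [-d for d in signed if d < 0]  # undercooked: stopped too soon
    late = [d for d in signed if d > 0]    # overcooked: stopped too late
    return mean_or_zero(early), mean_or_zero(late)


# Hypothetical trial (not data from the study).
example_trial = {"eggs": 335, "coffee": 250, "sausage": 200, "pancakes": 180, "toast": 150}

print(early_late_stopping(example_trial))  # (10.0, 15.0)
```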

When participants undercooked their foods (i.e., stopped their foods too soon), the two groups were not significantly different overall, *F*(1, 42) = 0.956, MSE = 0.523, *p* = 0.334, η<sup>2</sup> = 0.022 (see Supplementary Table 1 and **Figure 4**). There was a main effect of Breakfast Task Version, *F*(2, 84) = 4.574, MSE = 0.194, *p* = 0.013, η<sup>2</sup> = 0.098, qualified by an interaction between Group and Version, *F*(2, 84) = 4.639, MSE = 0.194, *p* = 0.012, η<sup>2</sup> = 0.099. This reflected the fact that patients (*M* = 0.766, *SD* = 0.806) stopped cooking their foods significantly earlier than ideal compared to the controls (*M* = 0.349, *SD* = 0.417) only on the 1-screen version, *t*(42) = 2.156, *p* = 0.039 [2-screen, *t*(42) = 0.690, *p* = 0.494; 6-screen, *t*(42) = −1.048, *p* = 0.301]. While the controls tended to stop the food just as early across the versions, the patients disproportionately stopped the food early on the 1-screen version. Their tendency to stop food earlier than ideal decreased sharply with the 2-screen (ABI *M* = 0.597, *SD* = 0.539; Controls *M* = 0.491, *SD* = 0.482) and 6-screen versions, so much so that ABI patients stopped food less early than controls on the 6-screen version on average (ABI *M* = 0.228, *SD* = 0.416; Controls *M* = 0.382, *SD* = 0.551).

When people overcooked their foods (i.e., stopped their foods too late), the patients did so to a significantly greater extent than controls, *F*(1, 42) = 16.598, MSE = 0.571, *p* < 0.001, η<sup>2</sup> = 0.283 (see Supplementary Table 1 and **Figure 4**). There was an effect of Breakfast Task Version, *F*(2, 84) = 25.390, MSE = 0.157, *p* < 0.001, η<sup>2</sup> = 0.377, but no interaction between Group and Version, *F*(2, 84) = 0.435, MSE = 0.157, *p* = 0.649, η<sup>2</sup> = 0.01. Participants stopped the food later relative to ideal in the 6-screen version (*M* = 1.618, *SD* = 0.44) than in the 1-screen (*M* = 1.069, *SD* = 0.692) and 2-screen versions (*M* = 1.129, *SD* = 0.645), *t*(43) ≤ −5.761, *p* < 0.001. The 1-screen and 2-screen did not differ, *t*(43) = −0.762, *p* = 0.45.

#### *Average range of stop times*

The instructions always emphasized the importance of serving the foods at the same time (i.e., of having a range of stop times approaching zero). The aforementioned "discrepancy in cooking time" score and the range of stop times are related but not redundant, in that a person might choose to serve under- or over-cooked foods (i.e., high discrepancy in cooking time), but serve all foods at once (i.e., low average range of stop times). Conversely, a person might choose to serve perfectly-cooked food items (i.e., low discrepancy in cooking time) but not serve all items at the same time (i.e., high average range of stop times). Patients showed a significantly wider range of stop times than controls, *F*(1, 42) = 13.409, MSE = 0.57, *p* = 0.001, η<sup>2</sup> = 0.242 (see Supplementary Table 1 and **Figure 5**). The main effect of the Breakfast Task Version showed a trend toward significance, Huynh-Feldt *F*(1.87, 78.6) = 2.576, MSE = 0.159, *p* = 0.086, η<sup>2</sup> = 0.058, with no interaction, *F*(1.87, 78.6) = 1.359, MSE = 0.159, *p* = 0.262, η<sup>2</sup> = 0.031.
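The dissociation between the two scores can be illustrated with two invented cooks who both start every food at time zero. This is a sketch only: the range is taken as latest minus earliest stop time within one trial, in raw seconds, and all names and data are hypothetical.

```python
# Ideal cooking times in seconds, as given in the task description.
IDEAL_COOK_S = {"eggs": 330, "coffee": 240, "sausage": 210, "pancakes": 180, "toast": 120}


def stop_time_range(stop_times_s):
    """Latest minus earliest stop time (seconds from task onset) within one trial."""
    times = list(stop_times_s.values())
    return max(times) - min(times)


def average_discrepancy(start_times_s, stop_times_s):
    """Mean absolute difference between actual and ideal cooking durations."""
    diffs = [
        abs((stop_times_s[food] - start_times_s[food]) - ideal)
        for food, ideal in IDEAL_COOK_S.items()
    ]
    return sum(diffs) / len(diffs)


starts = {food: 0 for food in IDEAL_COOK_S}  # both cooks start everything at once

# Cook A stops each food at its ideal time: perfectly cooked, but not served together.
stops_a = dict(IDEAL_COOK_S)
# Cook B stops everything when the eggs are done: served together, but overcooked.
stops_b = {food: 330 for food in IDEAL_COOK_S}

print(average_discrepancy(starts, stops_a), stop_time_range(stops_a))  # 0.0 210
print(average_discrepancy(starts, stops_b), stop_time_range(stops_b))  # 114.0 0
```

Cook A has a low discrepancy and a wide range of stop times; Cook B shows the reverse pattern.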

#### *Average deviation of start times*

Each food item has an ideal start time, which is contingent on the previously-started items (except for the eggs, which take 5.5 min to cook and should be started at the onset of the task). The coffee takes 4 min to brew, so the coffee should be started 1.5 min after the eggs. If, for example, the starting time of the coffee is 0.5 min early, what will be the ideal start time of the third food, the sausage (which needs 3.5 min to cook)? In order to reduce the range of stop times, one may decide to start the third food item based on the first item (i.e., 2 min later) or the second item (i.e., 1 min later), or a combination of both (i.e., 1.75 min). The ideal start times for the third, fourth, and fifth food items are an average of the ideal start time based on the first item (e.g., 2 min for the sausage) and the relative ideal start time based on the previous food items (e.g., the actual start time of coffee +0.5 min). Absolute deviations of start time for the food items were then averaged. Patients showed a greater average deviation of start times than controls, *F*(1, 42) = 14.656, MSE = 0.542, *p* < 0.001, η<sup>2</sup> = 0.259 (see Supplementary Table 1 and **Figure 6**). No main effect of the Breakfast Task Version, *F*(2, 84) = 1.499, MSE = 0.07, *p* = 0.229, η<sup>2</sup> = 0.034, and no interaction between Group and Version, *F*(2, 84) = 0.018, MSE = 0.07, *p* = 0.982, η<sup>2</sup> = 0.000, were found.
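The adjusted ideal start time described above can be sketched as an average of two anchors: one based on the eggs and one based on the actual start of the preceding food. This sketch assumes that "previous food items" refers to the immediately preceding item only and that the eggs were started at task onset; the function name and parameters are illustrative.

```python
# Ideal cooking times in seconds, as given in the task description.
IDEAL_COOK_S = {"eggs": 330, "coffee": 240, "sausage": 210, "pancakes": 180, "toast": 120}


def adjusted_ideal_start(food, previous_food, previous_actual_start_s, eggs_start_s=0.0):
    """Ideal start (seconds from onset) for the third, fourth, or fifth food.

    Averages the start time implied by the eggs with the start time implied
    by when the preceding food was actually started.
    """
    based_on_first = eggs_start_s + IDEAL_COOK_S["eggs"] - IDEAL_COOK_S[food]
    based_on_previous = (
        previous_actual_start_s + IDEAL_COOK_S[previous_food] - IDEAL_COOK_S[food]
    )
    return (based_on_first + based_on_previous) / 2


# Worked example from the text: coffee started 0.5 min early, i.e., at 60 s instead of 90 s.
sausage_start_s = adjusted_ideal_start("sausage", "coffee", previous_actual_start_s=60.0)

print(sausage_start_s, sausage_start_s / 60)  # 105.0 1.75
```

With the coffee started at 60 s, the eggs-based anchor is 120 s and the coffee-based anchor is 90 s, giving the 1.75 min compromise mentioned in the text.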

#### *Early start vs. late start*

The ABI patients missed the ideal start times more than controls, and we further asked whether they started food earlier and/or later than ideal. The food items' ideal start times (as described above) were subtracted from the actual start times. An early start is indexed by the average of the five food items' positive deviations, and a late start by the average of their negative deviations. Patients did start the food earlier than ideal compared to the controls, *F*(1, 42) = 7.239, MSE = 1.103, *p* = 0.01, η<sup>2</sup> = 0.147 (see Supplementary Table 1 and **Figure 7**). There was an effect of Breakfast Task Version, *F*(2, 84) = 18.252, MSE = 0.175, *p* < 0.001, η<sup>2</sup> = 0.303, but no interaction with Group, *F*(2, 84) = 0.204, MSE = 0.175, *p* = 0.816, η<sup>2</sup> = 0.005. The 3 Breakfast Task versions all differed from one another, *t*(43) ≥ 2.742, *p* < 0.009. Participants started foods earlier relative to their ideal start times on the 1-screen version (*M* = 1.524, *SD* = 0.669) than on the 2-screen version (*M* = 1.26, *SD* = 0.695), and earlier on the 2-screen than on the 6-screen version (*M* = 0.985, *SD* = 0.822).

ABI patients did not start the food later than ideal compared to controls, *F*(1, 42) = 1.656, MSE = 0.834, *p* = 0.205, η<sup>2</sup> = 0.038 (see Supplementary Table 1 and **Figure 7**). There was an effect of Breakfast Task Version, Huynh-Feldt *F*(1.86, 77.95) = 11.899, MSE = 0.465, *p* < 0.001, η<sup>2</sup> = 0.221, that did not interact with Group, *F*(1.86, 77.95) = 10.386, MSE = 0.465, *p* = 0.666, η<sup>2</sup> = 0.009. Participants started the foods later relative to ideal in the 6-screen version (*M* = 1.223, *SD* = 0.795) than in the 2-screen version (*M* = 0.975, *SD* = 0.742), and later in both than in the 1-screen version (*M* = 0.547, *SD* = 0.718), *t*(43) ≤ −2.066, *p* ≤ 0.045.

#### *Sequencing*

Starting or stopping a food a little early or late (while following the proper order of starting eggs, then coffee, sausage, pancakes, and finally toast) could be considered a minor error, but starting foods out of their proper order could be considered a more serious problem. To look at sequencing errors, we subtracted a point from 5 for each food started in incorrect sequence. Participants were not penalized for a previous error; for example, if someone began with eggs, then started the sausage, then the coffee (instead of the proper order of eggs, coffee, sausage... ), only one point was taken off. All food items other than toast started as the last food item also warranted a penalty. Sequencing errors were relatively rare, especially for controls, and therefore we combined the scores from the 3 Breakfast Task versions and used a Mann-Whitney test. Patients (Mdn = 14, Minimum = 2, and Maximum = 15, where 15 represents the ideal score) committed more sequencing errors than controls (Mdn = 15, Minimum = 10, and Maximum = 15, where 15 represents the ideal score), *U* = 139.5, *z* = −2.851, *p* = 0.004, *r* = 0.43. Of note, four ABI participants omitted to cook one of the food items in the 6-screen version. None of the controls made such an omission. The closest was a control participant who thought she had pressed the start button for the eggs, but only noticed at the end that she had never actually started them. Rather than deprive her virtual breakfast guests of their eggs, the participant chose to start the eggs and keep setting the table while waiting for them to finish (such a strategy also increased her range of stop times, presented above).
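The effect size reported for the Mann-Whitney test above is consistent with the common convention *r* = |*z*| / √*N*, where *N* is the total number of participants. The sketch below assumes that convention (the text does not state how *r* was derived); the function name is illustrative.

```python
import math


def effect_size_r(z_value, n_total):
    """Effect size r for a rank-based test, computed as |z| / sqrt(N)."""
    return abs(z_value) / math.sqrt(n_total)


# z from the text; N = 22 ABI patients + 22 controls.
r = effect_size_r(-2.851, 22 + 22)

print(round(r, 2))  # 0.43
```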

#### *Percentage of time spent cooking*

Participants were required to balance their time between cooking and setting as many places at the virtual table as possible. ABI patients spent a greater percentage of their time on the cooking than did the controls, *F*(1, 42) = 7.630, MSE = 570.088, *p* = 0.008, η<sup>2</sup> = 0.154 (see Supplementary Table 1 and **Figure 8**; note that because these scores were normally distributed, we computed the ANOVAs on untransformed scores). There was a main effect of Breakfast Task Version, *F*(1, 42) = 5.649, MSE = 67.036, *p* = 0.022, η<sup>2</sup> = 0.119, with no interaction between Group and Version, *F*(1, 42) = 0.048, MSE = 67.036, *p* = 0.827, η<sup>2</sup> = 0.001. The 6-screen version (*M* = 43.393, *SE* = 2.703) involved a significantly higher percentage of cooking time than the 2-screen version (*M* = 39.244, *SE* = 2.679). The 1-screen version was not included in the analyses because teasing apart the time spent on table or cooking entails potential inexactitude.

#### *Number of table settings*

Patients set fewer places at the virtual table than did controls, *F*(1, 42) = 26.222, MSE = 470.386, *p* < 0.001, η<sup>2</sup> = 0.384 (see Supplementary Table 1 and **Figure 9**). There was no main effect of the Breakfast Task Version, *F*(2, 84) = 1.331, MSE = 48.146, *p* = 0.270, η<sup>2</sup> = 0.031, and no interaction, *F*(2, 84) = 1.823, MSE = 48.146, *p* = 0.168, η<sup>2</sup> = 0.042.

To determine whether patients set fewer table settings relative to the time dedicated to this part of the task, we divided the total time spent on table setting by the number of table settings on the 2- and 6-screen versions. The 1-screen version was excluded because of the inherent difficulty in assessing the total time spent on table setting. Patients set fewer places while on the table setting screen compared to controls, *F*(1, 42) = 16.940, MSE = 10.013, *p* < 0.001, η<sup>2</sup> = 0.287 (see Supplementary Table 1 and **Figure 10**). The main effect of the Breakfast Task Version was not significant, *F*(1, 42) = 0.667, MSE = 2.181, *p* = 0.419, η<sup>2</sup> = 0.016, and did not interact with Group, *F*(1, 42) = 0.067, MSE = 2.181, *p* = 0.798, η<sup>2</sup> = 0.002. The number of table settings and the average time per place setting data fit a normal distribution, so the untransformed data were used in these analyses.

#### *Number of food checks*

Controls monitored the cooking of their foods [i.e., switching to the food screen(s) from the place setting screen in the 2- and 6-screen versions] more often than did the ABI patients, *F*(1, 42) = 5.477, MSE = 47.936, *p* = 0.024, η<sup>2</sup> = 0.115 (see Supplementary Table 1 and **Figure 11**). Controls and ABI patients made more food checks on the 6-screen version (*M* = 19.364, *SE* = 1.2) than the 2-screen version (*M* = 10.409, *SE* = 0.590), *F*(1, 42) = 57.432, MSE = 30.715, *p* < 0.001, η<sup>2</sup> = 0.578 [no group interaction, *F*(1, 42) = 0.095, MSE = 30.715, *p* = 0.760, η<sup>2</sup> = 0.002].

Because food items and the table setting all show on the same screen for the 1-screen version, food checks can be performed by a simple shift of glance and hence cannot be evaluated separately from cooking time within the existing Breakfast Task paradigm.

#### **CORRELATIONS WITH REAL WORLD PERFORMANCE**

We had expected that ABI patients' performance on the Breakfast Task would be positively correlated with their ability to prepare a real meal. To answer this question, we first had to construct an average overall score:

[0.40(Average discrepancy) + 0.40(Range of stop times) − 0.20(Number of table settings)] + 20

We assigned weights to the three components that made up the overall Breakfast Task score, based on the task instructions, which place greater importance on the cooking (having all foods prepared at the same time, with none over- or under-cooked) than on the table setting. Higher scores for cooking performance (i.e., average discrepancy and range of stop times) reflect poorer performance and higher scores for table setting reflect better performance, so we reversed the table setting score and gave it a lower weight relative to the two other scores. For ease of interpretation, we added 20 to all scores in order to make them all positive (i.e., above zero), and then transformed them to meet assumptions of normality.
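The weighted composite above can be written as a small function. This is a sketch only: the input values are invented, the scale of each component (raw or transformed, per version or summed) is not specified in the text, and the later normality transformation is not reproduced.

```python
import math


def overall_breakfast_score(avg_discrepancy, stop_time_range, n_table_settings):
    """Weighted composite from the text; higher values reflect poorer performance."""
    return (
        0.40 * avg_discrepancy
        + 0.40 * stop_time_range
        - 0.20 * n_table_settings  # more table settings lowers (improves) the score
        + 20                       # constant added so that all scores are positive
    )


# Hypothetical component values (not data from the study).
score = overall_breakfast_score(avg_discrepancy=10, stop_time_range=20, n_table_settings=30)

print(score)
```

With these inputs the three weighted terms are 4, 8, and 6, so the composite is approximately 26 once the constant is added.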

To avoid alpha inflation, we summed the scores of the 3 Breakfast Task versions and explored their relationships with the *in vivo* evaluation of independence in cooking and self-report scores. [Separate correlations for each version can be found in Supplementary Table 2.] The self-report measures are composite scales of meal preparation/planning abilities, and a cognitive composite evaluating tasks such as managing finances. Complete data on real world functioning were available for a subset of 16 ABI participants. Patients' actual meal preparation (as assessed by the clinical team) was significantly correlated with their self-reported meal preparation abilities, *r*<sub>s</sub> = 0.536, *p* = 0.032.
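For readers unfamiliar with the statistic, Spearman's rho can be computed as a Pearson correlation on ranks, with tied values sharing their average rank. The following standard-library sketch uses invented data; it is not the authors' analysis code, and all names are illustrative.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # mean of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical scores for five patients (not data from the study).
breakfast_overall = [22, 25, 31, 28, 40]  # higher = poorer performance
self_report_meals = [90, 70, 60, 85, 30]  # higher = more independent

rho = spearman_rho(breakfast_overall, self_report_meals)
print(round(rho, 3))  # -0.9
```

A negative rho here means that poorer Breakfast Task scores go with lower self-reported independence, which is the direction of the correlations reported in this section.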

Self-report of meal preparation abilities was significantly correlated with the Overall Breakfast Task score (*r*<sub>s</sub> = −0.594, *p* = 0.014), along with the aggregate measures of Discrepancy of Cooking Time (*r*<sub>s</sub> = −0.646, *p* = 0.007), Deviation of Start Times (*r*<sub>s</sub> = −0.666, *p* = 0.005), Early Start (*r*<sub>s</sub> = −0.607, *p* = 0.013), and Late Stop (*r*<sub>s</sub> = −0.760, *p* = 0.001). The patients' independence while preparing real meals (as assessed by the clinical team) was not significantly correlated with their Breakfast Task overall score, *r*<sub>s</sub> = −0.075, *p* = 0.783.

# **DISCUSSION**

On the virtual meal-cooking task (the "Breakfast Task," Craik and Bialystok, 2006), our ABI patients all seemed to have understood the instructions, had the opportunity to go through practice trials before each version of the task, and generally appeared to grasp the gist of what they were to do [which is not surprising, given that semantic knowledge of cooking (including simple script generation) is usually unimpaired in people with brain injuries (Fortin et al., 2003; Godbout et al., 2005; Baguena et al., 2006) and that all of our ABI patients had previous experience with cooking].

Despite this, the patient group showed poorer performance than normal on every Breakfast Task score that we examined. On average, the patients were less likely to have all their foods ready at the same time, over-cooking some foods and under-cooking others. Although some patients were indistinguishable from the healthy participants, up to a third scored outside the range of controls (depending on the particular measure) and we observed some behaviors that truly seemed "dysexecutive." Moreover, some behaviors suggested not only weaknesses in managing transitions between cooking and table setting (i.e., multi-tasking and prospective memory) but also in forming a valid plan at the outset or keeping such a plan in mind, or both. For example, some patients would run through the practice trials too quickly or begin the task by setting several places at the virtual table rather than by starting the first breakfast item [consistent with Frisch et al. (2012), who found their stroke patients to be less likely to read the instructions carefully before beginning cooking]. Also notable, several of the ABI patients cooked their foods in the wrong sequence, a type of error rarely seen in our controls and not reported at all in the two previous studies with older adults and Parkinson's patients (Craik and Bialystok, 2006; Bialystok et al., 2008). Two patients in particular did not start the eggs first even though these obviously take the longest to cook, and one started all the food items in reverse order from the ideal on one version of the task. Four ABI participants neglected to cook one food item at all on the 6-screen version (three neglecting the last food item, i.e., the toast). Also, occasionally patients would continue setting places at the virtual table after they had stopped cooking all of the food items, or would interrupt their cooking mid-sequence to return to table setting, which was inefficient. 
Weak performance early in the sequence would bring about a distinct challenge for the patients: Judging the best course of action given that the two primary task objectives (i.e., having the right cooking time for each food, and having all foods finished at about the same time) had now become essentially irreconcilable. Indeed, the mean average deviation in start times for our ABI patients was more than twice as long as that noted for older adults and Parkinson's patients (i.e., 35 vs. approx. 12–15 s; Craik and Bialystok, 2006; Bialystok et al., 2008). Our instructions did not explicitly state what patients should do if they realized later in the task that they had made the error of poor early planning. In future, manipulating instructions and asking patients during or after the task could help elucidate whether they tried to balance the two main objectives of the task, arbitrarily focused on one or the other, or focused specifically on one based on real life priorities. That is, in the last instance, they could have focused on minimizing discrepancies in cooking times over minimizing average range of stop times because of their perception of graver implications of serving under- or over-cooked foods.
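The conflict between the two objectives can be made concrete with a small sketch. The cooking durations below are illustrative assumptions rather than the task's actual values, and the two summary measures are simplified versions of the discrepancy and range-of-stop-times scores; all function and variable names are hypothetical.

```python
# Illustrative cooking durations in seconds (assumed values, not the task's).
COOK_TIMES = {"eggs": 330, "sausage": 240, "toast": 120}


def ideal_start_times(cook_times):
    """Start each food so that all foods finish together with the longest one."""
    longest = max(cook_times.values())
    return {food: longest - duration for food, duration in cook_times.items()}


def summarize(cook_times, starts, stops):
    """Return (mean absolute cooking-time discrepancy, range of stop times)."""
    discrepancies = [abs((stops[f] - starts[f]) - cook_times[f]) for f in cook_times]
    return sum(discrepancies) / len(discrepancies), max(stops.values()) - min(stops.values())


# Hypothetical error: the eggs are started 60 s late, the other foods on time.
starts = ideal_start_times(COOK_TIMES)
starts["eggs"] += 60

# Option A: cook every food for its full time, so the eggs finish 60 s late.
stops_a = {food: starts[food] + COOK_TIMES[food] for food in COOK_TIMES}
# Option B: stop every food at the originally planned time, under-cooking the eggs.
stops_b = {food: 330 for food in COOK_TIMES}

print(summarize(COOK_TIMES, starts, stops_a))  # (0.0, 60)
print(summarize(COOK_TIMES, starts, stops_b))  # (20.0, 0)
```

In this toy case, once the longest item has been started late, one measure can be brought to zero only at the expense of the other, which is the dilemma described above.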

Our ABI patients looked quite different from the only other neuropsychological group to have performed the Breakfast Task. Bialystok et al. (2008) found that Parkinson's patients performed surprisingly well on the breakfast-making part of the task, but in order to do so they may have strategically ignored the table-setting part of the task. Some of our ABI patients may have been trying to do something similar. For instance, they devoted more time to cooking than to place-setting relative to controls. In fact, on the most challenging version of the task (i.e., the 6-screen version), some ABI patients spent almost all of their time on cooking. Yet, our ABI patients performed significantly more poorly than controls on *both* the cooking and table setting components. The PD patients may have been less impaired than ABI patients overall, or may actually show a different profile of executive impairment than ABI patients (e.g., Zgaljardic et al., 2003). One additional surprising finding in the present study was that although we thought the ABI patients would be most clearly impaired on the 6-screen version of the task (owing to its arguably greater demands on executive functioning), this was not the case. This may stem in part from most of the controls scoring relatively close to zero on several of the Breakfast Task measures (e.g., Discrepancy, Average Deviation of Start and Stop Times), and should be explored further in future.

#### **THE BREAKFAST TASK REFLECTS MULTIPLE ASPECTS OF EXECUTIVE FUNCTIONING**

In this study we did not have consistent neuropsychological data available on the patients (owing to significant variability in the time between date of injury/illness onset and neuropsychological testing, as well as variability in the test batteries employed depending on specific diagnoses and by which clinical service they were seen). In future, it would be interesting to know the extent to which traditional executive measures predict Breakfast Task performance. We would note, however, that the Breakfast Task comes from a modern impetus to create relatively realistic, complex tasks that rely on multiple executive functions (e.g., Shallice and Burgess, 1991; Knight et al., 2002; Manly et al., 2002; Lamberts et al., 2010; for a review see Poulin et al., 2013). Such an approach is a double-edged sword: Although it provides participants with an engaging experience with a relatively representative scenario (Burgess et al., 2006), it is not designed to precisely differentiate among the executive processes that might contribute to performance. A happy medium might be to use complex, realistic scenarios such as the Breakfast Task to generate hypotheses about those executive functions that might be impaired in a particular patient, and then isolate those functions using simpler, more traditional measures. Although this approach is potentially useful, it must be borne in mind that these complex scenarios were developed because it might only be on them [i.e., open-ended complex tasks in which a basic context and goal(s) have been provided but important sub-goals may emerge as the task unfolds] that executive problems become apparent (for an impressive attempt to strike a balance, see Wilson et al., 1996).

# **THE BREAKFAST TASK AS A MODEL OF PERFORMANCE IN THE REAL WORLD**

As the *in vivo* cooking task involved preparing multiple foods for dinner (requiring planning and multi-tasking), the Breakfast Task seemed well-matched to it. We found a significant relationship between ABI patients' self-rated ability to plan and prepare meals and their performance on the Breakfast Task. Because the self-report ratings were taken following several months in a residential rehabilitation program with intensive life skills retraining including weekly meal preparations and feedback about their performance, it could be expected that clients would develop some degree of realistic self-appraisal over time. In keeping with this suggestion, the overall independence scores for *in vivo* cooking did relate positively to patients' self-reports of meal preparation abilities. Yet, surprisingly, overall performance on the Breakfast Task was only weakly (and non-significantly) correlated with overall ratings of independence in preparing a group meal.

The surprisingly low correlation between Breakfast Task performance and real meal preparation is vexing, and although we are not alone in finding this (e.g., Semkovska et al., 2004; Baum et al., 2008; Chevignard et al., 2008, 2010; Yantz et al., 2010; Provencher et al., 2012) we can only speculate as to why it occurred. First, there may actually exist a subtle relationship between these two variables but our subgroup of *n* = 16 patients provided insufficient statistical power to detect it. And yet, other correlations (for example, between patients' self-ratings of meal preparation and their actual performance) were readily apparent even with our small group size. Second, unlike other cooking paradigms (e.g., Neistadt, 1994; Giovannetti et al., 2008), the Breakfast Task primarily measures the executive aspects of cooking, rather than the actual procedures one runs through in the preparation of food (e.g., pouring real coffee, dicing real mushrooms). In this respect, it seemed reasonably well-suited to our ABI patients, who mostly appear to have difficulty with the planning and executive aspects of cooking rather than the procedural ones: Interventions to train our clients on basic tasks such as preparing coffee or a sandwich are not typical, whereas training of complex meal preparation is more commonly needed. Nevertheless, a subset of our ABI patients may have exhibited challenges with the routine procedures involved in cooking (e.g., slowness dicing foods) that were not adequately measured by the Breakfast Task. A more fine-grained assessment of their *in vivo* cooking might help us distinguish patients who appear to have more "executive" problems when cooking from those who have more "procedural" problems, with patients in the former group perhaps showing a closer correspondence between real world and Breakfast Task performance. 
Third, our ratings of *in vivo* cooking were relatively coarse, using 4-point scales limited to two specific dimensions (i.e., speed of execution, and use of compensatory strategies) and a global one, in keeping with the fact that ratings were made on the basis of summary narratives drawn from reports. Despite the advantage of our summary narratives being based on multiple samplings of meal preparation, the scale that we used here included less detailed information than other coding methods (e.g., Giovannetti et al., 2008; Frisch et al., 2012). Finally, in some ways, the *in vivo* task and computerized tasks were different from one another. Notably, the *in vivo* assessment placed considerable value on the use of compensatory strategies to mitigate risk and produce a good meal, with no direct parallel on the Breakfast Task. One solution might be to separately assess procedural aspects of cooking using a basic task, as a baseline from which to gauge the role of "executive" problems in a more complex one (see Schmitter-Edgecombe et al., 2011 for a good example of such tasks).
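The first explanation above (limited power with *n* = 16) can be examined with a back-of-the-envelope calculation. The sketch below uses the Fisher *z* approximation for a Pearson-type correlation; applying it to Spearman coefficients is a further approximation, and the function name and the example effect sizes are assumptions chosen for illustration.

```python
from math import atanh, sqrt
from statistics import NormalDist


def approx_power(rho, n, alpha=0.05):
    """Approximate two-tailed power to detect a true correlation rho with n
    cases, using the Fisher z transformation (standard error 1 / sqrt(n - 3))."""
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / 2)
    shift = atanh(rho) * sqrt(n - 3)
    # Probability that the test statistic falls in either rejection region.
    return norm.cdf(shift - z_crit) + norm.cdf(-shift - z_crit)


print(approx_power(0.3, 16))  # modest true correlation
print(approx_power(0.6, 16))  # strong true correlation
```

Under this approximation, power with 16 participants is roughly 0.2 for a true correlation of 0.3 and roughly 0.7 for a true correlation of 0.6, which is consistent with strong relationships being detectable in this subgroup while subtle ones could easily be missed.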

Despite this surprising null correlation, to our minds the Breakfast Task shows potential and deserves further work. All of the ABI patients we recruited could understand and adequately perform the task, almost all (22 out of 25 = 88%) completed it, and many found it to be enjoyable and engaging. In general, computerized assessment of complex tasks such as meal preparation has real advantages over *in vivo* assessment, including standardized administration and scoring, reduced potential for scorer bias, the collection of more data than one observer could possibly generate (even with video-recording for later analysis), millisecond timing (which is impossible with scoring real or video-recorded sessions), indirect observation (instead of needing to be quite close to the food and the client for *in vivo* analysis), the relative independence of administration (e.g., a single tester is required to explain and start the task and then can keep an eye on performance while carrying out other duties), portability (i.e., the assessment does not require a real kitchen), and the possibility of varying task demands easily and uniformly across participants (e.g., using 3 versions, as in the present study). Although video recordings may be useful for less fine-grained analyses or for establishing the merits of a scoring system, the regular need for video recordings to apply such a method would typically be prohibitive within many clinical settings for the aforementioned reasons.

# **FUTURE WORK**

Notwithstanding the potential advantages of computerized assessment, more work is required before replacing traditional *in vivo* evaluation methods. In particular, one strategy to learn more about the correspondence between computerized and real-world cooking might be to make the computerized and real-world tasks more similar to one another: How would the patients perform if asked to cook a real breakfast of eggs, coffee, sausage, pancakes, and toast in 5.5 min in the kitchen? One could also use virtual reality to make the computerized task look and feel as similar as possible to real life. Several reports have recently emerged of strong correlations between real and virtual performance of executively-demanding scenarios, including cooking (e.g., Zhang et al., 2003; Renison et al., 2012). Virtual reality assessments retain most if not all of the advantages of other kinds of computerized assessment, and will undoubtedly become less expensive and easier to administer in the near future.

By comparing the breakfast simulation task with *in vivo* cooking, the present study has helped us identify important translational issues. Use of simulation tasks requires careful consideration of how patients, under certain circumstances, may intentionally deviate from instructions aimed at weighted sampling of various cognitive abilities, drawing instead on affordable knowledge about *cooking for people.* One important question concerns to what extent the task and its instructions should constrain or anticipate and allow changes in approach to the task as it is unfolding based on principles of real world cooking and human need. For example, being able to distinguish mid-task adjustments from poor or random performance seems important, not only in light of the common-sense options available to us in real life cooking but also when one considers the arithmetical and working memory burden of resetting "ideal" start times for later food items through an elaborate mathematical averaging process should a person err with food sequences or start times early in the task. Placing considerable effort on estimating new ideal start times for later foods based on error introduced with early food start times would not necessarily constitute the best "real life" strategy because of the information processing load, proving detrimental to the allocation of cognitive resources to other ongoing aspects of the task and only aggravating matters. Future studies should measure process-related (i.e., interpretative) issues, likely through questioning of patients both during training as well as following testing. Finally, as the focus of such studies shifts toward predicting real life performances, there may be a need for a parallel shift in focus from measuring cognitive impairment to one of measuring disability. 
Building on clients' abilities to adapt their approach to tasks using compensatory strategies is a cornerstone principle in the rehabilitation of neurologically compromised individuals. In future research we will need to consider how to incorporate such opportunities into simulation tasks as well.

# **AUTHOR CONTRIBUTIONS**

Annick N. Tanguay tested participants, ran statistical analyses, and contributed to writing the manuscript. Patrick S. R. Davidson co-conceived the project and contributed to writing the manuscript. Karla V. Guerrero Nuñez tested participants and contributed to writing the manuscript. Mark B. Ferland co-conceived the project and contributed to writing the manuscript.

#### **ACKNOWLEDGMENTS**

We are grateful to Marielle Young-Bernier for assistance with creating the figures, to Héloïse Drouin for setting up the computer program, and to Jennifer Smithson for checking the data. We are also grateful for the help of the Robin Easey Centre team, especially to Johanne Larente for assisting with the development of and inter-rater reliability of the scales used to evaluate *in vivo* meal preparations, as well as to the Natural Sciences and Engineering Research Council (Canada) for Discovery grant and scholarship support.

# **SUPPLEMENTARY MATERIAL**

The Supplementary Material for this article can be found online at: http://www.frontiersin.org/journal/10.3389/fnbeh.2014.00272/abstract


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 01 April 2014; accepted: 23 July 2014; published online: 02 September 2014. Citation: Tanguay AN, Davidson PSR, Guerrero Nuñez KV and Ferland MB (2014) Cooking breakfast after a brain injury. Front. Behav. Neurosci. 8:272. doi: 10.3389/fnbeh.2014.00272*

*This article was submitted to the journal Frontiers in Behavioral Neuroscience. Copyright © 2014 Tanguay, Davidson, Guerrero Nuñez and Ferland. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Assessing complex executive functions with computerized tests: is that toast burning?

# *Brian E. McGuire\**

*School of Psychology, National University of Ireland, Galway, Ireland \*Correspondence: brian.mcguire@nuigalway.ie*

#### *Edited by:*

*Lynne Ann Barker, Sheffield Hallam University, UK*

#### *Reviewed by:*

*Lynne Ann Barker, Sheffield Hallam University, UK Nicholas Morton, Tickhill Road Hospital, UK*

**Keywords: brain injury, chronic, executive function, psychometric test, computerized test, neuropsychological assessment**

#### **A commentary on**

#### **Cooking breakfast after a brain injury**

*by Tanguay, A. N., Davidson, P. S., Guerrero Nuñez, K., and Ferland, M. B. (2014). Front. Behav. Neurosci. 8:272. doi: 10.3389/fnbeh.2014.00272*

Can a computer simulate the smell of burning toast? The paper by Tanguay et al. (2014) examined the utility of a computer simulated cooking task for people with an acquired brain injury. The paper highlights an important challenge in clinical neuropsychology—that of developing methods for testing everyday functioning without having to be in everyday situations.

But why is this important? There are already a great many questionnaires used to assess functional capacity after brain injury. However, it is now well recognized that people with brain injury may have impaired self-awareness and thus may not be able to provide an accurate self-assessment of their abilities (McBrinn et al., 2008; Caldwell et al., 2014). The reliability of third-party report from caregivers has also been questioned (Barker et al., 2011; McGuire et al., 2014). There is therefore a significant benefit in conducting assessment of real-life performance of functional tasks. Clinically, it is preferable to assess functional performance in the context within which the skills are to be applied, as this is the best way to know how a person will perform in a given task or situation. Tests such as the Multiple Errands Test, which is in essence a shopping task (Shallice and Burgess, 1991), and the Executive Function Performance Test (simple cooking, telephone use, medication management, and bill payment) (Baum et al., 2008) apply this principle capably, using the natural environment as the "laboratory" while also applying a degree of scientific rigor through the use of a standardized testing protocol.

However, the conduct of tests in their naturalistic environment poses a number of challenges for both clinicians and researchers. For example, there are very practical considerations such as the availability of a suitable environment in which to conduct testing. Evaluating the ability to negotiate the multiple aisles of a large supermarket may not be easy for those in rural areas where there may not be a large market within easy commuting distance. Being able to regulate a gas-operated stove, which is qualitatively different to cooking with an electric stove, will depend on the availability of gas in the area in which testing is being done. Assessing the ability to catch the right bus and alight at the correct stop poses logistical demands on the assessor. There are also possibly additional health and safety challenges associated with testing people with impaired abilities in the naturalistic environment. Arguably it is also more difficult to standardize the evaluation process in a situation where the very nature of the environment is that it is not standardized: buses run late, the products change in shop aisles, each cooker is a little bit different. These challenges highlight a tension between the benefits of ecologically valid testing and the practical difficulties this type of testing entails.

In this context, a small number of computerized tests have been developed to simulate functional tasks of everyday life such as working in an office environment (Lamberts et al., 2010) or a virtual version of the multiple errands task (Rand et al., 2009). The paper by Tanguay et al. is an example of the growing interest in harnessing the capabilities of computer technology to evaluate functional abilities. It is easy to understand the appeal of this approach. For example, standardized testing is much easier to achieve—the assessor determines the parameters to be tested; automated recording is possible, such as reaction time or time taken to achieve a task; the need to have an "actual" testing environment is removed; the testing can be done without any need for special planning so logistical demands are negated; environmental hazards are removed and it may also reduce the demand on therapist time.

However, the cost of developing these technologies is significant and there may still be a significant gap between the technology used in computer simulations and the scientific requirements associated with psychometric testing. It is also the case that computer simulations cannot fully replicate the uncertainties of everyday life. In the paper by Tanguay et al. the test demands focused on timing of food preparation and setting a table. However, what happens in a real kitchen is multisensorial: one will hear the microwave bing and the kettle whistling, smell the toast burning and look at the eggs to see if they are cooking evenly. It is understandable that there may be concern that the ecological validity of a test will be compromised if the test is not conducted in the relevant naturalistic environment. However, the rapid evolution of interactive computing such as that used in serious gaming and virtual reality applications points to the potential for exceptionally "life-like" testing environments, including 4-D simulations which can include a variety of sensory stimuli such as vibration, odors, and tactile components. Ultimately, the value of computerized simulations will be determined by the extent to which they can predict everyday functional performance in the real world. The study by Tanguay et al. has made a useful contribution to this field but also highlights the ongoing challenge to maximize the potential of computer simulations within the complex world of clinical practice.

# **REFERENCES**

Barker, L., Morton, N., Morrison, T. M., and McGuire, B. E. (2011). Inter-rater reliability of the Dysexecutive Questionnaire (DEX): comparative data from non-clinician respondents–all raters are not equal. *Brain Inj.* 25, 997–1004. doi: 10.3109/02699052.2011.597046


MET as an assessment tool for executive functions. *Neuropsychol. Rehabil.* 19, 583–602. doi: 10.1080/09602010802469074


**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 12 August 2014; accepted: 30 September 2014; published online: 22 October 2014.*

*Citation: McGuire BE (2014) Assessing complex executive functions with computerized tests: is that toast burning? Front. Behav. Neurosci. 8:362. doi: 10.3389/fnbeh.2014.00362*

*This article was submitted to the journal Frontiers in Behavioral Neuroscience.*

*Copyright © 2014 McGuire. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Virtual multiple errands test (VMET): a virtual reality-based tool to detect early executive functions deficit in Parkinson's disease

**Pietro Cipresso<sup>1</sup>\*, Giovanni Albani<sup>2</sup>, Silvia Serino<sup>1</sup>, Elisa Pedroli<sup>1</sup>, Federica Pallavicini<sup>1</sup>, Alessandro Mauro<sup>2</sup> and Giuseppe Riva<sup>1,3</sup>**

<sup>1</sup> Applied Technology for Neuro-Psychology Lab, IRCCS Istituto Auxologico Italiano, Milano, Italy

<sup>2</sup> Division of Neurology and Neurorehabilitation, IRCCS Istituto Auxologico Italiano, Oggebbio, Italy

<sup>3</sup> Department of Psychology, Università Cattolica del Sacro Cuore, Milano, Italy

#### **Edited by:**

Nuno Sousa, University of Minho, Portugal

#### **Reviewed by:**

Lynne Ann Barker, Sheffield Hallam University, UK Nicholas Morton, Rotherham, Doncaster and South Humber Mental Health NHS Foundation Trust, UK

#### **\*Correspondence:**

Pietro Cipresso, Applied Technology for Neuro-Psychology Lab, IRCCS Istituto Auxologico Italiano, Via Ariosto, 13, Milano, 20145, Italy e-mail: p.cipresso@auxologico.it

**Introduction**: Several recent studies have pointed out that early impairment of executive functions (EFs) in Parkinson's Disease (PD) may be a crucial marker to detect patients at risk for developing dementia. The main objective of this study was to compare the performances of PD patients with mild cognitive impairment (PD-MCI) with PD patients with normal cognition (PD-NC) and a control group (CG) using a traditional assessment of EFs and the Virtual Multiple Errands Test (VMET), a virtual reality (VR)-based tool. In order to understand which subcomponents of EFs are impaired early, this experimental study aimed to investigate specifically which instrument best discriminates among these three groups.

**Materials and methods**: The study included three groups of 15 individuals each (for a total of 45 participants): 15 PD-NC; 15 PD-MCI, and 15 cognitively healthy individuals (CG). To assess the global neuropsychological functioning and the EFs, several tests (including the Mini Mental State Examination (MMSE), Clock Drawing Test, and Tower of London test) were administered to the participants. The VMET was used for a more ecologically valid neuropsychological evaluation of EFs.

**Results**: Findings revealed significant differences in the VMET scores between the PD-NC patients and the controls. In particular, patients made more errors in the tasks of the VMET, and showed a poorer ability to use effective strategies to complete the tasks. The VMET therefore seems to be more sensitive in the early detection of executive deficits, because these two groups did not differ in the traditional assessment of EFs (neuropsychological battery).

**Conclusion**: This study offers initial evidence that a more ecologically valid evaluation of EFs is more likely to lead to detection of subtle executive deficits.

**Keywords: virtual reality, executive function, VMET, psychometric assessment, Parkinson's disease, mild cognitive impairment**

# **INTRODUCTION**

The umbrella term "executive function" (EF) refers to a broad set of high-level cognitive abilities used to regulate actions (Burgess and Simons, 2005; Chan et al., 2008; Otero and Barker, 2013). These cognitive abilities include the capacity to solve problems, plan, sustain attention, utilize internal/external feedback, multitask, think flexibly, and deal with novelty (Damasio, 1995; Stuss et al., 1995; Grafman and Litvan, 1999; Burgess et al., 2000; Miller and Cohen, 2001; Strauss et al., 2006; Stuss, 2007; Chan et al., 2008; Goldberg, 2009). Impairment of EF is extremely common in neurological patients, specifically in those presenting with frontal pathology (Bechara et al., 1994; Stuss et al., 1995; Burgess and Shallice, 1996a,b; Dreher et al., 2008; Barker et al., 2010; Morton and Barker, 2010; Cole et al., 2013). Although EFs are thought to be mediated by frontal brain regions, frontal areas have multiple connections with cortical and subcortical regions, as well as with the amygdala, cerebellum, and basal ganglia (for a review, see Tekin and Cummings, 2002). Specifically, functional magnetic resonance imaging (fMRI) studies have shown that BOLD signals increase in the basal ganglia during the performance of EF tasks which require cognitive flexibility, shifting of mental sets, and updating of working representations (Cools et al., 2004; Leber et al., 2008; Hikosaka and Isoda, 2010). Further evidence that the basal ganglia are part of the circuitry crucial for executive functioning comes from studies with patients with basal ganglia lesions, specifically patients who suffer from Parkinson's Disease (PD; Cools et al., 1984, 2001; McKinlay et al., 2010). Indeed, in addition to the typical motor signs, a number of different cognitive deficits have received relevant clinical attention in PD (Levy et al., 2002; Vingerhoets et al., 2003; Foltynie et al., 2004; Muslimović et al., 2005; Williams-Gray et al., 2009). 
The characteristics of cognitive impairment in PD may be extremely variable in regard to the timing of the onset and the rate of progression (Aarsland et al., 2005, 2007; Buter et al., 2008; Hely et al., 2008), and in terms of what cognitive functions are impaired (Verleden et al., 2007; Kehagia et al., 2010). Even if the neuropsychological profile of patients who suffer from PD is heterogeneous, including memory deficits (Whittington et al., 2006; Ramanan and Kumar, 2013) and visuo-spatial impairments (Montse et al., 2001; Kemps et al., 2005), it is marked specifically by executive deficits (Cools et al., 2001; McKinlay et al., 2010). Moreover, the impairment of EFs appears to be the core feature of a neuropsychological profile in PD-related dementia (Girotti et al., 1988; Jacobs et al., 1995; Padovani et al., 2006; Pagonabarraga and Kulisevsky, 2012; Kudlicka et al., 2013). Similar executive deficits also can be found in nondemented PD patients (for reviews, see Kudlicka et al., 2011; Ceravolo et al., 2012), but they are more severe in patients who suffer from dementia. Following this direction, several recent studies have pointed out the predictive value of early EF deficits in the transitional stage of mild cognitive impairment (MCI) of the disease (Levy et al., 2002; Woods and Tröster, 2003). The concept of MCI, originally introduced to identify the earliest cognitive changes due to Alzheimer's Disease (AD; Petersen et al., 2001; Petersen, 2004), has also been applied to PD to improve the detection of patients at risk for developing dementia (Aarsland et al., 2011). Litvan et al. (2012) proposed the diagnostic guidelines to facilitate the diagnosis of "mild cognitive impairment in Parkinson's Disease" (PD-MCI). 
These criteria are generally based on the established principles of MCI given by Petersen, namely, subjective cognitive decline and objective evidence of impairment assessed by neuropsychological evaluation that does not interfere with functional independence (Petersen, 2004). Similar to AD, the risk of developing dementia increases appreciably with the presence of PD-MCI (Janvin et al., 2006). As underlined by Biundo et al. (2013), a great challenge today is to characterize the neuropsychological profile of PD-MCI and to evaluate the screening power of traditional neuropsychological tests. In their work, 104 PD patients were given an extensive neuropsychological evaluation. Results showed that specific neuropsychological tests measuring attentional/set-shifting, verbal memory, and visual-spatial functions are the best predictors of PD-MCI. In this perspective, EF dysfunction is a possible marker of potentially more severe cognitive impairment and may indicate a likely decline into dementia. Similarly, Goldberg proposed that EF deficits are also key markers for later dementia in AD (Goldberg, 2009). Petrova et al. compared the performances of 23 patients suffering from amnestic PD-MCI with those of 25 cognitively healthy controls to investigate which subcomponents of EFs are impaired in PD-MCI patients (Petrova et al., 2010). The diagnosis of MCI was made according to modified criteria proposed by Petersen et al. (2001). They found that amnestic PD-MCI patients showed impairment in several aspects of attention/EFs, including the ability to inhibit irrelevant responses and in cognitive flexibility, as measured by the Stroop test (Stroop, 1935) and Modified Wisconsin Card Sorting Test (Nelson, 1976), in formulating and following a complex plan, as revealed by Trail Making Test (Greenlief et al., 1985), and in sustaining a cognitive load during a language test, as highlighted by the phonemic and semantic verbal fluency test (Lezak, 1995). 
These findings underline the need for a complex evaluation of EFs in MCI-PD patients, especially in the possible relationship between these early executive impairments and behavioral change.

Previous studies indicate a need for rigorous ecologically valid assessments that reliably capture subtle impairments that may be markers for later dementia. In fact, there are some critical issues in the traditional neuropsychological evaluation of EFs (Chan et al., 2008). A more ecological and prompt assessment of EFs is essential to evaluate the specific cognitive profile of different individuals (Goldstein, 1996; Chaytor and Schmitter-Edgecombe, 2003; Burgess et al., 2006). Indeed, the traditional evaluation does not reflect the complexity of EFs in everyday situations. A more detailed assessment may evaluate whether individuals are able to formulate, store, and check all the goals and subgoals in order to effectively respond to environmental and/or internal demands. In this direction, some instruments have been developed to measure executive deficits in situations similar to daily ones, such as the Behavioral Assessment of Dysexecutive Syndrome (BADS; Wilson et al., 1996) and the Multiple Errands Test (MET; Shallice and Burgess, 1991). The BADS (Wilson et al., 1996) consists of six subtests and a Dysexecutive Questionnaire (DEX). The DEX is designed to assess everyday cognitive, emotional, and behavioral changes, and it is completed by the patient (self-rating: DEX-S) and a person who knows the patient (independent rater: DEX-I). Although the BADS has good validity (Wilson et al., 1998), and the DEX was recently found to be, with some limitations, a useful instrument for capturing changes in day-to-day functioning (Barker et al., 2011), it does not measure performance during real-life tasks. An interesting example of a functional instrument is the MET (Shallice and Burgess, 1991), in which participants are asked to complete different tasks while adhering to specific rules within a specified time frame. 
Even the simplified versions of the MET, however, adapted especially to be performed in a hospital setting or a nearby shopping mall (Alderman et al., 2003), can be particularly demanding for a patient because these versions require good motor skills; for a clinician these versions are time consuming and demand high economic costs.

To address the issue of ecological validity and clinical utility, virtual reality (VR) appears to be an appropriate instrument for the evaluation of EFs because it provides the chance to deliver different tasks within ecologically valid, controlled, and secure environments (for a review, see Bohil et al., 2011). Based on this, the virtual version of the Multiple Errands Test (VMET) has been recently developed and tested in different clinical populations (Albani et al., 2011; Raspelli et al., 2012; Cipresso et al., 2013a). The VMET is a VR-based tool aimed at evaluating different aspects of EFs by enabling active exploration of a virtual supermarket, where participants are requested to buy various products presented on shelves and to abide by different rules. Thanks to the potential of VR, the VMET makes it possible to evaluate the real functional status of patients, as manifested in executive dysfunctions that had not been fully acknowledged in laboratory tests. Specifically, the VMET measures a patient's ability to formulate, store, and check all the goals and subgoals to effectively respond to environmental demands in ecological situations and to complete the specified task. The VMET has demonstrated good inter-rater reliability, showing an intraclass correlation coefficient (ICC) of 0.88 (Cipresso et al., 2013b), and good usability (Pedroli et al., 2013). The test has also been shown to be usable with patients who are not familiar with computerized tests. On the basis of these methodological strengths, we argue that the VMET may significantly improve the traditional assessment of EFs in PD-MCI patients.

The main objective of this study is to compare the performance of PD-MCI patients with that of PD patients with normal cognition and of cognitively healthy controls, using traditional assessments of EFs and the VMET. In order to understand which subcomponents of EFs are impaired early, this experimental study specifically aimed to investigate which instruments best discriminate among these three groups.

# **MATERIALS AND METHODS**

#### **PARTICIPANTS**

A total of 45 participants allocated to three groups were included in the study: 15 PD patients with normal cognition (PD-NC), 15 PD patients suffering from MCI, and 15 cognitively healthy individuals (CG, control group). The PD-NC group was composed of six women (40%) and nine men (60%), while the PD-MCI and the CG included seven women (46.7%) and eight men (53.3%) and nine women (60%) and six men (40%), respectively. CG and PD patients were recruited from the San Giuseppe Hospital's Istituto Auxologico Italiano in Verbania, Italy. Individuals did not receive money for their participation in the study. Detailed demographic and clinical characteristics of the three groups are reported in **Table 1**. Individuals gave their written consent for the procedures, which were approved by the Ethical Committee of the Istituto Auxologico Italiano.

**Table 1 | Demographic characteristics of the three groups of the study: PD patients with normal cognition (PD-NC), PD patients suffering from mild cognitive impairment (PD-MCI), and healthy individuals (CG, control group)**.


#### **NEUROPSYCHOLOGICAL GLOBAL ASSESSMENT AND PARKINSON'S DISEASE CLASSIFICATION**

PD patients were classified into the two cognitive groups (PD-NC and PD-MCI), following the guidelines of the Task Force for the diagnosis of PD-MCI (Litvan et al., 2012). The proposed PD-MCI criteria utilize a two-level schema depending on the comprehensiveness of the neuropsychological testing. The Level I and II categories both represent PD-MCI, but they differ in regard to the type of neuropsychological assessment and, consequently, the level of diagnostic certainty. Specifically, for the diagnosis of PD-MCI by Level II criteria, the Task Force recommends comprehensive neuropsychological testing that highlights either two impaired tests in one cognitive domain or one impaired test in two different cognitive domains. For the division of PD patients into PD-NC and PD-MCI (Level II), a comprehensive neuropsychological battery with at least two neuropsychological tests per cognitive domain was employed. First, to evaluate the cognitive functioning of the participants in the study, the Mini Mental State Examination (MMSE; Folstein et al., 1975) was administered. The MMSE is a brief questionnaire widely used to obtain a picture of an individual's present cognitive performance in different cognitive domains (short- and long-term memory, orientation, attention, verbal fluency, and constructional apraxia). A score of <24 is the generally accepted cutoff indicating the presence of cognitive impairment. The MMSE has been validated in an Italian sample of 1019 elderly subjects (aged 65–89 years) (Magni et al., 1996).
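The Level II impairment pattern described above can be sketched as a small decision rule. This is only an illustrative sketch: the function name, the domain labels, and the test names are hypothetical, the cutoff that makes a single test count as "impaired" is not specified here, and the full Task Force criteria include further requirements that this sketch does not cover.

```python
from collections import Counter


def meets_level_ii_pattern(impaired_tests):
    """Check the impairment pattern described in the text.

    impaired_tests: list of (domain, test_name) tuples, one per test that
    was scored as impaired (how "impaired" is decided is an assumption
    left outside this sketch).

    Returns True when at least two impaired tests fall in one cognitive
    domain, or at least one impaired test falls in each of two different
    domains.
    """
    per_domain = Counter(domain for domain, _ in impaired_tests)
    two_in_one_domain = any(count >= 2 for count in per_domain.values())
    one_in_two_domains = len(per_domain) >= 2
    return two_in_one_domain or one_in_two_domains


# Hypothetical examples (domain labels and test names are illustrative only).
same_domain = [("memory", "Digit Span"), ("memory", "Short Story")]
two_domains = [("memory", "Digit Span"), ("language", "Token test")]
single_test = [("memory", "Digit Span")]
```

Read this way, the rule amounts to requiring at least two impaired tests overall, which is why the battery needs at least two tests per cognitive domain.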

To evaluate the visuo-spatial function, the Behavioral Inattention Test (BIT; Wilson et al., 1987) was used. The BIT is traditionally used to screen for neglect behaviors, and it consists of six conventional pencil and paper subtests and nine behavioral subtests reflecting aspects of daily life. In the present study, the Italian validation of the BIT's conventional subtests was administered (Wilson et al., 2010): line crossing, letter cancellation, star cancellation, figure and shape copying, line bisection, and representational drawing. The maximum total score is 146 points.

To assess language comprehension abilities, the Token test was administered within the brief neuropsychological examination (Mondini et al., 2003). This is a simple test which requires 20 tokens that vary in shape, color, and size. The Italian validated test has 32 commands, each of which requires the attention and/or the manipulation of one or more of the tokens (e.g., "Put the small red square under the white large circle.").

The Italian validated Digit Span was used to evaluate short-term memory abilities (Orsini et al., 1987). In this easy-to-administer test, the researcher reads a series of digits aloud to the participant, who is requested to repeat back the same series of digits in the same sequence (e.g., 9–1–7 for 9–1–7). To assess long-term memory abilities, the Short Story test (Novelli et al., 1986a) was administered. The researcher read aloud the Short Story, required participants to provide a first immediate recall, then read aloud the story again, requesting another immediate recall. After a delay of around 15 min, participants were asked for a delayed retrieval. The final score is the average of the number of correctly recalled morphological units over three recall trials.



In order to specifically evaluate the spatial memory abilities of the study's participants, the following standard neuropsychological test was administered: the Corsi Block Test (Corsi, Unpublished Thesis; Spinnler and Tognoni, 1987). This task is used to measure short-term spatial memory (Corsi Span) and long-term spatial memory (Corsi Supraspan). The participants are invited to tap a sequence of wooden blocks in the same order as the researcher, with increasing span length on each trial.

Neuropsychological data for the three groups are reported in **Table 2**.

All scores obtained from these neuropsychological tests have been corrected for age, education level, and gender, according to Italian normative data.

#### **EXECUTIVE FUNCTIONS EVALUATION**

In order to fully evaluate the executive functioning of the study participants, a comprehensive standard neuropsychological battery focused on the different aspects of EF was administered.

The Clock Drawing Test (Freedman et al., 1994; Caffarra et al., 2011) has been traditionally used to assess a wide range of cognitive abilities including EFs, specifically understanding verbal instructions and abstract thinking and planning abilities. It is brief, easy to administer, and has excellent patient acceptability. Participants were required to draw numbers in a circle on a paper to resemble a clock and then draw the hands of the clock to read "10 after 11".

To evaluate multi-tasking and cognitive flexibility, two types of verbal fluency tests were employed. Phonological verbal fluency (Novelli et al., 1986b; Lezak, 1995) is a traditional neuropsychological measure of language production in which the number of words produced with a given initial letter (e.g., F) is evaluated. Semantic verbal fluency (Novelli et al., 1986b; Lezak, 1995) is a more complex traditional neuropsychological measure of language production in which the number of words in a specific category (e.g., animals) produced in 60 s is evaluated. Both tests require participants to use executive processes because an efficient and creative organization of the retrieved verbal material, as well as the inhibition of responses when appropriate, is crucial.

To specifically detect early deficits in problem-solving and planning, the Tower of London test (Shallice, 1982; Fancello et al., 2006) was administered. The researcher explained the rules of the task (e.g., don't make more moves than necessary), and then used one tower with three rods of descending heights and a set of beads to display the desired goal: Participants are invited to rearrange the set of beads on the tower to match the examiner's configuration.

#### **THE VIRTUAL MULTIPLE ERRANDS TEST (VMET)**

The VMET consists of a Blender-based application that enables the active exploration of a virtual supermarket, where participants are requested to select and buy various products presented on shelves. From a technical point of view, the VMET was created with the software NeuroVR<sup>1</sup> (Riva et al., 2011), a free virtual-reality platform for creating virtual environments useful for neuropsychological assessment and neurorehabilitation. NeuroVR is software that allows nonexpert users to adapt the content of several virtual environments to the specific needs of the clinical and research setting. Thanks to the NeuroVR Player, it is possible to visualize virtual environments: The user enters the virtual supermarket, and he/she is presented with virtual objects of the various items to be purchased. Each virtual object has been inserted through the NeuroVR Editor, which offers a rich database of 2D and 3D objects; these can be easily placed into the predesigned virtual scenario by using an icon-based interface. Using a joystick, the participant is able to freely navigate the various aisles (using the up-down joystick arrows) and to collect products (by pressing a button placed on the right side of the joystick), after having selected them with the viewfinder. After an initial training phase with a smaller supermarket, the user enters the virtual supermarket and is presented with virtual objects of the various items to be purchased (**Figure 1**).

The virtual supermarket contains products grouped into the main grocery categories, including beverages, fruits and vegetables, breakfast foods, hygiene products, frozen foods, garden products, and pet products. Signs at the top of each section indicate the product categories as an aid for navigation.

Participants are also given a shopping list, a map of the supermarket, some information about the supermarket (opening and closing times, products on sale, etc.), a pen, a wrist watch, and the instruction sheet. The instructions are fully illustrated for the participants, and the rules are explained with precise reference to the instruction sheet. The VMET is composed of four main tasks. The first involves purchasing six items (e.g., one product on sale). The second involves asking the examiner for information about one item to be purchased. The third involves writing the shopping list 5 min after beginning the test. The fourth involves responding to some questions at the end of the virtual session by using useful materials (e.g., the closing time of the virtual supermarket). To complete the task, participants have to follow several rules: (1) they have to execute all the proposed tasks; (2) they can execute all the tasks in any order; (3) they cannot go to a place unless it is a part of a task; (4) they cannot pass through the same passage more than once; (5) they cannot buy more than two items per category (look at the chart); (6) they have to take as little time as possible to complete the exercise; (7) they cannot talk to the researcher unless this is a part of the task; and (8) they have to go to their "shopping cart" 5 min after the beginning of the task and make a list of all their products. The time is stopped when the participant says, "I finished."

<sup>1</sup>www.neurovr.org

During the task, the examiner takes notes on the participant's behaviors in the virtual environment. As suggested by Shallice and Burgess (1991), the following errors were recorded (see also the VMET validation procedure in Raspelli et al., 2012): task failures, inefficiencies, strategies, rule breaks, and interpretation failures. A task failure occurs when a subtask is not completed satisfactorily; for example, the first task required participants to purchase six items, so it was composed of six subtasks. The scoring scale for each task failure was from 1 to 3 (1 = the participant performed the task 100% correctly as indicated by the test; 2 = the participant performed aspects of the task but did not complete it 100% accurately; 3 = the participant totally omitted the task). The total score for errors in executing the tasks therefore ranged from 11 (the participant correctly completed all 11 subtasks) to 33 (the participant totally omitted all 11 subtasks). An inefficiency occurs when a more effective strategy could have been applied to accomplish the task. An example of the eight inefficiencies is not grouping similar tasks when it is possible. The scoring scale for each inefficiency was from 1 to 4 (1 = always; 2 = more than once; 3 = once; 4 = never), so the total score ranged from 8 (many inefficiencies) to 32 (no inefficiencies). To measure the participant's ability to use effective strategies that facilitate carrying out the tasks, 13 possible strategies were evaluated. An example of a good strategy is planning accurately before starting a specific subtask. The scoring scale for each strategy was from 1 to 4 (1 = always; 2 = more than once; 3 = once; 4 = never), and the total score ranged from 13 (good strategies) to 52 (no strategies). A rule break occurs when one of the eight rules listed in the instruction sheet has been violated (e.g., talking with the examiner when not necessary). The scoring scale for each rule break was from 1 to 4 (1 = always; 2 = more than once; 3 = once; 4 = never), and the total score ranged from 8 (a large number of rule breaks) to 32 (no rule breaks). Finally, an interpretation failure occurs when the requirements of a particular task are misunderstood; for example, when a participant thinks that the subtasks all have to be done in the order presented in the information sheet. The scoring scale for each interpretation failure was from 1 to 2 (1 = yes; 2 = no), and the total score ranged from 3 (a large number of interpretation failures) to 6 (no interpretation failures).
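The VMET scoring scheme described above can be summarized in a short sketch that derives each total-score range from the number of items and the per-item scale. The dictionary keys and function names are hypothetical, and the count of three interpretation-failure items is an inference from the stated 3 to 6 range with a 1 to 2 scale, not a figure given explicitly in the text.

```python
# (number of items, lowest item score, highest item score) per VMET variable,
# taken from the description in the text. The 3 interpretation-failure items
# are inferred from the reported 3-6 range (assumption).
VMET_SCALES = {
    "task_failures": (11, 1, 3),
    "inefficiencies": (8, 1, 4),
    "strategies": (13, 1, 4),
    "rule_breaks": (8, 1, 4),
    "interpretation_failures": (3, 1, 2),
}


def score_range(variable):
    """Return the (minimum, maximum) total score for a VMET variable."""
    n_items, low, high = VMET_SCALES[variable]
    return n_items * low, n_items * high


def total_score(variable, item_scores):
    """Sum the item scores after checking their count and their scale."""
    n_items, low, high = VMET_SCALES[variable]
    if len(item_scores) != n_items:
        raise ValueError(f"{variable} expects {n_items} item scores")
    if any(not (low <= score <= high) for score in item_scores):
        raise ValueError(f"{variable} item scores must lie in {low}..{high}")
    return sum(item_scores)
```

Note that the direction of the totals differs across variables: for task failures and strategies a lower total is better, whereas for inefficiencies, rule breaks, and interpretation failures a higher total is better. This matters when reading the group means in the Results.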

### **PROCEDURE**

After participants gave written informed consent to participate, they underwent a neuropsychological global assessment; this was done in order to obtain an accurate overview of their cognitive function and to split the PD sample according to the guidelines of the Task Force for the diagnosis of PD-MCI (Litvan et al., 2012). Then, all participants were required to complete the executive functions evaluation. At the beginning of the experimental session, participants were asked to sit at a desk in front of a computer monitor to complete the VMET. The VMET was rendered using a portable computer (Intel Core 2 Duo with an OpenGL-compatible graphics board and 256 MB of video memory; the operating system was Microsoft Windows XP). Participants also had a gamepad (Logitech Rumble F510), which allowed them to explore and interact with the environment. A training period of about 15 min was first provided in a smaller version of the virtual supermarket environment in order to familiarize participants with the navigation and shopping tasks. After this training session, participants were asked to complete the VMET procedure.

# **RESULTS**

Data were entered into Microsoft Excel and analyzed using SPSS version 18 (Statistical Package for the Social Sciences–SPSS for Windows, Chicago, IL, USA). To investigate differences in EFs and VMET scores between groups (CG vs. PD-NC vs. PD-MCI), a series of analyses of variance (ANOVAs) was computed. *Post hoc* tests (with Bonferroni's adjustment) were carried out to compare significant differences. The level of significance was set at α = 0.05.
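For readers who want to see the analysis logic spelled out, the following is a minimal sketch of a one-way ANOVA *F* statistic and a Bonferroni adjustment in plain Python. It does not reproduce the SPSS analysis: the function names are hypothetical, the toy scores are invented purely for illustration, and obtaining *p* values from *F* would require a statistical library that is not used here.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of score lists."""
    all_scores = [x for group in groups for x in group]
    grand_mean = sum(all_scores) / len(all_scores)
    means = [sum(group) / len(group) for group in groups]

    # Between-groups sum of squares: group size times squared deviation
    # of each group mean from the grand mean.
    ss_between = sum(
        len(group) * (mean - grand_mean) ** 2
        for group, mean in zip(groups, means)
    )
    # Within-groups sum of squares: squared deviations from own group mean.
    ss_within = sum(
        (x - mean) ** 2 for group, mean in zip(groups, means) for x in group
    )

    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


def bonferroni(p_values):
    """Multiply each p value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Invented toy data for three groups (NOT the study data).
toy_groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_value, df_between, df_within = one_way_anova(toy_groups)
```

With three groups and 45 participants, the same degrees-of-freedom formula gives (2, 42), which matches the *F*(2,42) values reported when no participant is excluded.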

#### **EXECUTIVE FUNCTION SCORES**

In order to investigate differences in the neuropsychological evaluation of EFs, a series of analyses of variance was computed with *groups* (CG vs. PD-NC vs. PD-MCI) as the between-subjects variable. Five participants (three from the PD-MCI group and two from the PD-NC group) were not included in the Clock Drawing Test analyses because of errors in score recording. Moreover, one patient from the PD-MCI group was excluded from the analyses of the phonological and semantic verbal fluency tests.

Regarding the Clock Drawing Test, results showed significant differences between groups [*F*(2,37) = 9.82, *p* < 0.001, η<sub>p</sub><sup>2</sup> = 0.347]. In particular, *post hoc* comparisons indicated that PD-MCI patients performed significantly worse (*M* = 7.62, *SD* = 2.25) when compared with the CG (*M* = 9.83, *SD* = 0.224, *p* < 0.001) and with the PD-NC group (*M* = 9.3, *SD* = 0.804, *p* < 0.01).

In regard to the phonological verbal fluency test, findings showed significant differences between groups [*F*(2,41) = 34.7, *p* < 0.001, η<sub>p</sub><sup>2</sup> = 0.629]. *Post hoc* comparisons demonstrated that the CG performed significantly better (*M* = 50.1, *SD* = 8.55) when compared with the PD-MCI (*M* = 22.7, *SD* = 10.1, *p* < 0.001) and PD-NC groups (*M* = 32.9, *SD* = 8.55, *p* < 0.001). Moreover, mean scores of the PD-NC group were significantly higher (*p* < 0.05) when compared with those of the PD-MCI group.

In regard to the semantic verbal fluency test, the one-way ANOVA showed significant differences between groups [*F*(2,41) = 21.8, *p* < 0.001, η<sub>p</sub><sup>2</sup> = 0.516]. In particular, *post hoc* comparisons revealed that the CG (*M* = 53.4, *SD* = 7.8) performed significantly better when compared with the PD-NC (*M* = 43.5, *SD* = 9.2, *p* < 0.01) and the PD-MCI group (*M* = 33.2, *SD* = 8.01, *p* < 0.001). More interestingly, findings showed that the PD-MCI group performed significantly worse (*p* < 0.01) than the PD-NC group.

Finally, the analysis conducted on the Tower of London test revealed significant differences between groups [*F*(2,42) = 16.5, *p* < 0.001, η<sub>p</sub><sup>2</sup> = 0.441]. *Post hoc* comparisons showed that the PD-MCI group performed significantly worse (*M* = 15.5, *SD* = 5.01) than the CG (*M* = 26.4, *SD* = 4.64, *p* < 0.001) and the PD-NC group (*M* = 23, *SD* = 5.01, *p* < 0.001).

**Table 3** summarizes the main results.

#### **VMET SCORES**

In order to investigate differences in VMET scores, a series of analyses of variance was computed with *groups* (CG vs. PD-NC vs. PD-MCI) as the between-subjects variable. First, in regard to the time needed for each participant to complete the task, the analysis showed significant differences between groups [*F*(2,42) = 3.83, *p* < 0.05, η<sub>p</sub><sup>2</sup> = 0.154]. In particular, *post hoc* analyses indicated that the PD-MCI group took significantly more time (*M* = 1223, *SD* = 579, *p* < 0.05) compared with the CG (*M* = 727, *SD* = 308).

Concerning task failures, results showed significant differences between groups [*F*(2,42) = 20.2, *p* < 0.001, η<sub>p</sub><sup>2</sup> = 0.491]. In particular, *post hoc* comparisons indicated that the CG performed significantly better (*M* = 14.3, *SD* = 2.32) when compared with the PD-NC group (*M* = 22.3, *SD* = 4.25, *p* < 0.001) and the PD-MCI group (*M* = 22.8, *SD* = 5.22, *p* < 0.001).

Regarding inefficiencies, findings revealed significant differences between groups [*F*(2,42) = 3.58, *p* < 0.05, η<sub>p</sub><sup>2</sup> = 0.146]. *Post hoc* comparisons indicated that the PD-MCI group performed significantly worse (*M* = 18.6, *SD* = 4.03, *p* < 0.05) with respect to the CG (*M* = 24.2, *SD* = 8.18).

Results also showed significant differences between groups in the strategies [*F*(2,42) = 9.82, *p* < 0.001, η<sub>p</sub><sup>2</sup> = 0.319]. In particular, *post hoc* comparisons indicated that the CG used significantly more effective strategies (*M* = 32.2, *SD* = 5.3) when compared with the PD-NC group (*M* = 40.5, *SD* = 8.69, *p* < 0.01) and the PD-MCI group (*M* = 43.6, *SD* = 7.42, *p* < 0.001).

Finally, no significant differences between groups were found in rule breaks or in interpretation failures. Results are summarized in **Table 4**.
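The reported effect sizes can be cross-checked against the reported *F* values and degrees of freedom using the standard identity for partial eta squared, η<sub>p</sub><sup>2</sup> = (*F* · df<sub>effect</sub>) / (*F* · df<sub>effect</sub> + df<sub>error</sub>). The sketch below is only a consistency check on the figures quoted in the text; the function name, the labels, and the tolerance are assumptions, and small discrepancies are expected because the published *F* values are rounded.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its df."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (label, F, df_effect, df_error, reported partial eta squared),
# copied from the results reported in the text.
REPORTED = [
    ("Clock Drawing Test", 9.82, 2, 37, 0.347),
    ("Phonological fluency", 34.7, 2, 41, 0.629),
    ("Semantic fluency", 21.8, 2, 41, 0.516),
    ("Tower of London", 16.5, 2, 42, 0.441),
    ("VMET time", 3.83, 2, 42, 0.154),
    ("VMET task failures", 20.2, 2, 42, 0.491),
    ("VMET inefficiencies", 3.58, 2, 42, 0.146),
    ("VMET strategies", 9.82, 2, 42, 0.319),
]


def check_reported(tolerance=0.002):
    """Map each label to (recomputed value, within-tolerance flag)."""
    results = {}
    for label, f_value, df_effect, df_error, reported in REPORTED:
        recomputed = partial_eta_squared(f_value, df_effect, df_error)
        results[label] = (recomputed, abs(recomputed - reported) < tolerance)
    return results
```

Under these assumptions, every recomputed value agrees with the reported one to within rounding. In a one-way design such as this one, partial eta squared coincides with eta squared, since there is only one effect in the model.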

#### **DISCUSSION AND CONCLUSION**

Because cognitive impairment is a common complication of PD and is associated with significant disability for patients and a burden for caregivers, it is crucial to fully investigate the distinguishing features of the neuropsychological profile in this clinical population (Aarsland et al., 1999, 2000; Schrag et al., 2000). As PD progresses, a relevant proportion of patients will develop dementia (Aarsland et al., 2003; Bosboom et al., 2004; Hely et al., 2008). Specifically, Aarsland et al. (2005) found that more than 30% of PD patients have dementia. The focus now is therefore to identify patients with a potentially higher risk of dementia, so that an early and individualized cognitive rehabilitation treatment can be implemented to improve their quality of life.

**Table 3 | One-way ANOVA results of mean scores obtained by participants divided into the three groups at the EF tasks.**


Values are shown as mean (SD).

p values: \*\*\* <0.001, \*\* <0.01, \* <0.05, N.S. = Nonsignificant.


**Table 4 | One-way ANOVA results of mean scores obtained by participants divided into the three groups at the VMET.**

Values are shown as mean (SD).

p values: \*\*\* <0.001, \*\* <0.01, \* <0.05, N.S. = Nonsignificant.

Particularly, an increasing number of studies have suggested that the executive deficits in PD are predictive of the conversion to dementia (Levy et al., 2002; Woods and Tröster, 2003).

On these premises, the main objective of this study was to investigate the potential of the VMET to complement the traditional neuropsychological evaluation of EFs in PD with a more ecologically valid evaluation. This study offers initial evidence that a more ecologically valid evaluation of EFs is more likely to lead to detection of subtle executive deficits in PD patients. Specifically, the VMET seems to capture the early executive dysfunctions of PD-NC patients, whereas these patients did not differ from the CG in the traditional assessment of EFs.

First, although some recent reviews suggested that executive deficits are present in the early stage of PD (Kudlicka et al., 2011; Ceravolo et al., 2012), our results showed that PD-NC patients were not impaired in the traditional neuropsychological evaluation of EFs when compared with the CG. In fact, in their review, Kudlicka et al. (2011) underlined that studies on EFs in PD are marked by a general lack of clarity in regard to the selection of measures and their clinical interpretation. It is crucial to acknowledge the possibility that different results across studies might reflect the different tests used and the underlying functions that the tests are thought to capture. It is therefore crucial to fully understand which subcomponents of EFs are impaired early in this population. In this direction, Kudlicka et al. (2013) used a data-driven approach to investigate which areas of EF are particularly deficient in 34 patients with PD. Results showed that the impairment was more profound in tests requiring time-efficient attentional control; for example, the Trail Making Test (Tombaugh, 2004).

Our findings showed a significant difference between PD-NC and cognitively healthy participants only in semantic and phonological verbal fluency. As previously explained, verbal fluency tests measure several EF components, including set-switching, strategy generation, and rule attainment, along with other non-EF components such as semantic memory and verbal lexicon. Our results are consistent with a meta-analysis that reports verbal fluency deficits in PD (Henry and Crawford, 2004). Specifically, Henry and Crawford (2004) found that PD patients were significantly more impaired in semantic fluency, concluding that this deficit may be associated not only with a problem in executive functioning but also with an initial disorder in semantic memory (namely, concept-based knowledge). Also, in an interesting study with 88 PD patients and 65 healthy participants, Koerts et al. (2013) pointed out that verbal fluency deficits can be interpreted in light of the progression of the disease and the dysfunctions in other cognitive domains. Performance in the verbal fluency tests is explained by psychomotor speed in the mild stage of PD, while cognitive flexibility accounts for deficits in those tests in the moderate phases of the disease.

Concerning the VMET, as previously indicated, our main findings revealed significant differences in some VMET scores between the PD-NC and the cognitively healthy participants. Specifically, among all VMET scores, it is interesting to note a significant difference in task failures and strategies between these two groups. PD-NC patients, compared with cognitively healthy controls, made a greater number of errors in completing the subtasks of the VMET. Furthermore, compared with the CG, PD-NC patients showed poorer ability in using effective strategies that facilitate the carrying out of the tasks; for example, accurate planning before starting a specific subtask or using the map for navigating the virtual supermarket. These executive deficits may reflect a specific deficit in cognitive flexibility; namely, the ability with which a person's conceptualization changes selectively to effectively respond to external/internal stimulation. This may also explain why there is no significant difference between PD-NC and PD-MCI in these VMET scores. Indeed, to discriminate between PD-MCI and PD-NC, it is important to follow the recent guidelines of the Task Force (Litvan et al., 2012). Our findings thus confirm that the traditional assessment of EFs appears to be more useful for detecting differences in EFs between these two cognitive groups.

In conclusion, our results showed that the VMET appears to be sensitive in evaluating the functional status of PD patients with normal cognition, as manifested in terms of executive deficits that had not been fully acknowledged by traditional neuropsychological evaluations. The VMET makes it possible to evaluate some subcomponents of EFs in ecological settings, giving a more accurate estimate of patients' deficits that are difficult to detect with traditional tests.

As previously explained, one of the most crucial criticisms of the neuropsychological tests is the lack of ecological validity (Goldstein, 1996; Chaytor and Schmitter-Edgecombe, 2003; Chan et al., 2008). Even though patients with supposed executive deficits may perform as well as controls on traditional neuropsychological tests, they may experience difficulties in real world situations. VR may be used to offer a new human-computer interaction paradigm in which patients are active participants within an ecological virtual world (Riva, 2009). In virtual tasks such as the VMET, it is possible to simulate life-like challenges, which require a more complex series of goals to achieve and the cognitive flexibility to elaborate different strategies to accomplish them and to inhibit inappropriate actions.

Our results may also represent a theoretical contribution to the attempt to isolate the specific subcomponents of EF. Most of the traditional neuropsychological tests measure one specific EF component, but they do not reflect a true picture of a patient's functional status. According to different theories, however, EF is best conceptualized as a system of interconnected processes guided necessarily by a central supervisor system to facilitate goal-oriented behavior (Luria, 1966; Norman and Shallice, 1986; Miller and Cohen, 2001; Miller et al., 2002). Our findings help to emphasize the idea that a breakdown in the executive control mechanisms is reflected in deficits in many multitasking behaviors, such as effective planning and strategy allocation and monitoring.

The findings of this study are interesting and valuable, but there are some limitations. First, the small sample size of 45 participants may limit the generalizability of the results. The sample, however, was carefully evaluated with a comprehensive neuropsychological assessment according to the criteria established by Litvan et al. (2012). Second, considering the use of computerized tests for PD patients with motor deficits, it would be important to also assess the individual's perception of VMET usability (for example, difficulties during the experience in using the joystick, selecting products from aisles, and learning to move in the supermarket). As explained above, a recent study showed good usability of this virtual instrument (Pedroli et al., 2013). The performance on the VMET, however, must be read with consideration of the motor deficit. A final limitation of our study is the difference between the PD groups and the CG in terms of years of education. All scores obtained from neuropsychological tests were corrected for education level according to Italian normative data, but the results from the VMET must be viewed in light of this potential limitation. A future challenge is to explore the relative impact of age, gender, and education on VMET scores: for example, in an interesting work by Boone (1999), it was found that the impact of educational level and gender was limited to some Wisconsin Card Sorting Test scores. Further studies are needed to evaluate the potential of the VMET, especially in terms of its temporal stability (namely, test–retest reliability) and its criterion validity for PD. This study, however, provides initial evidence that a more ecological evaluation of EFs may make it possible to also detect subtle executive deficits in PD-NC patients.

All participants' data were stored in encrypted and password-protected files, following the criteria for protecting personal health information (El Emam et al., 2011), and the PsychoPass method (Cipresso et al., 2012) was used to generate and share password information among colleagues.

#### **AUTHORS' CONTRIBUTION**

Conceived and designed the experiments: Pietro Cipresso, Giovanni Albani, Silvia Serino, Alessandro Mauro, Giuseppe Riva. Performed the experiments: Elisa Pedroli. Analyzed the data: Pietro Cipresso, Silvia Serino, Federica Pallavicini. Wrote the first version of the paper: Silvia Serino. Revised and contributed to the last version of the paper: Pietro Cipresso, Giovanni Albani, Silvia Serino, Elisa Pedroli, Federica Pallavicini, Alessandro Mauro, Giuseppe Riva.

# **ACKNOWLEDGMENTS**

This study was supported by the Italian funded project "VRehab. Virtual Reality in the Assessment and TeleRehabilitation of Parkinson's Disease and Post-Stroke Disabilities" (RF-2009-1472190).

The authors are grateful to the anonymous Reviewers for their advice and suggestions, many of which were used in the final version, making the paper better than the early version.

# **REFERENCES**


**Conflict of Interest Statement**: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 17 May 2014; accepted: 07 November 2014; published online: 05 December 2014*.

*Citation: Cipresso P, Albani G, Serino S, Pedroli E, Pallavicini F, Mauro A and Riva G (2014) Virtual multiple errands test (VMET): a virtual reality-based tool to detect early executive functions deficit in Parkinson's disease. Front. Behav. Neurosci. 8:405. doi: 10.3389/fnbeh.2014.00405*

*This article was submitted to the journal Frontiers in Behavioral Neuroscience*.

*Copyright © 2014 Cipresso, Albani, Serino, Pedroli, Pallavicini, Mauro and Riva. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution and reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms*.

# Impaired self-awareness after traumatic brain injury: inter-rater reliability and factor structure of the Dysexecutive Questionnaire (DEX) in patients, significant others and clinicians

**Brian E. McGuire<sup>1</sup>\*, Todd G. Morrison<sup>2</sup>, Lynne A. Barker<sup>3</sup>, Nicholas Morton<sup>4</sup>, Judith McBrinn<sup>1</sup>, Sheena Caldwell<sup>5</sup>, Colin F. Wilson<sup>5</sup>, John McCann<sup>5</sup>, Simone Carton<sup>6</sup>, Mark Delargy<sup>6</sup> and Jane Walsh<sup>1</sup>**

<sup>1</sup> School of Psychology, National University of Ireland Galway, Galway, Ireland

<sup>2</sup> Department of Psychology, University of Saskatchewan, Saskatoon, SK, Canada

<sup>3</sup> Department of Psychology, Sheffield Hallam University, Sheffield, UK

<sup>4</sup> Neurorehabilitation Services, Doncaster Rotherham and South Humber NHS Foundation Trust, Doncaster, UK

<sup>5</sup> Regional Acquired Brain Injury Unit, Musgrave Park Hospital, Belfast, UK

<sup>6</sup> National Rehabilitation Hospital, Dublin, Ireland

#### **Edited by:**

Regina M. Sullivan, Nathan Kline Institute and NYU School of Medicine, USA

#### **Reviewed by:**

Stefano Sensi, University of California, Irvine, USA

Donal Gerard Fortune, Health Service Executive, Ireland

#### **\*Correspondence:**

Brian E. McGuire, School of Psychology, National University of Ireland Galway, Galway, Ireland

e-mail: Brian.mcguire@nuigalway.ie

**Aims**: This study sought to address two questions: (1) what is the inter-rater reliability of the Dysexecutive Questionnaire (DEX) when completed by patients, their significant others, and clinicians; and (2) does the factor structure of the DEX vary for these three groups?

**Methods**: We obtained DEX ratings for 113 patients with an acquired brain injury from two brain injury services in the UK and two services in Ireland. We gathered data from two groups of raters—"significant others" (DEX-SO) such as partners and close family members and "clinicians" (DEX-C), who were psychologists or rehabilitation physicians working closely with the patient and who were able to provide an opinion about the patient's level of everyday executive functioning. Intra-class correlation coefficients and their 95% confidence intervals were calculated between each of the three groups (self, significant other, clinician). Principal axis factor (PAF) analyses were also conducted for each of the three groups.

**Results**: The factor analysis revealed a consistent one-factor model for each of the three groups of raters. However, the inter-rater reliability analyses showed a low level of agreement between the self-ratings and the ratings of the two groups of independent raters. We also found low agreement between the significant others and the clinicians.

**Conclusion**: Although there was a consistent finding of a single factor solution for each of the three groups, the low level of agreement between significant others and clinicians raises a question about the reliability of the DEX.

**Keywords: brain injury, dysexecutive, reliability, factor analysis, care giver**

#### **INTRODUCTION**

Impaired self-awareness is a common cognitive deficit after traumatic brain injury and can lead to problems with self-monitoring and behavioral self-regulation (McBrinn et al., 2008). These problems may contribute to difficulties undertaking many everyday functions such as engaging in interpersonal communication, budgeting, household chores and carrying out vocational activities (Godbout et al., 2005). The cognitive capacities associated with self-awareness and self-regulation are considered to be part of the executive system in the frontal lobes of the brain. Executive functions include cognitive operations that contribute to the ability to initiate, inhibit and integrate other functions, variously termed supervisory, attentional or control processes (Shallice and Burgess, 1991; Miyake et al., 2000; Stuss and Alexander, 2007).

A cluster of symptoms associated with these functions is thought to present in "dysexecutive syndrome", the central component of the cluster being impairment in self-awareness and self-regulation (Morton and Barker, 2010). This impairment is assumed to arise from damage to critical areas of the brain that are integral to behavioral self-regulation, typically the frontal lobes. There is, however, growing recognition that self-awareness is a highly complex and multifaceted process that is not exclusive to the frontal lobes. Efforts to identify specific brain areas that may be responsible for self-monitoring and self-regulation have led researchers to acknowledge that multiple pathways may be involved—impaired self-awareness does not appear to be linked exclusively to focal or generalized brain damage or to specific neurocognitive test profiles (Philippi et al., 2012; Caldwell et al., 2014; Ham et al., 2014).

While the study of the underlying processes involved in impaired self-regulation continues, clinicians agree that the capacity for self-monitoring and behavioral self-regulation is important to successful rehabilitation outcomes after brain injury (Winkens et al., 2014). To that end, clinicians are reliant on existing psychometric tools for identifying and quantifying impaired self-regulation. However, measurement of executive ability is challenging because executive function tests are not process-pure: they will invariably and unavoidably involve other non-executive functions that may be variously spared or compromised after brain injury (Barker et al., 2010). One commonly used measure of the behavioral manifestation of dysexecutive impairment is the Dysexecutive Questionnaire (DEX; Wilson et al., 1997). The DEX is purported to be an ecologically valid test; that is, it provides an estimation of executive function as applied to everyday life challenges. The interpretation of the DEX score is based on the difference between the client's self-report and the report of another person who knows the client well, with any resultant discrepancy assumed to reflect a lack of self-awareness in the brain-injured person.

As a relatively quick and easy questionnaire to complete, the DEX offers an appealing method of quantifying a complex neuropsychological process. However, the utility of the test relies on two important premises: (1) the third party respondent can give a true and accurate account of the injured person's functioning; and (2) the psychometric validity of the measurement tool is constant across users (i.e., both client and independent rater "versions" are measuring the same construct or factor[s]). Each of these premises is considered briefly.

Regarding the first premise, there is certainly evidence that patient self-reports differ from the reports of their significant others. This finding is not unexpected as the DEX is designed to identify discrepancies in scores that may reflect impairment in self-awareness in people following brain injury. However, some evidence suggests that independent raters may not respond in a similar way about the same person. For example, a study of the inter-rater reliability of the ratings of family members found a low level of agreement among three independent raters reporting about the same individual with a brain injury (Barker et al., 2011). The authors concluded that all raters do not respond in a comparable manner and, thus, it would be erroneous for clinicians to assume that DEX ratings by significant others are always accurate (Barker et al., 2011). The problem of ascertaining whether a rating by a family member is accurate is more complex than it may appear. For example, if one does not use independent ratings of the level of impaired awareness of the person with brain injury, then the other main source of information is objective neuropsychological data. However, the situation here is far from clear: there is not a direct correlation between overall severity of cognitive impairment and level of impaired self-awareness or between scores on specific tests of executive function and level of impaired self-awareness (Barker et al., 2004). Thus, there are ongoing questions regarding the precise nature of impaired self-awareness, its link to overall executive functioning, and the extent to which the construct can be measured by existing questionnaire-based tools.

With respect to the second premise of whether the DEX self-rated questionnaire measures the same construct(s) as the DEX completed by independent others, several studies have examined the factor structure of the DEX focusing on DEX self-ratings. Variable findings have been obtained. For instance, a study using a large community sample of more than 1100 people identified a single underlying factor (Gerstorf et al., 2008). Conversely, a study using non-clinical (*N* = 293) and clinical (*N* = 49) samples found a 4-factor solution with factors best described as inhibition, intention, social regulation, and abstract problem-solving (Mooney et al., 2006). A 4-factor model also was identified by Bodenburg and Dopslaff (2008); however, based on different loadings, their interpretations of these factors were: initiating and sustaining actions, impulse control, psychophysical and mental excitability, and social conventions. A study of the factor structure of the DEX in the context of normal aging (Amieva et al., 2003) identified a 5-factor solution: intentionality, interference management, inhibition, planning, and social regulation. Thus, substantial variability is evident in the dimensionality of the DEX.

Only one previous study has tested the factor structure of the DEX amongst independent raters. Using the significant others of 46 adults with varying neurological conditions, that study obtained a 3-factor solution described as behavioral inhibition, goal-directed behavior/intentionality, and executive memory/cognition (Chaytor and Schmitter-Edgecombe, 2007). No studies have yet examined the factor structure of the DEX when completed by independent raters who are reporting about the degree of impairment associated with acquired brain injury. Further, no previous study has compared the factor structure of the DEX when completed by two or more independent raters in relation to the same patient. The fundamental questions addressed by this study are: (1) what are the levels of inter-rater consistency when the DEX is completed by patients, significant others, and clinicians; and (2) does the dimensionality of the DEX vary as a function of the individuals completing it (e.g., client vs. clinician)?

# **METHOD**

# **MEASURES**

The Behavioral Assessment of the Dysexecutive Syndrome (BADS) is considered an ecologically valid, multidimensional measure of executive function comprising six sub-tests and a questionnaire, called the DEX, which probes symptoms of dysexecutive syndrome (Wilson et al., 1997). The DEX is a 20-item questionnaire which the authors describe as having three factors assessing everyday changes in cognition, emotion and behavior after an acquired brain injury or other brain trauma. The DEX is completed by the patient (self-rating: DEX-S) and by a person who knows the patient well (independent rater). In this study, we gathered data from two groups of independent raters: "significant others" (DEX-SO) such as partners and close family members and "clinicians" (DEX-C), who were psychologists or rehabilitation physicians working closely with the patient and who were able to provide an opinion about the patient's level of everyday executive functioning.

Ethical Approval: Each participant and their significant other provided consent to take part and each of the participating services received ethical approval from their local institutional research ethics committee.

#### **PARTICIPANTS**

The number of patients included in this study was 113 (87 males, *M* age = 37.77, *SD* = 12.76; 26 females, *M* age = 38.96, *SD* = 12.06) from two brain injury services in the UK and two services in Ireland. The participants were identified by the service managers and clinicians by virtue of being a client of the service and meeting the inclusion criteria. Inclusion criteria for the study were: 18 years or older, had experienced an acquired brain injury, had sufficient cognitive and physical ability to give informed consent to participate, able to read and respond to the questionnaires. Exclusion criteria were major psychiatric illness or cognitive impairment of such severity that would prevent the ability to consent and/or to respond to the questionnaires. None of those identified by the services as potentially suitable participants refused to participate. In each center, an unspecified number of patients were deemed by the clinician or service manager to not meet the inclusion criteria. The sample in the study would, in the authors' view, be considered typical of those accessing brain injury support services in the UK and Ireland with moderately severe brain injury. All were in the post-acute phase of rehabilitation, typically receiving support services focused on optimizing independent functioning. The mean duration of injury was 57.49 months (*SD* = 44.24), with minimal and maximal time periods of 10 months and 168 months, respectively. The median for duration was 36 months (25th percentile = 24 months; 75th percentile = 84 months). Type of injury data indicated that an overwhelming majority of clients had experienced traumatic head injuries (95%). Finally, with respect to current occupation, the most commonly selected options were: currently unemployed (24.1%), supported training/employment (20.4%) and retired (20.4%).

Data also were collected from caregiver/significant other (DEX-SO; *N* = 101) and clinician (DEX-C; *N* = 64) raters. Within the caregiver/significant others group, parents (*n* = 40), spouses (*n* = 38), siblings (*n* = 9), adult offspring (*n* = 6), friends (*n* = 5), and other family members (*n* = 2) were represented.

#### **RESULTS**

**Table 1** provides descriptive statistics and reliability coefficients for the DEX-S (i.e., self-ratings) as well as the DEX-SO and the DEX-C.

Cronbach's alpha coefficients and their 95% confidence intervals suggest excellent scale score reliability within each respondent group. However, upper bound estimates, particularly for the DEX-SO and DEX-C, suggest that item redundancy may be of concern (Streiner, 2003). As noted in previous research, DEX-SO scores were higher than DEX-S ratings, although this difference was not statistically significant (LSD, *p* = 0.07). DEX-C scores were lowest of all and differed significantly from DEX-SO scores (LSD, *p* = 0.001).
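As a minimal sketch of the reliability coefficient reported here, the standard Cronbach's alpha formula can be computed from a respondents-by-items matrix. The function name `cronbach_alpha` and the small `demo` matrix are hypothetical illustrations, not the study's code or data, and the confidence intervals reported in Table 1 are not reproduced.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for a 2-D array (rows = respondents, columns = items)."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]                                # number of items
    item_variances = items.var(axis=0, ddof=1)        # sample variance of each item
    total_variance = items.sum(axis=1).var(ddof=1)    # variance of the summed scale score
    return (k / (k - 1)) * (1.0 - item_variances.sum() / total_variance)

# Hypothetical ratings (5 respondents x 4 items), for illustration only.
demo = np.array([
    [0, 1, 1, 0],
    [2, 2, 3, 2],
    [4, 3, 4, 4],
    [1, 1, 2, 1],
    [3, 3, 3, 4],
])
alpha = cronbach_alpha(demo)
print(round(alpha, 3))
```

Values very close to 1 arise when items are nearly interchangeable, which is why very high upper-bound estimates can be read as a sign of item redundancy.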

Intra-class correlation coefficients and their 95% confidence intervals were calculated between DEX-S and DEX-SO items; DEX-S and DEX-C items; and DEX-SO and DEX-C items. This analysis permits one to determine the degree of consistency between self-, significant other, and clinician ratings, with ICC values >0.74 representing an excellent level of agreement; values between 0.60 and 0.74 reflecting good agreement; and values between 0.40 and 0.59 representing fair agreement. Absolute agreement ICCs were estimated using a one-way random effects model (see **Table 2**).
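The one-way random-effects ICC and the agreement bands described above can be sketched as follows. This assumes the single-measure form, often written ICC(1,1); the text does not state whether single or average measures were reported. The "poor" label for values below 0.40 follows the wording used for the results, and the function names are hypothetical.

```python
import numpy as np

def icc_oneway(ratings):
    """Single-measure, one-way random-effects ICC.

    ratings: 2-D array, rows = targets (e.g., patients), columns = raters.
    """
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape
    grand_mean = x.mean()
    row_means = x.mean(axis=1)
    # Between-target and within-target mean squares from a one-way ANOVA.
    ms_between = k * ((row_means - grand_mean) ** 2).sum() / (n - 1)
    ms_within = ((x - row_means[:, None]) ** 2).sum() / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

def agreement_label(icc):
    """Map an ICC value onto the agreement bands quoted in the text."""
    if icc > 0.74:
        return "excellent"
    if icc >= 0.60:
        return "good"
    if icc >= 0.40:
        return "fair"
    return "poor"

# Hypothetical example: two raters in perfect agreement about three patients.
perfect = [[1, 1], [2, 2], [3, 3]]
print(icc_oneway(perfect), agreement_label(icc_oneway(perfect)))
```

Because the one-way model treats every difference between raters as error, systematic differences in level (for example, one group rating consistently higher) lower the coefficient as well as inconsistent rank ordering.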

The average level of agreement between self- and significant other ratings was 0.41 (*SD* = 0.09). The averages for self- and clinician ratings and significant other and clinician ratings were 0.15 (*SD* = 0.09) and 0.31 (*SD* = 0.13), respectively. *Post-hoc* testing revealed that these averages differed significantly: self and significant other vs. self and clinician (LSD, *p* < 0.001); self and significant other vs. significant other and clinician (LSD, *p* < 0.01); self and clinician vs. significant other and clinician (LSD, *p* < 0.001). Importantly, the average level of agreement between self and significant other ratings was at the bottom end of the stratum denoting "fair agreement". The remaining averages were poor. These findings suggest there is only nominal consistency in ratings on the DEX among patients, their caregivers, and clinicians.

To assess the dimensionality of the DEX when completed by patients, significant others and clinicians, three principal axis factor (PAF) analyses were conducted. This factor analytic technique is recommended when data have the potential to be non-normally distributed (Finch and West, 1997). Diagnostics, such as the Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy and Bartlett's test of sphericity, were conducted for each PAF analysis and deemed to be satisfactory (i.e., KMO exceeded 0.90 and Bartlett's test was statistically significant permitting one to reject the null hypothesis that associations among the DEX items may be represented as an identity matrix). Parallel analysis and inspection of the unrotated factor solution were used to assist with factor retention.
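The logic of parallel analysis can be illustrated with a short sketch: eigenvalues from the observed correlation matrix are compared with eigenvalues from random data of the same dimensions. Several choices here are assumptions rather than details from the study: the use of the full (unreduced) correlation matrix, normally distributed random data, the 95th percentile as the threshold, 200 simulations, and the simulated one-factor `demo` data. The strict cutoff is also a simplification, since the authors combined parallel analysis with inspection of the unrotated solution.

```python
import numpy as np

def parallel_analysis(data, n_sims=200, percentile=95, seed=0):
    """Compare observed eigenvalues with those of random normal data.

    Returns (observed eigenvalues, random-data thresholds, factors retained).
    """
    x = np.asarray(data, dtype=float)
    n, p = x.shape
    rng = np.random.default_rng(seed)
    # Eigenvalues of the observed correlation matrix, largest first.
    real = np.sort(np.linalg.eigvalsh(np.corrcoef(x, rowvar=False)))[::-1]
    sims = np.empty((n_sims, p))
    for i in range(n_sims):
        noise = rng.standard_normal((n, p))
        sims[i] = np.sort(np.linalg.eigvalsh(np.corrcoef(noise, rowvar=False)))[::-1]
    threshold = np.percentile(sims, percentile, axis=0)
    # Retain leading factors only while the observed eigenvalue beats chance.
    n_retain = 0
    for observed, chance in zip(real, threshold):
        if observed <= chance:
            break
        n_retain += 1
    return real, threshold, n_retain

# Simulated (hypothetical) one-factor data: 113 respondents, 20 items.
rng = np.random.default_rng(1)
factor = rng.standard_normal((113, 1))
demo = 0.7 * factor + np.sqrt(1 - 0.7 ** 2) * rng.standard_normal((113, 20))
real, threshold, n_retain = parallel_analysis(demo)
print(n_retain)
```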

#### **Table 2 | Intra-class correlation coefficients and their 95% confidence intervals between DEX-Self (DEX-S), DEX-Significant Other (DEX-SO) and DEX-Clinician (DEX-C)**.

When completed by patients, a one-factor solution appeared to best represent the data (i.e., the unrotated solution revealed that no items loaded uniquely on any factor besides the first one and there was a negligible difference between the eigenvalue associated with factor 2 [1.09] for the random data and the eigenvalue associated with factor 2 [1.49] for the real data). The eigenvalue associated with the first factor was 8.46 (42.29% of the variance accounted for). Factor loadings ranged from 0.33 to 0.78 (see **Table 3**).

A similar solution emerged when significant others completed the DEX. Specifically, one factor appeared to best represent the data (eigenvalue = 10.46, accounting for 52.32% of the variance). The factor loadings ranged from 0.58 to 0.83 (see **Table 3**). Finally, a one-factor solution also was optimal for the clinicians (eigenvalue = 12.54, accounting for 62.72% of the variance). For this group, factor loadings ranged from 0.49 to 0.92.
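The link between the reported eigenvalues and the percentages of variance can be checked with simple arithmetic. With 20 standardized items the total variance is 20, so each percentage is the eigenvalue divided by 20. The helper name `percent_variance` is hypothetical, and small differences from the published figures reflect rounding of the eigenvalues.

```python
N_ITEMS = 20  # the DEX has 20 items, so total standardized variance is 20

def percent_variance(eigenvalue, n_items=N_ITEMS):
    """Percentage of total variance accounted for by one factor."""
    return 100.0 * eigenvalue / n_items

# First-factor eigenvalues and percentages as reported in the text.
eigenvalues = {"DEX-S": 8.46, "DEX-SO": 10.46, "DEX-C": 12.54}
reported = {"DEX-S": 42.29, "DEX-SO": 52.32, "DEX-C": 62.72}

for group, eigenvalue in eigenvalues.items():
    print(group, round(percent_variance(eigenvalue), 2), reported[group])
```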

#### **DISCUSSION**

This study sought to address two questions: (1) what is the inter-rater reliability of the DEX when completed by patients, their significant others, and clinicians; and (2) what is the factor structure of the DEX for these three groups?

Results suggest there is only nominal agreement in item ratings on the DEX among patients, their caregivers, and clinicians. The fact that self-ratings and ratings by others differ is not surprising: the very purpose of the measure is to detect a lack of self-awareness in people with brain injury, operationalized as a discrepancy between the patient and their significant other. However, it is concerning that there is a large discrepancy in the ratings of other people who know the patient well: significant others and clinicians attributed quite variable scores across the range of items, indicating a low level of agreement between raters. This finding is in keeping with previous research which showed that the DEX ratings of significant others (mainly family members rather than clinicians) were variable (Barker et al., 2011).

#### **Table 3 | Principal axis factor loadings for DEX when completed by patients, significant others, and clinicians**.

The fact that third party raters can differ quite significantly when reporting about the same individual raises an important question about the reliability of the DEX. It could be suggested that clinician respondents might, by virtue of their professional training, be able to provide a more accurate appraisal of the level of executive function impairment. This is difficult to confirm, however, since clinical judgment is inherently subjective. In the authors' experience, neuropsychological testing may also not be especially helpful in this regard, as performance on tests of executive function does not always correlate strongly with functional ability (Chaytor et al., 2006; Razani et al., 2007). It may be the case that the best and, perhaps, only reliable way to measure executive function impairment in everyday situations is through a combination of behavioral, task-based measures such as the Multiple Errands Test (Shallice and Burgess, 1991) and a consensus-based response to the DEX where a discussion of the individual items between the respondents may lead to a more accurate description of the problems encountered by the person with brain injury. The feasibility of including ecologically-valid behavioral testing has been improved with the development of virtual reality-based technologies. For example, a virtual reality version of the Multiple Errands Test (Raspelli, 2014) has been developed and offers the potential to measure real-life challenges coupled with the convenience of being able to do the assessment within a clinical setting.

With respect to the dimensionality of the DEX, PAF analysis suggested that a single factor offered the best fit for all three groups. This finding indicates that DEX items are best construed as representing a single construct of executive dysfunction. It should be noted that other researchers have identified a similar factor structure. For instance, using two independent samples of community-dwelling persons, Gerstorf et al. (2008) identified a single factor solution as being optimal for the self-rated version. Specifically, these authors report that "independent of specifying an orthogonal or oblique solution, we found that the eigenvalue for one factor was consistently above 7 whereas four or five other factors could have been extracted but their eigenvalues were only marginally larger than 1" (pp. 432–433). We observed a similar outcome across *three* different categories of respondent: self, significant other, and clinician. To our knowledge, only one other study (Chaytor and Schmitter-Edgecombe, 2007) has examined the factor structure of the DEX when completed by third party respondents (*N* = 46). These researchers identified a five-component solution, with the first three components corresponding well with the inhibition, intentionality, and executive memory factors specified in other psychometric studies assessing the self-rated version. The authors conclude that these three components appear to be replicable whereas components 4 and 5 are, perhaps, idiosyncratic (i.e., components unique to the specific sample being tested). However, the validity of their three-component interpretation may be questioned. First, the authors appear to have relied on the "eigenvalue greater than 1 rule", which many have argued is among the least accurate methods for identifying factor/component retention (i.e., it often results in over-extraction) (Costello and Osborne, 2005). 
Second, although the authors do not provide the intercorrelations among the components, based on the study conducted by Gerstorf and associates (Gerstorf et al., 2008), it is possible they are of sufficient magnitude so as to suggest redundancy (i.e., for the factors representing inhibition, intentionality, and executive memory, *r*-values obtained by Gerstorf et al. (2008) ranged from 0.93 to 0.99).

As with all studies, the current investigation possesses certain limitations that warrant discussion. First, the number of participants recruited was modest, especially for the clinician subsample. It should be noted, however, that other researchers have published psychometric assessments of the DEX using similar (or smaller) numbers of participants (e.g., *N* = 20 (Amieva et al., 2003); *N* = 46 (Amieva et al., 2003); *N* = 93 (Bachmann et al., 2008)). Further, MacCallum et al. (1999) demonstrate that "rules of thumb" about sample size are less important than the degree to which a factor solution is characterized by factor over-determination (i.e., the number of indicators per factor, with a common ratio being 5:1) and strong communality values (i.e., the proportion of variance in each item accounted for by the extracted factor[s]). In the current study, communality estimates were variable; strong overdetermination was present (i.e., *p* [variable]: *r* [factor] ratio was 20:1); and, for the smallest subsample (clinician group), 19 of the 20 variables had large structure coefficients (>0.60) suggesting that one can be reasonably confident in the reproducibility of the obtained factor solutions. Larger samples are clearly needed, however, if one were to conduct subgroup analyses based on variables such as type of injury, gender of patient, or relationship between patient and significant other (e.g., spouse vs. sibling). Another limitation pertains to the small set of variables that were measured. Gerstorf et al. (2008), for example, assessed a host of individual difference variables including neuroticism, depression, subjective health, trait anxiety, positive and negative affect, and cognitive functioning. In the current study, as only the DEX and a small number of sociodemographic items (e.g., age) were used, the convergent validity of this instrument when completed by patients, significant others and clinicians could not be tested. 
Future studies might consider the use of alternative methodologies, such as those used in clinical judgment studies (e.g., Bachmann et al., 2008), to look at the cues and weightings used by respondents to arrive at their judgments regarding the presence and extent of any difficulties in executive functioning.

In conclusion, our dimensionality evaluation suggests that the DEX is best construed as a single factor measure of dysexecutive syndrome. The inter-rater reliability analysis suggests that there is a low level of agreement in item ratings on the DEX among patients, their caregivers, and clinicians. The fact that evaluations by two raters are not highly correlated in reference to the same patient raises a question about this element of the reliability of the DEX. While it is well recognized that executive function deficits occur frequently after traumatic brain injury and are often associated with impaired self-awareness, we are as yet limited in our ability to measure and quantify these impairments. The difficulties arising from measuring deficits in executive function also present challenges in how best to involve patients in aspects of their own rehabilitation, such as patient-determined goals and outcomes (Hogan et al., 2013), when self-awareness is compromised.

#### **REFERENCES**


**Conflict of Interest Statement**: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 01 July 2014; accepted: 19 September 2014; published online: 10 October 2014*.

*Citation: McGuire BE, Morrison TG, Barker LA, Morton N, McBrinn J, Caldwell S, Wilson CF, McCann J, Carton S, Delargy M and Walsh J (2014) Impaired self-awareness after traumatic brain injury: inter-rater reliability and factor structure of the Dysexecutive Questionnaire (DEX) in patients, significant others and clinicians. Front. Behav. Neurosci. 8:352. doi: 10.3389/fnbeh.2014.00352*

*This article was submitted to the journal Frontiers in Behavioral Neuroscience*.

*Copyright © 2014 McGuire, Morrison, Barker, Morton, McBrinn, Caldwell, Wilson, McCann, Carton, Delargy and Walsh. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution and reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms*.

# "Neural Efficiency" of Athletes' Brain during Visuo-Spatial Task: An fMRI Study on Table Tennis Players

Zhiping Guo<sup>1</sup>\*†, Anmin Li<sup>1</sup>\* and Lin Yu<sup>2</sup>

<sup>1</sup>School of Kinesiology, Shanghai University of Sport, Shanghai, China, <sup>2</sup>Neurocognition and Action—Biomechanics Research Group Center of Excellence—Cognitive Interaction Technology (CITEC), Bielefeld University, Bielefeld, Germany

Long-term training leads experts to develop a focused and efficient organization of task-related neural networks. The "neural efficiency" hypothesis posits that neural activity is reduced in experts. Here we tested the following working hypothesis: compared to non-athletes, athletes show lower cortical activation in task-sensitive brain areas during the processing of sports-related and sports-unrelated visuo-spatial tasks. To address this issue, cortical activation was examined with fMRI in 14 table tennis athletes and 14 non-athletes while they performed the visuo-spatial tasks. Behavioral results showed that athletes reacted faster than non-athletes during both types of task, and no accuracy difference was found between athletes and non-athletes. fMRI data showed that athletes exhibited less brain activation than non-athletes in the bilateral middle frontal gyrus, right middle orbitofrontal area, right supplementary motor area, right paracentral lobule, right precuneus, left supramarginal gyrus, right angular gyrus, left inferior temporal gyrus, left middle temporal gyrus, bilateral lingual gyrus and left cerebellum crus. No region was significantly more activated in the athletes than in the non-athletes. These findings suggest that long-standing training may prompt athletes to develop a focused and efficient organization of task-related neural networks, a possible index of "neural efficiency" in athletes engaged in visuo-spatial tasks, and that this functional reorganization is possibly task-specific.

Keywords: neural efficiency, visuo-spatial information processing, sports training, brain activation, table tennis players, functional magnetic resonance imaging

# INTRODUCTION

Extensive practice over a long period of time leads expert athletes to develop a focused and efficient organization of task-related neural networks (Milton et al., 2007), and the functional reorganization is task-specific rather than general in terms of improved motor abilities (Schwenkreis et al., 2007). The "neural efficiency" hypothesis posits that neural activity is reduced in experts (Del Percio et al., 2009a). However, existing studies of expert athletes' brain activation have produced somewhat inconsistent results.

#### Edited by:

Lynne Ann Barker, Sheffield Hallam University, UK

#### Reviewed by:

David Reynolds, Sheffield Hallam University, UK

Valter Prpic, De Montfort University, UK

#### \*Correspondence:

Zhiping Guo guoguopinger@hotmail.com

Anmin Li anminli@tom.com

#### †Present address:

Zhiping Guo, Department of Physical Education, Hubei University for Nationalities, Hubei, China

Received: 19 April 2016; Accepted: 07 April 2017; Published: 26 April 2017

#### Citation:

Guo Z, Li A and Yu L (2017) "Neural Efficiency" of Athletes' Brain during Visuo-Spatial Task: An fMRI Study on Table Tennis Players. Front. Behav. Neurosci. 11:72. doi: 10.3389/fnbeh.2017.00072

Numerous previous studies have shown that, compared to novices/non-athletes, expert athletes show less brain activation during the resting state or while performing cognitive/motor tasks. For example, in the resting state, karate athletes exhibited less cortical activation over frontal, central, parietal or occipital areas than non-athletes (Babiloni et al., 2010a; Del Percio et al., 2011b). During viewing of pictures/videos of real competition performances, alpha event-related desynchronization (ERD) was lower in the mirror system in athletes than in non-athletes (Babiloni et al., 2009, 2010b). During the 6 s pre-shot period, athletes exhibited greater alpha power than novices in occipital areas (Loze et al., 2001), parietal areas (Baumeister et al., 2008) and over the whole scalp (Del Percio et al., 2009b). Besides, compared to non-athletes and skilled athletes, elite athletes showed lower coherence values, which imply the refinement of cortical networks in experts and differences in strategic planning related to memory processes and executive influence over visual-spatial cues (Deeny et al., 2009). During the execution of upright standing, less alpha ERD was observed in frontal, central and parietal areas in athletes (Del Percio et al., 2009a). Similar results were observed in the primary motor area and the lateral and medial premotor areas in athletes performing a wrist extension task (Del Percio et al., 2010).

However, many other studies reported greater, or partly greater, cortical activation in expert athletes than in non-athletes. For instance, alpha power in athletes was reduced significantly (indicating more cortical activation) while they observed sports videos, an effect not found in novices (Orgs et al., 2008). Besides, a TMS study observed greater activation in the frontal mirror system in athletes than in novices during observation of sports videos (Aglioti et al., 2008), and two fMRI studies also observed greater activation in task-related brain areas in athletes than in novices/non-athletes while they observed sports videos (Wright et al., 2011) or judged line orientation (Seo et al., 2012). In addition, while preparing or executing a motor task, athletes exhibited higher alpha coherence values in parietal, temporal and occipital areas (Del Percio et al., 2011a) or more alpha ERD in the ventral centro-parietal pathway than novices (Del Percio et al., 2007a). It is worth noting that a few fMRI studies examined the effect of task familiarity on athletes' brain activation and found greater cortical activation in task-sensitive areas (e.g., the mirror system, motor areas) in athletes while performing familiar tasks than while performing less familiar tasks (Calvo-Merino et al., 2005, 2006; Lyons et al., 2010; Woods et al., 2014). These different findings might be related to practice-related decrease (mainly in frontal cortex areas), increase (mainly in task-relevant brain areas), redistribution and reorganization of regional activation of cognitive and sensorimotor processes (Kelly and Garavan, 2005; Babiloni et al., 2010b; Hardwick et al., 2013).

Given the inconsistent results on brain activation in athletes, and given that most of these studies employed motor or motor-related tasks while few adopted cognitive tasks, the present fMRI study contributes to the debate on whether athletes show more or less brain activation during cognitive tasks. Cortical activation was examined while athletes and non-athletes performed visuo-spatial tasks. Based on the "Type Token Model" (Zimmer and Ecker, 2010) and the item characteristics of table tennis, we used a visuo-spatial task that included a sports related condition and a sports unrelated condition, in which participants were asked to recognize the figure (circle or cross-star) with a notch angle of 135°. The following hypothesis was tested in the present study: athletes exhibit lower cortical activation in task-sensitive brain areas than non-athletes during processing of sports related and sports unrelated visuo-spatial tasks. The ventral and dorsal cortical visual pathways were considered, as they are respectively involved in the recognition of objects (Braddick and Atkinson, 2007) and the analysis of visual space (Rolls and Stringer, 2006). In addition, after reviewing studies from functional and structural neuroimaging paradigms, Jung and Haier (2007) reported a striking consensus suggesting that variations in a distributed network predict individual differences found on intelligence and reasoning tasks, and they described this network in the Parieto-Frontal Integration Theory (P-FIT). According to the P-FIT, the extrastriate cortex, fusiform gyrus, supramarginal, superior parietal and angular gyri, frontal regions and anterior cingulate are the critical brain areas in solving a given problem (Jung and Haier, 2007). Statistical analysis in the present study focused on the following brain areas: extrastriate cortex, fusiform gyrus, supramarginal, superior parietal and angular gyri, cingulate, frontal regions and cerebellum.

# MATERIALS AND METHODS

# Participants

A total of 28 right-handed male subjects, 14 table tennis players (mean age, 19.64 ± 1.50 years) and 14 non-athletes (mean age, 21.50 ± 1.83 years), participated in the experiment. None of the non-athletes had any formal table tennis training experience. All of the table tennis players were above the 2nd level of the national standard and had been practicing table tennis for more than 8 years, at least five times a week. All subjects reported normal or corrected vision and no history of mental disorders.

This study was approved by the Ethics Committee of Scientific Research of Shanghai University of Sport (no. 2014066) and carried out strictly in accordance with the approved guidelines. All participants gave informed written consent.

# Experiment Task

The experimental task was a go/no-go visuo-spatial task. The "Type Token" model, a theoretical model of long-term object memory, suggests that perceptual priming and episodic recognition are phenomena based on distinct kinds of representations, i.e., types and tokens. Types are prototypical representations needed for object identification and mainly include outline and three-dimensional information. Tokens support episodic recognition and mainly store orientation and color information, and tokens can be bound and preserved together with types. After long-term contact with certain objects, individuals can simplify the types and tokens to form a special bundled representation (Zimmer and Ecker, 2010). Based on the "Type Token" model and the item characteristics of table tennis, a circle with a notch angle of 45°, 135°, 225° or 315° was employed as the sports related stimulus because of its similarity to the ball and the hitting point (Zhang, 2014). A cross-star with a notch angle of 45°, 135°, 225° or 315° was employed as the sports unrelated stimulus because its shape is unfamiliar in table tennis. The target stimulus was the shape with a notch angle of 135°, and it appeared at only one location in the picture (there were four shapes in each picture). Target and non-target stimuli each made up 50% of the trials. Participants were asked to press the left key with the right index finger when the circle target stimulus was displayed, to press the right key with the right third finger when the cross-star target stimulus was shown, and not to press any key when a non-target stimulus was displayed. All stimuli appeared in a pseudorandom order. The total number of trials was 256: 60 go trials each for circle and cross-star, 60 no-go trials each for circle and cross-star, and 16 blank-screen no-go trials as baseline. A schematic illustration of the stimulus for one trial is shown in **Figure 1**.

#### FIGURE 1 | Schematic illustration of the stimulus for one trial. Each trial starts with a 500 ms fixation cross on a gray background. After the fixation, a jitter of 500 ms, 1000 ms or 1500 ms follows, and then the probe stimulus appears for 500 ms. After the probe stimulus, a gray screen is shown for 1000 ms for the subject's response.

TABLE 1 | Behavioral measurement of athletes and non-athletes under sports related task and sports unrelated task.
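The trial counts and timing described for the task can be checked with a few lines of arithmetic. The following is a minimal sketch, not the authors' stimulus script: the names `TRIAL_COUNTS`, `total_trials`, `go_ratio`, `mean_trial_ms` and `expected_task_seconds` are hypothetical, and it assumes that the three jitter durations occur equally often and that blank baseline trials follow the same timing as stimulus trials, neither of which is stated in the text.

```python
# Counts reported in the text: (stimulus, trial type) -> number of trials.
TRIAL_COUNTS = {
    ("circle", "go"): 60,
    ("cross-star", "go"): 60,
    ("circle", "no-go"): 60,
    ("cross-star", "no-go"): 60,
    ("blank", "baseline"): 16,
}


def total_trials(counts):
    """Sum the trials over all stimulus/trial-type cells."""
    return sum(counts.values())


def go_ratio(counts, shape):
    """Proportion of target (go) trials among the trials for one shape."""
    go = counts[(shape, "go")]
    return go / (go + counts[(shape, "no-go")])


def mean_trial_ms(fixation_ms=500, jitters_ms=(500, 1000, 1500),
                  probe_ms=500, response_ms=1000):
    """Mean trial length, ASSUMING the jitter values are equally frequent."""
    mean_jitter = sum(jitters_ms) / len(jitters_ms)
    return fixation_ms + mean_jitter + probe_ms + response_ms


def expected_task_seconds(counts):
    """Expected task length, ASSUMING every trial (baseline included) has the same timing."""
    return total_trials(counts) * mean_trial_ms() / 1000.0


if __name__ == "__main__":
    print("total trials:", total_trials(TRIAL_COUNTS))
    print("go ratio (circle):", go_ratio(TRIAL_COUNTS, "circle"))
    print("mean trial length (ms):", mean_trial_ms())
    print("expected task length (s):", expected_task_seconds(TRIAL_COUNTS))
```

Under these assumptions the counts sum to the reported 256 trials, targets make up half of the trials for each shape, and the expected task length is 768 s, which corresponds to the 384 volumes at a TR of 2000 ms reported for the functional run.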

# Image Acquisition/Scanning Parameters

fMRI scanning was conducted using a Siemens Magnetom Verio 3T MRI scanner and a 32-channel head coil. Functional data consisted of 384 volumes using a T2-weighted echo planar imaging sequence with 33 contiguous sagittal slices covering the whole brain. The data were acquired with an FOV of 220 × 220 mm, flip angle of 90°, TR of 2000 ms, TE of 30 ms and slice thickness of 3 mm. The resulting voxel resolution was 3.4 × 3.4 × 3.0 mm.
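The relation between the acquisition parameters and the reported voxel size can be illustrated with a short calculation. This is a sketch under one explicit assumption: the in-plane acquisition matrix is not given in the text, and the value of 64 × 64 used below (`ASSUMED_MATRIX`) is a guess that happens to reproduce the reported 3.4 mm in-plane resolution. The function names are hypothetical.

```python
FOV_MM = 220.0             # field of view, from the text
SLICE_THICKNESS_MM = 3.0   # from the text
N_VOLUMES = 384            # from the text
TR_S = 2.0                 # repetition time in seconds, from the text
ASSUMED_MATRIX = 64        # ASSUMPTION: the matrix size is not reported in the text


def in_plane_voxel_mm(fov_mm, matrix):
    """In-plane voxel edge length: field of view divided by matrix size."""
    return fov_mm / matrix


def scan_duration_s(n_volumes, tr_s):
    """Length of the functional run: number of volumes times TR."""
    return n_volumes * tr_s


if __name__ == "__main__":
    edge = in_plane_voxel_mm(FOV_MM, ASSUMED_MATRIX)
    print(f"voxel size: {edge:.1f} x {edge:.1f} x {SLICE_THICKNESS_MM:.1f} mm")
    duration = scan_duration_s(N_VOLUMES, TR_S)
    print(f"run duration: {duration:.0f} s ({duration / 60:.1f} min)")
```

With a different matrix size the in-plane resolution would change accordingly, so the 64 × 64 value should be read as illustrative only.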

Participants indicated their judgment by pressing one of two buttons of an MRI-compatible response device held in the right hand (left button for sport related go stimuli and right button for sport non-related go stimuli).

# Image Analysis

Image processing and statistical analyses were based on MATLAB (The Mathworks Inc., Natick, MA, USA, release 9) and SPM12 (SPM; Wellcome Department of Imaging Neuroscience, London, UK; online at http://www.fil.ion.ucl.ac.uk), and the results were visualized using the xjView toolbox (online at http://www.alivelearn.net/xjview). Preprocessing included slice-time correction, realignment, normalization to the standard space of the Montreal Neurological Institute (MNI) brain, and smoothing. The functional images were corrected for sequential slice timing, and all images were realigned to the middle image to correct for head movement between scans. The realigned images were then mean-adjusted by proportional scaling and spatially normalized into standard stereotactic space to fit an MNI template based on the standard coordinate system. Smoothing was conducted with an isotropic three-dimensional Gaussian filter with a full-width-at-half-maximum (FWHM) kernel of 6 mm.
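A Gaussian smoothing kernel is specified here by its FWHM, whereas the Gaussian itself is parameterized by its standard deviation; the two are related by σ = FWHM / (2·√(2·ln 2)). The sketch below illustrates this conversion and builds a normalized one-dimensional kernel in plain Python. It is not SPM's implementation: the function names are hypothetical, and the 3 mm voxel spacing and the truncation at three standard deviations are assumptions made for the example.

```python
import math


def fwhm_to_sigma(fwhm):
    """Convert a Gaussian full width at half maximum to a standard deviation."""
    return fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))


def gaussian_kernel_1d(fwhm_mm, voxel_mm, radius_sigmas=3.0):
    """Normalized 1-D Gaussian kernel sampled at voxel centers.

    ASSUMPTION: the kernel is truncated at `radius_sigmas` standard deviations.
    """
    sigma_vox = fwhm_to_sigma(fwhm_mm) / voxel_mm
    half_width = int(math.ceil(radius_sigmas * sigma_vox))
    weights = [math.exp(-0.5 * (i / sigma_vox) ** 2)
               for i in range(-half_width, half_width + 1)]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    sigma_mm = fwhm_to_sigma(6.0)
    print(f"sigma for a 6 mm FWHM kernel: {sigma_mm:.3f} mm")
    kernel = gaussian_kernel_1d(6.0, voxel_mm=3.0)  # 3 mm spacing is an assumed example value
    print("kernel length:", len(kernel))
```

Because an isotropic Gaussian is separable, applying such a one-dimensional kernel along each of the three axes in turn is equivalent to a single three-dimensional Gaussian filter.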

The pre-processed fMRI data were then entered into a first-level individual analysis, in which fMRI activity during the target stimulus conditions (sports related and sports unrelated) was compared with that during the blank-screen condition (baseline).

In second-level analysis, contrast images from the analysis of individual subjects were analyzed by a 2 (Group: Athletes, Non-athletes) × 2 (Stimulus Type: Sports related, Sports unrelated) ANOVA (with Group as a between-subjects factor and Stimulus Type as a within-subjects factor). Regions showing a significant interaction were identified using an initial uncorrected voxel-wise threshold of F(1,52) = 12.164, p < 0.001.
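The Group × Stimulus Type interaction tested here can be written as a difference of differences. In the sketch below, $\bar{Y}$ denotes the mean first-level contrast value (target minus baseline) in each cell of the design; this notation is introduced only for illustration and does not appear in the original analysis.

```latex
\[
\text{Group} \times \text{Stimulus Type}
  = \left( \bar{Y}_{\text{athlete, related}} - \bar{Y}_{\text{athlete, unrelated}} \right)
  - \left( \bar{Y}_{\text{non-athlete, related}} - \bar{Y}_{\text{non-athlete, unrelated}} \right)
\]
```

A value that differs reliably from zero at a voxel means that the effect of stimulus type is not the same in the two groups, which is what motivates the follow-up simple-effect comparisons.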


TABLE 2 | Brain regions activated in "sports related condition" from between group analysis (H = hemisphere; p < 0.001, uncorrected).

# Analysis of Behavioral Data

Repeated-measures ANOVA was used to test for differences in reaction time and accuracy between athletes and non-athletes across sports related and sports unrelated stimuli.

# RESULTS

# Behavioral Results

The behavioral outcomes (task accuracy and response time) are shown in **Table 1**. A 2 × 2 repeated-measures ANOVA was used to determine group differences in behavioral outcomes, using SPSS software. Statistical significance was defined at p < 0.05.

The ANOVA of accuracy showed no statistically significant main effect or interaction for the factors Group (athletes, non-athletes) and Condition (Sports related, Sports unrelated; p > 0.05). The ANOVA of reaction time showed no statistically significant interaction between the factors Group (athletes, non-athletes) and Condition (Sports related, Sports unrelated; p > 0.05), but revealed a significant main effect of Group (F(1,52) = 10.05, p = 0.004, η<sup>2</sup> = 0.279). Compared with non-athletes, athletes needed much less time to recognize the target stimulus during both the sports related task and the sports unrelated task.
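An effect size of this kind can be related to the F statistic through η<sub>p</sub><sup>2</sup> = (F · df<sub>effect</sub>) / (F · df<sub>effect</sub> + df<sub>error</sub>). The sketch below assumes that the reported η<sup>2</sup> is a partial eta squared, which the text does not state; the function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


if __name__ == "__main__":
    f_reported = 10.05  # main effect of Group on reaction time, from the text
    # 26 = 28 participants minus 2 groups; 52 = error df as printed in the text.
    for df_error in (26, 52):
        eta = partial_eta_squared(f_reported, 1, df_error)
        print(f"df_error = {df_error}: partial eta squared = {eta:.3f}")
```

Under this assumption, the reported value of 0.279 is reproduced with 26 error degrees of freedom (28 participants minus two groups), whereas 52 error degrees of freedom would give about 0.162.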

# Imaging Results

A few regions showed a significant Expertise × Stimulus-Type interaction at the whole-brain level, including the lingual gyrus (BA 18), cuneus (BA 19), superior occipital lobe (BA 19), supramarginal gyrus (BA 40), cingulate gyrus (BA 24), paracentral lobule/precuneus (BA 5), supplementary motor area (BA 6) and medial superior frontal gyrus (BA 8). Post hoc tests were then used to examine the simple effects at each factor level.

# Group Effect under Sports Related Stimulus Condition

Brain regions showing a significant group effect under the sports related stimulus condition are shown in **Figure 2** and **Table 2**.



Athletes exhibited less activation than non-athletes in the left middle frontal gyrus, right middle orbitofrontal area, right angular gyrus and left cerebellum crus. No region was significantly more activated in the athletes than in the non-athletes.

# Group Effect under Sports Unrelated Stimulus Condition

Brain regions showing a significant group effect under the sports unrelated stimulus condition are shown in **Figure 3** and **Table 3**.

Athletes exhibited less activation than non-athletes in the bilateral middle frontal gyrus, right middle orbitofrontal area, right supplementary motor area, right paracentral lobule, right precuneus, left supramarginal gyrus, right angular gyrus, left inferior temporal gyrus, left middle temporal gyrus, bilateral lingual gyrus and left cerebellum crus. No region was significantly more activated in the athletes than in the non-athletes.

### Stimulus Type Effect under Athlete Condition

Brain regions showing a significant stimulus type effect in athletes are shown in **Figure 4** and **Table 4**. In athletes, the left middle frontal gyrus and the pars opercularis of the inferior frontal gyrus exhibited less activation under the sports related condition than under the sports unrelated condition, whereas the precuneus exhibited more activation under the sports related condition than under the sports unrelated condition.

# Stimulus Type Effect under Non-Athlete Condition

Brain regions showing a significant stimulus type effect in non-athletes are shown in **Figure 5** and **Table 5**.

A few brain areas exhibited less activation under sports related condition than in sports unrelated condition, including the superior frontal gyrus, middle frontal gyrus, occipital lobe, inferior parietal lobule, supramarginal gyrus, lingual gyrus, middle occipital lobe and middle temporal gyrus. No region was significantly more activated under sports related condition than in sports unrelated condition.

# DISCUSSION

This study used fMRI to investigate brain activation in athletes and non-athletes during a figure recognition task. Our hypothesis was based on research demonstrating that athletes seem to develop a focused and efficient organization of task-related neural networks (Milton et al., 2007), that this functional reorganization is task-specific rather than general in terms of improved motor abilities (Schwenkreis et al., 2007), and on the "neural efficiency" hypothesis about experts (Del Percio et al., 2009a). More specifically, we tested whether there was less cortical activity in athletes than in non-athletes during the sports related and sports unrelated visual-spatial tasks.

Behaviorally, we found that athletes showed shorter reaction times than non-athletes during both tasks. This result is supported by previous findings that athletes respond faster than non-athletes in reaction time tasks, and that their faster stimulus discrimination and response selection are possibly due to enhanced attention and inhibitory control (Hung et al., 2004; Di Russo et al., 2006; Nakamoto and Mori, 2008, 2012; Muraskin et al., 2015).

Regarding the group effect, neuroimaging data demonstrated less brain activation in numerous areas in athletes than in non-athletes during the visuo-spatial tasks; no brain area showed more activation in athletes than in non-athletes during either task. The areas showing less activation in athletes than in non-athletes included the bilateral middle frontal gyrus (BA 6), right middle orbitofrontal area (BA 10), right supplementary motor area (BA 6), right paracentral lobule (BA 31), right precuneus (BA 7), left supramarginal gyrus (BA 40), right angular gyrus (BA 17), left inferior temporal gyrus (BA 20), left middle temporal gyrus (BA 21), bilateral lingual gyrus (BA 18) and left cerebellum crus. These results are in line with previous findings that athletes exhibit less cortical activation during social cognition tasks. Activation in occipital areas decreased progressively from non-athletes to amateur karate athletes to elite karate athletes during the observation of pictures with basket and karate attacks (Del Percio et al., 2007b). Low- and high-frequency alpha ERD was lower in amplitude in elite rhythmic gymnasts than in non-gymnasts in occipital and temporal areas (ventral pathway) and in the dorsal pathway; these results globally suggest that the judgment of observed sporting actions is related to low amplitude of alpha ERD, as a possible index of spatially selective cortical activation ("neural efficiency"; Babiloni et al., 2009). Low- and high-frequency alpha ERD was less pronounced in dorsal and "mirror" pathways in elite karate athletes than in non-athletes during the judgment of karate actions, and the researchers concluded that less pronounced alpha ERD in athletes hints at "neural efficiency" in experts engaged in social cognition (Babiloni et al., 2010b).
In addition, extensive practice over a long period of time leads experts to develop a focused and efficient organization of task-related neural networks (Milton et al., 2007). It appears that the involvement of the executive functions associated with frontal pathways decreases while the role of specialized posterior brain regions becomes more important when individuals are sufficiently trained in a cognitive task (Neubauer and Fink, 2009). Less brain activation in athletes in the present study may indicate that athletes have developed a focused and efficient organization of task-related neural networks and need less supervisory control while processing visuo-spatial information, and therefore exhibit "neural efficiency" during sports related and sports unrelated visuo-spatial tasks. In addition, this functional reorganization possibly applies not only to task-specific but also to general cognitive tasks.

According to the P-FIT, visual information is first processed in the temporal and occipital lobes (mainly BAs 18, 19, 37), including recognition and subsequent imagery and/or elaboration of visual input. This basic sensory/perceptual processing is then fed forward to the parietal cortex (mainly BAs 40, 7, 39), wherein structural symbolism, abstraction and elaboration emerge; at the same time, the parietal cortex interacts with frontal regions (mainly BAs 6, 9, 10, 45–47), which serve to generate various solutions to a given problem. Once the best solution is arrived upon, the anterior cingulate (BA 32) is engaged to constrain response selection and inhibit other competing responses (Jung and Haier, 2007). Less activation in brain areas including BAs 17, 18, 20, 21, BAs 7, 31, 40 and BAs 6 and 10 in athletes than in non-athletes during the visual-spatial tasks may suggest that athletes showed "neural efficiency" throughout the whole information processing flow, including the early processing of sensory information, the subsequent information integration, the information matching and identification, and the final response selection during these tasks.

Regarding the stimulus type effect, neuroimaging data demonstrated less brain activation under the sports related stimulus condition than under the sports unrelated stimulus condition in both athletes and non-athletes, except for the precuneus, which showed more activation under the sports related condition than under the sports unrelated condition in athletes. The precuneus is involved in the integration of external and internal information and can extract information from internal memory storage according to external stimuli (Ren, 2010). The increased precuneus activation in athletes during the sports related stimulus task possibly suggests that the processing of sports related stimulus information was based more on sports experience in athletes than in non-athletes.

Analyses of the combined data show that the results support our hypothesis. Athletes showed less brain activation during both the sports related and sports unrelated tasks. These findings are in accordance with previous studies reporting "neural efficiency" in athletes (Del Percio et al., 2008; Babiloni et al., 2010b). This "neural efficiency" may stem from long-term training, which enables athletes to develop a focused and efficient organization of task-related neural networks, and this functional reorganization is possibly task-specific. However, it should be noted that we conducted a cross-sectional study and our inferences were based entirely on comparing outcomes between the two groups of subjects; for this reason we cannot exclude the possibility that some differences already existed before the athletes began practicing sports. It is possible that young people with certain basic perceptual-motor skills received positive feedback during their first attempts to practice sports and became "athletes", while those who were less skilled gave up when they were young and became "non-athletes". Thus, both nature and experience probably contributed to the differences found in our research. One way to solve this problem may be to carry out a longitudinal study. As an alternative to studying the effects of long-term field training, other specific kinds of training, such as perceptual training, can be manipulated relatively easily. Previous research has shown that perceptual training can be effective, from the behavioral point of view, for both non-athletes (Savelsbergh et al., 2010; Ryu et al., 2013) and athletes (Farrow and Abernethy, 2002; Murgia et al., 2014) in a shorter time, i.e., weeks or months. Therefore, in order to explore the exact effect of training on the performance and brain activation pattern of athletes during cognitive tasks, future studies could compare the performance and brain activation pattern of a group (of either non-athletes or athletes) before and after a period of perceptual training with those of a matched control group.

# CONCLUSION

In summary, we used fMRI to investigate possible differences in brain activation between athletes and non-athletes in visual-spatial tasks. We found that athletes reacted faster than non-athletes during both the sports related and sports unrelated visuo-spatial tasks. Athletes showed decreased activation in cortical regions important for the early processing of sensory information, the subsequent information integration, the information matching and identification, and the final response selection. Taken together, our findings suggest that there is "neural efficiency" in athletes during visuo-spatial tasks. This "neural efficiency" may stem from long-term training, which prompts athletes to develop a focused and efficient organization of task-related neural networks, and this functional reorganization is possibly task-specific.

# AUTHOR CONTRIBUTIONS

ZG: literature research, study design, data acquisition/analysis/interpretation, manuscript preparation/editing/revision. AL: guarantor of integrity of entire study, manuscript final version approval. LY: literature research, statistical analysis, manuscript editing.

# REFERENCES

a high-resolution EEG study. Brain Res. Bull. 79, 193–200. doi: 10.1016/j.brainresbull.2009.02.001


**Conflict of Interest Statement**: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The reviewer DR and handling Editor declared their shared affiliation, and the handling Editor states that the process nevertheless met the standards of a fair and objective review.

Copyright © 2017 Guo, Li and Yu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Executive function and endocrinological responses to acute resistance exercise

#### *Chia-Liang Tsai<sup>1</sup>\*, Chun-Hao Wang<sup>1</sup>, Chien-Yu Pan<sup>2</sup>, Fu-Chen Chen<sup>3</sup>, Tsang-Hai Huang<sup>1</sup> and Feng-Ying Chou<sup>1,4</sup>*

*<sup>1</sup> Institute of Physical Education, Health and Leisure Studies, National Cheng Kung University, Tainan, Taiwan*

*<sup>2</sup> Department of Physical Education, National Kaohsiung Normal University, Kaohsiung, Taiwan*

*<sup>3</sup> Department of Recreational Sport and Health Promotion, National Pingtung University of Science and Technology, Pingtung, Taiwan*

*<sup>4</sup> Chi Mei Medical Center, Tainan, Taiwan*

#### *Edited by:*

*Lynne Ann Barker, Sheffield Hallam University, UK*

#### *Reviewed by:*

*Lynne Ann Barker, Sheffield Hallam University, UK*

*Nicholas Morton, Rotherham, Doncaster and South Humber Mental Health NHS Foundation Trust, UK*

#### *\*Correspondence:*

*Chia-Liang Tsai, Lab of Cognitive Neurophysiology, Institute of Physical Education, Health and Leisure Studies, NO. 1, University Road, Tainan 701, Taiwan e-mail: andytsai@mail.ncku.edu.tw*

This study had two aims: first, to explore the effects of acute resistance exercise (RE, i.e., using exercise machines to contract and stretch muscles) on behavioral and electrophysiological performance when performing a cognitive task involving executive functioning in young male adults; second, to investigate the potential biochemical mechanisms of such facilitative effects using two neurotrophic factors [i.e., growth hormone (GH) and insulin-like growth factor-1 (IGF-1)] and the cortisol levels elicited by such an exercise intervention mode with two different exercise intensities. Sixty young male adults were recruited and randomly assigned to a high-intensity (HI) exercise group, a moderate-intensity (MI) exercise group, or a non-exercise-intervention (NEI) group. Blood samples were taken, and the behavioral and electrophysiological indices were simultaneously measured when individuals performed a Go/No-Go task combined with the Eriksen flanker paradigm at baseline and after either an acute bout of 30 min of moderate- or high-intensity RE or a control period. The results showed that acute RE not only benefited the subjects' behavioral performance (i.e., RTs and accuracy), as found in previous studies, but also increased the P3 amplitude. Although the serum GH and IGF-1 levels were significantly increased by moderate- or high-intensity RE in both the MI and HI groups, the elevated serum levels of these neurotrophic factors had significantly decreased about 20 min after exercise. In addition, such changes were not correlated with the changes in cognitive (i.e., behavioral and electrophysiological) performance. In contrast, the serum levels of cortisol in the HI and MI groups were significantly lower after acute RE, and the changes in cortisol levels were significantly associated with the changes in electrophysiological (i.e., P3 amplitude) performance. The findings suggest that the beneficial effects of acute RE on executive functioning could be due to changes in arousal, possibly modulated by serum cortisol levels.

**Keywords: resistance exercise, cognition, electrophysiological, behavior, cortisol, GH, IGF-1**

# **INTRODUCTION**

Participation in physical activity has been demonstrated to be associated with changes in cognitive performance involving executive functioning (Etnier et al., 1997). Accordingly, many studies have attempted to explore the changes in cognitive performance that occur after a bout of acute exercise. With regard to resistance exercise, a growing body of work has strongly supported the view that executive functioning performance is enhanced via chronic resistance exercise (Perrig-Chiello et al., 1998; Ozkaya et al., 2005; Cassilhas et al., 2007; Liu-Ambrose et al., 2010), and that such a facilitative effect, as measured by behavioral indices, can also be found via acute resistance exercise (Chang et al., 2012, 2014). However, while previous studies have implicated the physiological (e.g., arousal) or hormonal (e.g., neurotrophic factors) responses to acute exercise intervention as the basis of any improvements in behavioral performance following physical exercise (Magnie et al., 2000; Joyce et al., 2009; Lambourne and Tomporowski, 2010; Dietrich and Audiffren, 2011; McMorris et al., 2011; Pesce et al., 2011; Tsai et al., 2014), no research has yet been conducted to explore the potential mechanisms underlying this process using electrophysiological and biochemical markers.

Cognitive performance after a bout of acute exercise could be influenced by the intensity of exercise, which could be attributed to the secreted levels of biochemical markers (e.g., neurotrophins and cortisol) or the states of arousal (Kashihara et al., 2009). Biopsychological arousal theory proposes that physiological responses to exercise mediate changes in a number of aspects of psychological functioning (e.g., cognitive functioning) through their direct effects on energetic arousal (EA) (Oweis and Spinks, 2001). However, the level of EA after exercise is strongly associated with the degree of exercise intensity (Thayer, 1989), with the proposed positive effects of a moderate arousal level on cognitive performance being based on the inverted-U theory (Oweis and Spinks, 2001). As the amounts of secreted biochemical markers are closely related to the level of physical activity, the effects of acute resistance exercise on cognitive (i.e., behavioral and electrophysiological) performance may be particularly relevant to an investigation of the various components of exercise, including intensity. Given the conceptual links between the secreted levels of circulating biomarkers [e.g., cortisol, growth hormone (GH) and insulin-like growth factor-1 (IGF-1)] and resistance exercise intensity (Schwarz et al., 1996; McGuigan et al., 2004; Irving et al., 2009; Wahl et al., 2010), examining whether different intensities (high vs. moderate) of acute exercise influence the exercise-cognition relationship would seem logical in any attempt to understand how acute resistance exercise may benefit cognitive performance.

Cortisol, a glucocorticoid hormone produced by the adrenal cortex, is a corticosteroid released in response to stress as the end product of the hypothalamic-pituitary-adreno-cortical (HPA) system (Henckens et al., 2012). Increased cortisol levels are often related to stress, which could result in dysfunction of neuronal plasticity, neurogenesis, or remodeling of the hippocampus, since the steroids inhibit glucose transport in hippocampal neurons and glia (Sapolsky, 1993; Duman, 2002). Cortisol can lead to arousal, as its release can limit the synthesis of the adrenocorticotrophin hormone (ACTH) and corticotrophin releasing hormone (CRH), both of which modulate arousal (Lambourne and Tomporowski, 2010). Although exercise is also thought of as a stressor, an acute bout of aerobic as well as resistance exercise can increase the arousal status and neural activation, which further facilitate the central executive function related to the hippocampus and frontal lobe (Magnie et al., 2000; Lambourne and Tomporowski, 2010; Dietrich and Audiffren, 2011; Pesce et al., 2011; Chang et al., 2012, 2014). Given the conceptual links between cortisol and exercise, cortisol may be related to the effects of exercise on cognition (Henckens et al., 2012). Indeed, a previous study found that the beneficial effects of a single bout of acute exercise on cognitive performance could be attributed to acute decreases in cortisol levels (Heaney et al., 2013).
However, although earlier work has demonstrated that cortisol can modulate cognitive performance involving executive functions (e.g., inhibitory control, attention, and memory) (Vedhara et al., 2000; Henckens et al., 2012), the effects of cortisol levels on cognition follow an inverted U-shaped curve (Lupien and McEwen, 1997), with moderate levels being positively associated with executive functioning (Blair et al., 2005), while highly elevated cortisol levels have been shown to interfere with the cognitive functions that are largely dependent on prefrontal networks (e.g., inhibitory control, attention regulation, and memory retrieval) (Kopell et al., 1970; Lupien and McEwen, 1997; Lyons et al., 2000; Quesada et al., 2012) and hippocampal functioning (e.g., declarative memory) (Almela et al., 2012). The detrimental effects on executive functions that rely on these brain areas could, to some extent, be due to a pronounced cortisol stress response (Almela et al., 2012; Quesada et al., 2012) or the inhibition of dopaminergic reward-seeking systems (Tops et al., 2004).

Additionally, although the improvement in cognitive functioning following physical exercise could be attributed to changes in the hormonal response, some exercise-sensitive biomarker secretions seem to be characterized by different physiological and metabolic demands. For example, aerobic exercise can effectively increase serum brain-derived neurotrophic factor (BDNF) concentrations, whereas resistance exercise can effectively change serum GH and IGF-1 concentrations (Neeper et al., 1995; Cassilhas et al., 2007; Seo et al., 2010; Gregory et al., 2013). Indeed, resting IGF-1 concentrations are increased after short-term resistance training (Borst et al., 2001), but decreased after short-term endurance training (Nemet et al., 2004). Additionally, a single bout of resistance exercise, but not aerobic exercise, is a physiological stimulus for acute increases in GH and IGF-1 levels (Gregory et al., 2013). Taken together, these findings indicate that the circulating responses of these neurotrophic factors are specific to different exercise modes.

GH and IGF-1 are signaling peptides which can cross the blood-brain barrier and bind to receptors in the central nervous system (Sonntag et al., 2005). Given the importance of the GH/IGF-1 axis for the growth of neurons and glial cells and for myelination (Sonntag et al., 2005), the capacity of resistance exercise to alter GH and IGF-1 concentrations might have important implications for cognitive performance. Indeed, there is growing evidence for a significant association between the GH/IGF-1 axis and such performance. Previous studies have demonstrated that serum IGF-1 levels (Rollero et al., 1998; Aleman et al., 1999, 2001; Kalmijn et al., 2000; Dik et al., 2003), GH levels (Quik et al., 2012), and the IGF-1/GH ratio (Morley et al., 1997) are associated with behavioral performance (e.g., information processing speed, target detection and response speed, short-term memory, and visual/auditory learning), and that acute resistance exercise can significantly increase serum GH and IGF-1 levels (Nicklas et al., 1995; Rubin et al., 2005). We thus hypothesized that the changes in serum levels of GH and IGF-1 following a bout of acute resistance exercise would be positively correlated with executive functioning performance.

Previous studies have reported that executive functions are more strongly affected by physical activity or exercise than other aspects of cognitive functioning (Etnier et al., 1997; Colcombe and Kramer, 2003), and acute exercise has been suggested to selectively augment executive function performance involving inhibitory control and attention (Drollette et al., 2012). Previous studies have demonstrated that, relative to the resting session, young adults exhibited not only higher response accuracy and shorter reaction time (RT), but also larger P3 amplitudes (i.e., devoting more attentional resources) following a bout of acute aerobic exercise when performing a modified flanker task (Hillman et al., 2003; Kamijo et al., 2009). While P3 event-related potential is known to be related to inhibition and attentional resource allocation (Jonkman et al., 2003; Tsai et al., 2009), a better understanding of the electrophysiological changes (e.g., P3 amplitude) underlying any improvements in behavioral performance may provide insights into the specific component processes involved in cognitive control that are modulated by acute resistance exercise (Hillman et al., 2009).

Although a number of studies have demonstrated that a bout of acute resistance exercise can effectively enhance behavioral indices (Chang et al., 2012, 2014), no research has yet been conducted on the effects of such an acute exercise mode on electrophysiological performance. Therefore, the first aim of this study was to elucidate the effects of a bout of moderate- or high-intensity resistance exercise on behavioral (i.e., RT and accuracy) and electrophysiological (i.e., P3 amplitude) performance using a Go/No-Go task combined with the Eriksen Flanker paradigm in young males. Since this task engages inhibitory control and attention, and previous studies have demonstrated that cognitive performance can be effectively enhanced in adolescents (Hogan et al., 2013) and young adults (Ruchsow et al., 2005) after a bout of acute exercise, we hypothesized that moderate- or high-intensity acute resistance exercise could produce different degrees of beneficial effects on cognition with regard to behavioral and electrophysiological performance in both exercise-intervention (EI) [i.e., moderate-intensity (MI) and high-intensity (HI)] groups relative to those seen in the non-exercise-intervention (NEI) group.

In addition, since no studies have examined the potential biochemical mechanisms underlying the beneficial effects of acute resistance exercise on cognitive performance, the second aim of this study was to explore this issue further. Based on previous research, we postulated that the two EI groups would show different changes in serum levels of biochemical markers, which would result in different effects on cognitive performance.

# **MATERIALS AND METHODS**

# **PARTICIPANTS**

Sixty male participants aged between 20 and 29 were recruited from the same university and randomly assigned to a high-intensity (HI) exercise group (*n* = 20), moderate-intensity (MI) exercise group (*n* = 20), or non-exercise-intervention (NEI) group (*n* = 20). Only male participants were selected in the current study, because research has shown that gender differences exist in the responses to resistance training, such as those that affect endocrine functioning (Staron et al., 1994). Thus a mixed-gender group may lead to disproportionate improvements in muscle and metabolic functions between male and female participants, which would presumably affect any related changes with regard to both cognitive and biochemical indices (Rubia et al., 2010). All participants were nonsmokers, right-handed, as assessed by a handedness inventory (Chapman and Chapman, 1987), and had normal or corrected-to-normal vision. They were asked to complete a medical history and demographic questionnaire, and reported being free of any psychiatric or neurological disorders, cardiovascular or metabolic diseases, or medication intake that would influence central nervous system functioning. Additionally, they completed the International Physical Activity Questionnaire (IPAQ) (Craig et al., 2003) and the Physical Activity Readiness Questionnaire (PARQ) (Thomas et al., 1992) to avoid potential risk factors that might be exacerbated during acute resistance exercise. None of the participants showed any symptoms of cognitive impairment or depression, as separately measured by the Mini-Mental State Examination (MMSE, all scored above 24) (Folstein et al., 1975) and Beck Depression Inventory II (BDI-II, all scored below 13) (Beck et al., 1996). All the participants provided written informed consent to participate in the experiment, which was approved by the Institutional Ethics Committee.
As shown in **Table 1**, the three groups were matched in age, body mass index (BMI), BDI-II, MMSE, and IPAQ, as well as resting HR (all *p* > 0.05).

# **PROCEDURE**

The participants were required to make two visits to the cognitive neurophysiology laboratory. On the first visit the research assistant explained the experimental procedure, and asked the participants to complete an informed consent form, a medical history and demographic questionnaire, and the MMSE, BDI-II, IPAQ, and PARQ. Their height and weight were also measured to calculate their BMI. Two certified fitness instructors then completed all assessments of one-repetition maximum (1-RM) and peak muscle power for each participant. All participants in the MI and HI groups were familiarized with the exercise machines before the acute exercise intervention took place.

The second visit took place in the morning 2 days later. To prepare for it, the participants were asked to refrain from strenuous exercise and alcohol intake for 24 h, and food and caffeine were also prohibited for 3 h before exercising, since both caffeine and food consumption are associated with increases in P3 amplitude (Geisler and Polich, 1990; Dixit et al., 2006) and biochemical markers (e.g., cortisol) (Wu, 2014). On arriving at the laboratory, each participant was fitted with a Polar heart rate (HR) monitor (RX800CX, Finland), and was then asked to sit in an adjustable chair in front of a computer screen (with a width of 43 cm) in an acoustically shielded room with dimmed lights. Body temperature and resting HR were measured. An electrocap and electro-oculographic (EOG) electrodes were attached to each participant's scalp and face before the cognitive task test. The viewing distance was approximately 75 cm. Ten practice trials were carried out to familiarize the participants with the procedure of the cognitive task. Blood was then withdrawn and the formal cognitive test was immediately administered and electrophysiological signals recorded. The HI and MI groups then performed approximately 40 min of high-intensity (80% 1RM) and moderate-intensity (50% 1RM) acute resistance exercise on the exercise machines, respectively, with 10 min of warm-up and 30 min of core content. The core resistance exercise consisted of the following exercises in the order stated: bench presses, biceps curls, triceps extensions, leg presses, vertical butterflies, and leg extensions. Both HI and MI groups performed the resistance exercise for two sets of 10 repetitions, at an average speed, with a 90-s rest between sets, and a 2-min interval between each different exercise. Since exercise-induced hyperthermia and tachycardia are associated with systematic changes in the P3 component (e.g., decrease in P3 latency) (Geisler and Polich, 1990), body temperature was measured and HR was assessed with a Polar HR monitor after the acute resistance exercise, with both measurements taken every 3 min. Once the participants' body temperature and HR had returned to within 10% of pre-exercise levels (on average about 5 min after a bout of acute resistance exercise), blood was immediately withdrawn from them, and they then completed the cognitive task as their event-related potentials (ERPs) were recorded. Additional blood samples were then taken after the subjects had completed the cognitive task.

**Table 1 | Demographic characteristics (mean ±** *SD***) of the two exercise intervention groups and one non-exercise-intervention group.**


*HI, high-intensity; MI, moderate-intensity; NEI, non-exercise-intervention; MMSE, Mini Mental State Examination; BDI, Beck Depression Inventory; HR, heart rate; IPAQ, International Physical Activity Questionnaire. The three groups were not significantly different at baseline for any variable.*

With regard to the NEI group, after the first cognitive test they took a rest of about 45 min, during which they read magazines, and then completed the cognitive test again. All of the participants performed the cognitive test at the same time of day to control for circadian effects.
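The loading and recovery rules in the procedure can be sketched in a few lines of Python. This is a minimal illustration and not the authors' software: the function names are hypothetical, and reading "within 10% of pre-exercise levels" as a symmetric ±10% band around the baseline value is an assumption.

```python
# Fractions of the one-repetition maximum (1-RM) stated in the procedure.
LOAD_FRACTION = {"HI": 0.80, "MI": 0.50}


def training_load(one_rm_kg: float, group: str) -> float:
    """Return the load (kg) for one exercise as a fraction of its 1-RM."""
    return one_rm_kg * LOAD_FRACTION[group]


def has_recovered(current: float, baseline: float, tolerance: float = 0.10) -> bool:
    """True when a measurement lies within +/-10% of its pre-exercise value.

    The symmetric band is an assumption; the text only says "within 10%".
    """
    return abs(current - baseline) <= tolerance * baseline


def ready_for_post_test(hr: float, hr_rest: float,
                        temp_c: float, temp_rest_c: float) -> bool:
    """Both heart rate and body temperature must be back near baseline."""
    return has_recovered(hr, hr_rest) and has_recovered(temp_c, temp_rest_c)
```

In this sketch a participant with a resting HR of 60 bpm would be cleared for the post-exercise blood draw and cognitive test once HR is at or below 66 bpm and body temperature is also back within its band.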

# **COGNITIVE TASK PARADIGM**

Since acute exercise has been suggested to selectively augment executive functioning performance involving inhibitory control and attention (Drollette et al., 2012), a Go/No-Go task combined with the Eriksen Flanker paradigm was used in this study (Ruchsow et al., 2005). Eight different letter strings (i.e., congruent: UUUUU, BBBBB, VVVVV, DDDDD; and incongruent: VVUVV, DDBDD, DDVDD, BBDBB) were presented on the computer screen in a randomized order with equal probability. Participants had to focus on the target letter in the middle of an array. Upon the appearance of the letters U and B (Go condition), the participants had to respond as quickly and accurately as they could, using their right index finger to press the "M" key and their left index finger to press the "X" key, respectively. In contrast, they were told not to press any key if the letters D and V appeared in the middle of an array (No-Go condition). The whole experiment consisted of two blocks of 200 trials each, with 200 Go trials and 200 No-Go trials in total. All letter strings were presented in white text for 200 ms against a black background on a laptop computer monitor. The participants had to respond within 1800 ms. There was an interval of 2000 ms between trials. All participants performed the cognitive task with concomitant electrophysiological recording. After a practice block of 10 trials to make sure the participants understood the whole experimental procedure, the formal test was administered to collect RT, accuracy rate, and ERP data. The total duration of the cognitive test was approximately 15 min.
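The stimulus-to-response mapping described above can be made explicit with a small Python sketch. The function and dictionary names are hypothetical; the letter strings, target letters, and key assignments are taken from the task description.

```python
# Central target letter -> key to press (U: right index on "M", B: left index on "X").
GO_KEYS = {"U": "M", "B": "X"}
# Central target letters that require the response to be withheld.
NO_GO_TARGETS = {"D", "V"}


def classify_stimulus(letters: str) -> dict:
    """Label a five-letter flanker string by congruency and Go/No-Go type."""
    target = letters[len(letters) // 2]            # middle letter of the array
    congruent = all(ch == target for ch in letters)
    if target in GO_KEYS:
        trial_type, expected_key = "go", GO_KEYS[target]
    elif target in NO_GO_TARGETS:
        trial_type, expected_key = "no-go", None   # no key press expected
    else:
        raise ValueError(f"unexpected target letter: {target!r}")
    return {
        "congruency": "congruent" if congruent else "incongruent",
        "trial_type": trial_type,
        "expected_key": expected_key,
    }


STIMULI = ["UUUUU", "BBBBB", "VVVVV", "DDDDD", "VVUVV", "DDBDD", "DDVDD", "BBDBB"]
labels = {s: classify_stimulus(s) for s in STIMULI}
```

Applied to the eight strings, this yields the four conditions used in the analyses (congruent-go, congruent-no-go, incongruent-go, incongruent-no-go), with two strings per condition.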

# **ELECTROPHYSIOLOGICAL RECORDING AND ANALYSIS**

Electroencephalographic (EEG) activity was recorded using 18 electrode sites (F7, F8, F3, F4, Fz, T3, T4, C3, C4, Cz, T5, T6, P3, P4, Pz, O1, O2, Oz) mounted in an elastic electrode cap (Quik-Cap, Compumedics Neuroscan, Inc., El Paso, TX) designed for the International 10-20 System. To monitor possible artifacts due to eye movements, horizontal and vertical bipolar electrooculographic activity (HEOG and VEOG) was recorded using adhesive electrodes placed on the supero-lateral right canthus and below and lateral to the left eye. Scalp locations were referred to linked mastoid electrodes, while a ground electrode was placed on the mid-forehead on the Quik-Cap. All electrode impedances were below 5 kΩ. EEG data acquisition employed an A/D rate of 500 Hz/channel, a band-pass filter of 0.1–50 Hz, and a 60 Hz notch filter, with continuous writing to hard disk for off-line analysis using SCAN4.3 analysis software (Compumedics Neuroscan, Inc., El Paso, TX) (Tsai et al., 2014).

Trials with a response error, or with an RT quicker than 150 ms or slower than 800 ms (regarded as anticipatory or delay errors, respectively), were excluded from the analysis. This time window excluded all responses greater than two standard deviations from the mean response of each group, thereby removing outliers that could skew the group means. The ERP epoch was defined as 200 ms pre-stimulus to 1200 ms post-stimulus onset. Trials containing ocular artifacts during the recording epoch were also discarded from further analysis, with a threshold of 100 µV in the vertical and horizontal electrooculograms being set for this. The remaining ERP data were averaged across epochs according to condition. The stimulus-elicited P3 component, defined as the major positive deflection after the stimulus over the scalp (i.e., Fz, Cz, and Pz) occurring at 250–500 ms for Go-P3 and 350–550 ms for No-Go-P3, was identified and averaged across the three electrodes, with correction for differences in the 200 ms pre-stimulus baseline (Kato et al., 2009).
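The trial screening and P3 measurement rules above can be illustrated with a short Python sketch. It is not the SCAN4.3 pipeline used in the study: the function names are hypothetical, a single averaged channel is handled as a plain list of microvolt values, and taking the maximum within the window is one simple reading of "major positive deflection".

```python
SAMPLING_RATE_HZ = 500   # A/D rate reported in the recording section
EPOCH_START_MS = -200    # epoch spans 200 ms pre-stimulus to 1200 ms post-stimulus
GO_WINDOW_MS = (250, 500)
NO_GO_WINDOW_MS = (350, 550)


def keep_trial(rt_ms: float, correct: bool,
               fastest_ms: float = 150, slowest_ms: float = 800) -> bool:
    """Keep correct responses whose RT lies inside the 150-800 ms window."""
    return correct and fastest_ms <= rt_ms <= slowest_ms


def ms_to_index(t_ms: float) -> int:
    """Convert a latency relative to stimulus onset into a sample index."""
    return round((t_ms - EPOCH_START_MS) * SAMPLING_RATE_HZ / 1000)


def p3_peak_amplitude(epoch_uv: list, window_ms: tuple) -> float:
    """Baseline-corrected maximum positive value inside the latency window."""
    n_baseline = ms_to_index(0)                      # samples before stimulus onset
    baseline = sum(epoch_uv[:n_baseline]) / n_baseline
    start, stop = ms_to_index(window_ms[0]), ms_to_index(window_ms[1])
    return max(epoch_uv[start:stop + 1]) - baseline


# Synthetic averaged waveform: flat at 2 uV with a 12 uV peak at 400 ms
# and a larger deflection at 600 ms that falls outside both P3 windows.
epoch = [2.0] * 700
epoch[ms_to_index(400)] = 12.0
epoch[ms_to_index(600)] = 50.0
go_p3 = p3_peak_amplitude(epoch, GO_WINDOW_MS)
```

Averaging the resulting values over Fz, Cz, and Pz would then give the per-condition P3 amplitude described in the text.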

# **BLOOD SAMPLING AND ANALYSIS**

Blood samples were obtained by a phlebotomist at three time points (T1: before the 1st cognitive task test; T2: before the 2nd cognitive task, about 5 min after acute exercise; and T3: immediately after the 2nd cognitive task test). The blood samples were withdrawn from the antecubital vein via an aseptic technique for analysis of serum IGF-1, GH, and cortisol. At the T2 and T3 time points, blood samples were obtained via an indwelling catheter located in a forearm vein. Following each sample collection, the catheter was flushed with sterile saline to prevent clot formation, and the catheter was cleared of saline prior to each sample collection. The blood was allowed to clot (BD Vacutainer Plus), and then centrifuged at 3000 rpm for 15 min at 4°C (Hettich Mikro 22R, C1110). Each sample was frozen and stored at −80°C for further serum marker assays. Serum values of GH and IGF-1 were determined by a chemiluminescence immunoassay method using an Access Ultrasensitive hGH reagent pack (Beckman Coulter Inc, USA) and Liaison IGF-1 reagent (DiaSorin S.P.A., Italy), respectively. The levels of serum cortisol were analyzed by an enzyme-linked immunosorbent assay (ELISA) using cortisol kits (JL840685/R06, Abbott, Abbott Park, Illinois, USA). The detection limits of these methods were 0.002 ng/mL for GH and 3 ng/mL for IGF-1. The whole procedure for the determination of the three biochemical markers was performed by the same person to avoid inter-operator bias.

#### **STATISTICAL ANALYSIS**

For the behavioral (i.e., RTs and accuracy) and electrophysiological (i.e., P3 amplitude) analyses, all outcome measures from the acute bout of resistance exercise were analyzed with a three-way repeated measures analysis of variance (ANOVA) [i.e., time (pre- vs. post-exercise) × group (HI vs. MI vs. NEI) × condition (behavioral: congruent-go vs. incongruent-go; electrophysiological: congruent-go vs. congruent-no-go vs. incongruent-go vs. incongruent-no-go)]. Where a significant difference occurred, Bonferroni *post-hoc* analyses were performed. For the serum analysis, two-way repeated measures ANOVAs with *post-hoc* Bonferroni tests were used to assess the effects of time (T1 vs. T2 vs. T3) and group (MI vs. HI vs. NEI). The homogeneity of variance and normality assumptions were confirmed by Levene's and Kolmogorov-Smirnov tests, respectively. The significance levels of the *F* ratios were adjusted with the Greenhouse-Geisser correction for the violation of the assumption of sphericity when the degrees of freedom were more than one. The effect size (partial eta squared, η<sub>p</sub><sup>2</sup>) is also reported to complement the use of significance testing. The following conventions were adopted to determine the magnitude of the mean effect size: <0.08 (small effect size), between 0.08 and 0.14 (medium effect size), and >0.14 (large effect size). A value of *p* < 0.05 was considered to be significant.
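Partial eta squared can be recovered from a reported *F* ratio and its degrees of freedom through the standard identity η<sub>p</sub><sup>2</sup> = (*F* × df<sub>effect</sub>) / (*F* × df<sub>effect</sub> + df<sub>error</sub>). The following Python sketch applies that identity together with the effect size conventions stated above; the function names are hypothetical and the snippet is an illustration, not the authors' analysis code.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared computed from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


def effect_size_label(eta_p2: float) -> str:
    """Apply the conventions used here: <0.08 small, 0.08-0.14 medium, >0.14 large."""
    if eta_p2 < 0.08:
        return "small"
    if eta_p2 <= 0.14:
        return "medium"
    return "large"


# Example with F(1, 57) = 42.25, the main effect of Time on RTs reported in the Results.
eta = partial_eta_squared(42.25, 1, 57)
print(round(eta, 2), effect_size_label(eta))  # prints: 0.43 large
```

The same check on *F*(2, 57) = 3.78 gives 0.12, a medium effect under these conventions, matching the value reported for the *Time* × *Group* interaction on RTs.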

# **RESULTS**

#### **BEHAVIORAL PERFORMANCE**

### *Reaction time*

As shown in **Figure 1**, there were main effects of *Time* [*F*(1, 57) = 42.25, *p* < 0.001, η<sub>p</sub><sup>2</sup> = 0.43] and *Condition* [*F*(1, 57) = 221.94, *p* < 0.001, η<sub>p</sub><sup>2</sup> = 0.80] on RTs, indicating that RTs were faster after (283.24 ms) than before (301.29 ms) acute exercise, and faster in the congruent-go (283.89 ms) than in the incongruent-go (300.65 ms) condition. The *Time* × *Group* interaction [*F*(2, 57) = 3.78, *p* = 0.029, η<sub>p</sub><sup>2</sup> = 0.12] on RTs was also significant. *Post-hoc* analyses showed that RTs were significantly faster after than before acute exercise in both the HI (pre- vs. post-exercise = 299.49 ± 47.90 vs. 275.02 ± 47.56 ms, *p* < 0.001) and MI (pre- vs. post-exercise = 309.57 ± 69.15 vs. 287.21 ± 68.91 ms, *p* < 0.001) groups in the Go condition.

### *Accuracy rate*

There was a main effect of *Condition* [*F*(1, 57) = 23.11, *p* < 0.001, η<sub>p</sub><sup>2</sup> = 0.29] on the accuracy rate, suggesting that the accuracy rate of the No-Go condition was higher in the congruent condition (99.2%) than in the incongruent one (98.3%). The *Time* × *Group* [*F*(2, 57) = 3.61, *p* = 0.033, η<sub>p</sub><sup>2</sup> = 0.11] and *Time* × *Group* × *Condition* [*F*(2, 57) = 3.22, *p* = 0.048, η<sub>p</sub><sup>2</sup> = 0.10] interactions on the accuracy rate were also significant. *Post-hoc* analyses showed that the accuracy rate in the incongruent-no-go condition was significantly higher after than before acute exercise in both the HI (pre- vs. post-exercise = 98.00 ± 2.03% vs. 98.75 ± 1.21%, *p* = 0.024) and MI (pre- vs. post-exercise = 98.10 ± 2.10% vs. 98.85 ± 1.27%, *p* = 0.015) groups.

#### **ELECTROPHYSIOLOGICAL PERFORMANCE**

The grand averaged ERP waveforms obtained for the three groups are shown in **Figure 2**. There were significant effects of *Group* [*F*(2, 57) = 3.34, *p* = 0.042, η<sub>p</sub><sup>2</sup> = 0.11] and *Time* [*F*(2, 57) = 39.91, *p* < 0.001, η<sub>p</sub><sup>2</sup> = 0.41] on P3 amplitudes. No significant differences were observed between the three groups in the averaged P3 amplitude across the four conditions before acute exercise, whereas after the intervention the HI (13.58 ± 7.35 µV) and MI (14.91 ± 4.49 µV) groups showed significantly larger P3 amplitudes than the NEI (8.34 ± 3.42 µV) group (HI vs. NEI, *p* = 0.012; MI vs. NEI, *p* = 0.001). The *Group* × *Time* interaction [*F*(2, 57) = 7.10, *p* = 0.002, η<sub>p</sub><sup>2</sup> = 0.20] on P3 amplitudes was also significant. *Post-hoc* analyses revealed that the P3 amplitudes were significantly larger after than before acute exercise in both the HI (pre- vs. post-exercise: 8.24 ± 6.98 vs. 13.58 ± 7.35 µV, *p* < 0.001) and MI (pre- vs. post-exercise: 8.40 ± 4.86 vs. 14.91 ± 4.49 µV, *p* < 0.001) groups.

# **BIOCHEMICAL INDICES**

# *Growth Hormone (GH)*

As seen in **Figure 3**, there were significant effects of *Group* [*F*(2, 57) = 13.56, *p* < 0.001, η<sub>p</sub><sup>2</sup> = 0.32] and *Time* [*F*(2, 114) = 22.53, *p* < 0.001, η<sub>p</sub><sup>2</sup> = 0.28], and a significant effect of *Group* × *Time* [*F*(4, 114) = 12.83, *p* < 0.001, η<sub>p</sub><sup>2</sup> = 0.31], on GH levels. *Post-hoc* analyses revealed that no significant differences in the GH levels were observed between the three groups at the T1 time point. The GH levels at the T2 (HI vs. MI: *p* < 0.001; HI vs. NEI: *p* < 0.001) and T3 (HI vs. MI: *p* = 0.013; HI vs. NEI: *p* = 0.001) time points were found to be significantly higher in the HI group as compared to the NEI and MI groups. In addition, in the HI group, the GH level was found to increase significantly at both the T2 and T3 time points relative to the T1 time point (T1 vs. T2: *p* < 0.001; T1 vs. T3: *p* = 0.004), and decrease significantly at the T3 relative to the T2 time point (*p* = 0.003). Moreover, in the MI group, the GH level was found to increase significantly at both the T2 and T3 time points relative to the T1 time point (T1 vs. T2: *p* = 0.003; T1 vs. T3: *p* = 0.020). No significant correlations emerged among the changes in GH levels and changes in behavioral and electrophysiological performances with acute exercise in any of the EI groups.

**FIGURE 2 | Grand averaged ERP waveforms (Fz, Cz, and Pz) in the congruent-go, congruent-no-go, incongruent-go, and incongruent-no-go conditions for the two exercise intervention (i.e., HI: high-intensity and MI: moderate-intensity) groups before and after an acute bout of resistance exercise and one non-exercise-intervention (NEI) group before and after rest.**

#### *Insulin-like growth factor-1 (IGF-1)*

There were significant effects of *Group* [*F*(2, 57) = 3.83, *p* = 0.028, η<sub>p</sub><sup>2</sup> = 0.12] and *Time* [*F*(2, 114) = 10.14, *p* < 0.001, η<sub>p</sub><sup>2</sup> = 0.15] on IGF-1 levels, as well as a significant *Group* × *Time* effect [*F*(4, 114) = 5.90, *p* < 0.001, η<sub>p</sub><sup>2</sup> = 0.17]. *Post-hoc* analyses revealed that no significant differences were found for the T1 time point among the three groups. However, the IGF-1 levels at the T2 (*p* = 0.003) and T3 (*p* = 0.039) time points were found to be significantly higher in the HI group as compared to the NEI group. In addition, in the HI group, the serum IGF-1 level was found to increase significantly at the T2 time point relative to T1 (*p* < 0.001), and to decrease significantly at the T3 relative to the T2 time point (*p* = 0.002). Moreover, in the MI group the serum IGF-1 level was found to increase significantly at the T2 and T3 time points relative to the T1 time point (T1 vs. T2: *p* = 0.010; T1 vs. T3: *p* = 0.005). No significant correlations emerged among the changes in IGF-1 levels and changes in behavioral and electrophysiological performances with acute exercise in any of the EI groups.

#### *Cortisol*

There was a significant effect of *Time* [*F*(1, 57) = 21.40, *p* < 0.001, η<sub>p</sub><sup>2</sup> = 0.27] on cortisol levels, and a significant *Group* × *Time* effect [*F*(4, 114) = 6.10, *p* = 0.001, η<sub>p</sub><sup>2</sup> = 0.18]. *Post-hoc* analyses revealed that no significant differences were observed between the cortisol levels of the three groups at the T1 time point. However, in the HI group the cortisol level was found to decrease significantly at the T2 time point (T1 vs. T2: *p* = 0.023), with the decrease approaching significance at the T3 time point (T1 vs. T3: *p* = 0.062), relative to T1. In the MI group the cortisol level was found to decrease significantly at both the T2 and T3 time points relative to T1 (both T1 vs. T2 and T1 vs. T3: *p* < 0.001). The serum cortisol levels did not change significantly between T2 and T3.

Additionally, changes in cortisol levels were significantly correlated with changes in electrophysiological performance (i.e., P3 amplitude) with acute exercise in the MI (T2 vs. T1: *r* = −0.50, *p* = 0.024; T3 vs. T1: *r* = −0.51, *p* = 0.020) and HI (T2 vs. T1: *r* = −0.49, *p* = 0.029; T3 vs. T1: *r* = −0.58, *p* = 0.007) groups.
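A change-score correlation of this kind can be sketched in Python as follows. The helper names are hypothetical, the snippet is not the authors' analysis code, and the group size of *n* = 20 used in the example assumes that every participant contributed data to the correlation, which the text does not state explicitly.

```python
from math import sqrt


def change_scores(pre: list, post: list) -> list:
    """Per-participant change (post minus pre), e.g., T2 minus T1."""
    return [b - a for a, b in zip(pre, post)]


def pearson_r(x: list, y: list) -> float:
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def t_statistic(r: float, n: int) -> float:
    """t value for testing r against zero, with n - 2 degrees of freedom."""
    return r * sqrt((n - 2) / (1 - r ** 2))


# Assumed n = 20 per group; r = -0.50 is the MI-group value for T2 vs. T1.
t_mi = t_statistic(-0.50, 20)
```

Under that assumption the *t* value for *r* = −0.50 is about −2.45 on 18 degrees of freedom, which exceeds in magnitude the two-tailed 5% critical value of roughly 2.10 and is therefore consistent with a significant negative association.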

#### **DISCUSSION**

The purposes of the present study were to investigate the effects of acute resistance exercise on executive functions when individuals performed a Go/No-Go task combined with the Eriksen Flanker paradigm, and to explore the potential biochemical mechanisms in relation to two neurotrophic factors (i.e., GH and IGF-1) and the cortisol biomarker. Even though the young males performed almost at ceiling and showed fast responses before the intervention when performing the cognitive task, acute resistance exercise, irrespective of high or moderate exercise intensity, could still affect cognitive [behavioral (e.g., RTs and accuracy rate) and electrophysiological (i.e., P3 amplitude)] performance when they performed an executive function/attentional control task. In terms of biochemical markers, such exercise prescriptions significantly increased the serum levels of two neurotrophic factors (i.e., GH and IGF-1) and lowered the serum levels of cortisol in both EI groups compared to baseline. Only the cortisol levels remained stable for about 20 min after exercise, and the correlations between changes in serum cortisol levels and changes in electrophysiological (i.e., P3 amplitude) performance reached a significant level in both EI groups.

Several experimental studies have demonstrated that acute aerobic exercise can improve cognitive performance when subjects perform cognitive tasks involving executive functioning (Hillman et al., 2003, 2009; Davranche et al., 2009; Tsai et al., 2014). Recently, two studies also demonstrated that acute resistance exercise could significantly shorten RTs and raise accuracy rates when participants performed executive functioning tasks (e.g., the Stroop test and the Tower of London task), suggesting that such an exercise intervention could improve behavioral performance (Chang et al., 2012, 2014). In line with the findings of these previous studies, the RTs in the present study were significantly improved in both EI groups when performing the modified Flanker task after acute resistance exercise compared to pre-exercise. In addition, the finding that the accuracy rate in the incongruent-no-go condition was significantly higher after than before acute resistance exercise in both EI groups was also in agreement with that of a previous work (Hillman et al., 2009), which found that response accuracy was generally improved following acute aerobic exercise compared to pre-exercise, with better performance during incongruent conditions requiring greater amounts of inhibition when performing a modified Flanker task. Indeed, previous studies also reported that the beneficial effects of acute exercise could be confined to inhibitory control and attention (e.g., Drollette et al., 2012). However, given the results of the present study, where exercise had similar effects on behavioral performance in both EI groups, it appears that, regardless of whether the intensity is moderate or high, acute resistance exercise could be a viable approach to enhancing the executive functions involving inhibitory control and attention in young male adults.
The improved behavioral performance with regard to central executive functioning found in this work could be related to increases in neural activation or general physiological arousal (Magnie et al., 2000; Joyce et al., 2009; Lambourne and Tomporowski, 2010; Dietrich and Audiffren, 2011; Pesce et al., 2011).

Indeed, the amplitude of the P3 potential increased following a bout of acute resistance exercise relative to the pre-exercise levels in both the EI groups in the present study, in a similar manner to the beneficial effects of acute aerobic exercise reported in previous studies (Hillman et al., 2003, 2009; Tsai et al., 2014). Since the P3 amplitude is proportional to the amount of attentional resources allocated to a task (Tsai et al., 2009), the findings of the current study suggest that young male adults could attain more efficient cognitive processing when performing the cognitive task after a bout of acute resistance exercise. In addition, this study also found that significant increases in P3 amplitude were observed across all three cortical sites (Fz, Cz, and Pz), supporting Polich and Kok's (1995) view that, since variations in scalp topography were not observed following fluctuations in biological state in their work, the effects of exercise on P3 amplitude occur in a global fashion. Moreover, Yanagisawa et al. (2010) used multichannel functional near-infrared spectroscopy (fNIRS) measurements and found that acute exercise could improve executive functions and increase cortical activation, suggesting that significant increases in oxy-Hb signaling could be the potential mechanism underlying such changes. Therefore, the increased oxygenation and blood flow in the brain after acute exercise is likely the cause of the neural activation (i.e., greater P3 amplitudes) found in the present study. Additionally, recent research found that glucose moderates the magnitude of the P3 ERP component when individuals performed the oddball task (Riby et al., 2008) and acute resistance exercise can facilitate blood glucose control and insulin secretion (Balaguera-Cortes et al., 2011; Moreira et al., 2012).
It is possible that, since performing the executive function task in the present study requires simultaneous inhibitory control and attention, greater demands on metabolic resources could be compensated following a single bout of resistance exercise. Such an effect further facilitated the P3 performance. Another potential mechanism for the facilitative effects of acute resistance exercise on electrophysiological performance and executive function performance may be the changes in biochemical markers (i.e., GH and IGF-1) that occur in the central nervous system (Kashihara et al., 2009).

In this study we investigated two neurotrophic factors (i.e., GH and IGF-1) which play central roles in the health of neurons in the brain, since previous studies investigating the effect of resistance training on executive functions mostly discussed the potential mechanisms using these (Cassilhas et al., 2007, 2010; Seo et al., 2010). This is perhaps due to the fact that the secretions of these two biomarkers are exercise-sensitive, and occur in relation to specific physiological and metabolic demands (Gregory et al., 2013). In the current study, serum GH and IGF-1 levels were significantly increased in both EI groups after acute resistance exercise, supporting the findings of previous studies which demonstrated that a single bout of resistance training can significantly increase serum GH levels, which could produce a subsequent increase in its secondary mediator (i.e., IGF-1) (Nicklas et al., 1995), and that trained men could increase circulating IGF-1 responses with a bout of acute resistance exercise (Rubin et al., 2005). However, this study found that changes in the levels of both neurotrophic factors were not significantly correlated with the changes in behavioral and electrophysiological performance in the healthy young male adults when performing the cognitive task. These results do not stand alone, and are somewhat in agreement with prior studies which found that there was no relationship between serum IGF-1/GH concentrations and specific aspects of cognitive-behavioral measurements and electrophysiological performance (e.g., N2b) when middle-aged to elderly adults performed a Go/No-Go task (Papadakis et al., 1995; Aleman et al., 1999; Quik et al., 2012). 
The possible explanations for the lack of correlation are as follows: (1) the serum levels of GH and IGF-1 were significantly decreased from the T2 to the T3 time points in this study; (2) even though the serum GH and IGF-1 levels increased much more after high-intensity resistance exercise in the HI group than in the MI group, these changes still did not show a very strong positive correlation with the changes in cognitive performance; and (3) the beneficial effects of acute resistance exercise on behavioral and electrophysiological performance might be explained in terms of heightened arousal, due to exercise-induced changes in reallocation of mental resources and metabolic rate (Audiffren, 2009). However, it is worth pointing out that GH and IGF-1 are important molecular mediators of neural efficiency in the human brain (Sonntag et al., 2005). Previous studies demonstrated that regular, long-term resistance exercise is associated with increases in circulating GH and IGF-1 concentrations in young males (Ballard et al., 2005; Willoughby et al., 2007), and that changes in the resting serum IGF-1 concentrations after 12 months of resistance exercise were significantly correlated with the changes in RTs and P3 amplitude in healthy elderly subjects (Tsai et al., under review). We therefore cannot rule out the potential roles of GH and IGF-1 in the beneficial effects of regular, long-term resistance exercise on cognitive functioning, since these acute effects in the present study could be washed out due to the limited timeframe of exercise.

Given that cortisol is indicative of arousal, the results of this study support the view that the participants' level of arousal was altered after the acute exercise intervention, and this seems to have positively affected their cognitive processes (Lambourne and Tomporowski, 2010), leading to faster RTs and a higher accuracy rate following a bout of acute resistance exercise in both MI and HI groups. Since high levels of cortisol have a detrimental effect on executive functions (e.g., inhibitory control and attention regulation) (Kopell et al., 1970; Lupien and McEwen, 1997; Lyons et al., 2000), the beneficial effects of acute resistance exercise on behavioral and electrophysiological performance for both MI and HI groups in the present study could be attributed to the reduced cortisol levels. The post-exercise concentrations of cortisol that were found in the present study were significantly lower than the pre-exercise ones in both EI groups, echoing earlier studies which found that the cortisol concentration is modulated by acute exercise, and significantly decreases immediately post-exercise and for up to 1–2 h post-exercise compared to pre-exercise (Kemmler et al., 2003; Heaney et al., 2013). Henckens et al. (2012) found that cortisol can modulate emotional and attentive processing, and that high circulating corticosteroid levels could negatively influence the function of the amygdala in executive networks. In addition, lower circulating corticosteroid levels modulate the neural correlates of sustained attention by reducing cuneus activity, which might shift the brain back from a stimulus-driven response mode to a more controlled mode, and restore proper brain functioning in the aftermath of stress. 
It is worth pointing out that the serum cortisol level was found to significantly decrease at T2 and T3 relative to the T1 time point, and did not significantly change between T2 and T3 in either EI group in the present study, suggesting that the serum cortisol levels had stabilized by at least 20 min after resistance exercise. These findings might partly support the relationship between changes in cortisol levels and cognitive performance.

However, while the changes in cortisol levels were significantly correlated with those in P3 amplitude, they were not significantly correlated with the changes in RTs and accuracy rate in either the MI or the HI group in the current study, suggesting that such a biochemical marker might be more sensitive to the electrophysiological index than to the behavioral measures. Indeed, cortisol is associated with an increased activation of the alerting/arousal component of attention (Lambourne and Tomporowski, 2010; Schulz et al., 2013), and thus resting EEG has been shown to be related to cortisol levels (Schulz et al., 2013). Previous studies also found that exercise may serve to increase neuronal synchrony, as alpha wave activation increased after a bout of acute exercise (Kubitz and Mott, 1996; Kubitz and Pothakos, 1997), and such a biological state (i.e., increased resting EEG alpha power) is related to P3 potential (Bashore, 1989; Lardon and Polich, 1996; Polich, 1997). Taken together, these results indicate that acute exercise could change the serum cortisol concentrations and the level of alpha wave activity, which in turn modulate the P3 amplitude, as seen in the current study when the participants performed a cognitive task involving executive control after acute resistance exercise. However, the positive cortisol effect on P3 amplitude induced by acute exercise in the present study needs to be explored in further experiments using longer regimes. In addition, although previous studies have not demonstrated the relationship between the P3 amplitude and cortisol levels, some studies demonstrated that ERN amplitudes predicted reduced cortisol increases during a Stroop task (Amen et al., 2008), while higher ERN amplitudes were associated with a greater decrease in cortisol during a task session (Tops et al., 2006). 
Therefore, the results of the current study might be able to explain how the reduced cortisol levels following acute resistance exercise could efficiently increase the P3 amplitude when the participants performed the cognitive task. These findings are of interest because such cortisol effects, which seem to be specific to short-burst exercise, might have important ramifications for neurorehabilitation after brain injury and in the elderly.

Although the electrophysiological and biochemical findings of the present study could extend the current knowledge base regarding the beneficial effects of acute resistance exercise on behavioral performance, the current work has the following limitations, which must be addressed. First, there are well-documented sex-specific differences in endocrine responses to acute resistance exercise, with higher basal GH levels and augmented (or at least equivalent) GH responses to a bout of acute resistance exercise found in males (Kraemer et al., 1991; Hakkinen et al., 2000). Further research is thus warranted in this area, possibly examining the relationships between acute resistance exercise and the neurotrophic factors in females. Second, since the blood samples in the current study were taken around 2–3 h after waking, the participants would have already experienced the large decrease in cortisol that occurs following the cortisol awakening response, which could lead to a favorable endocrine profile (Heaney et al., 2013). An identical experimental design, but carried out in the afternoon, is thus required to assess whether the post-exercise cortisol responses would be similar to those that the present study obtained in the morning. Third, since cortisol levels could also be influenced by an acute bout of aerobic exercise in young males (Kanaley et al., 2001), and such changes could affect cognitive performance (Kashihara et al., 2009), further research should explore the correlations between these in this context. In addition, since we only investigated the effects of acute exercise on the Go/No-Go task combined with the Eriksen flanker paradigm, it might also be worth considering using a broader array of cognitive measures and establishing whether the effects are different depending upon task demands.

In conclusion, this study found that a bout of moderate- or high-intensity resistance exercise could impact not only behavioral (i.e., RTs and accuracy rate) but also electrophysiological (i.e., P3 amplitude) performance in young male adults when performing a cognitive task involving executive functions. Although significantly different serum levels of neurotrophic factors (i.e., GH and IGF-1) could be secreted with different exercise intensities, there were no significant correlations between changes in the two neurotrophic factors and cognitive performance, possibly because the serum concentrations quickly returned to basal levels after exercise. The potential mechanisms underlying the changes in cognitive performance after acute resistance exercise found in the young male adults could be due to changes in arousal levels, possibly modulated by cortisol.

# **AUTHOR CONTRIBUTIONS**

Dr. Chia-Liang Tsai designed the study and wrote the protocol and the first draft of the manuscript. Dr. Chun-Hao Wang analyzed the data. Dr. Chien-Yu Pan and Dr. Fu-Chen Chen worked on the revision of the manuscript. Dr. Tsang-Hai Huang helped collect and analyze the blood samples. Research assistant Mrs. Feng-Ying Chou helped collect data.

# **ACKNOWLEDGMENTS**

This research was supported by a grant from the National Science Council in Taiwan (NSC 100-2410-H-006-074-MY2). The authors are also grateful to the participants who gave their precious time to facilitate the work reported here.

# **REFERENCES**


Kubitz, K. A., and Pothakos, K. (1997). Does aerobic exercise decrease brain activation? *J. Sport Exerc. Psychol.* 19, 291–301.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 23 March 2014; accepted: 14 July 2014; published online: 01 August 2014. Citation: Tsai C-L, Wang C-H, Pan C-Y, Chen F-C, Huang T-H and Chou F-Y (2014) Executive function and endocrinological responses to acute resistance exercise. Front. Behav. Neurosci. 8:262. doi: 10.3389/fnbeh.2014.00262*

*This article was submitted to the journal Frontiers in Behavioral Neuroscience.*

*Copyright © 2014 Tsai, Wang, Pan, Chen, Huang and Chou. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Chronic exercise keeps working memory and inhibitory capacities fit

# *Concepción Padilla<sup>1,2</sup>\*, Laura Pérez<sup>1,2</sup> and Pilar Andrés<sup>1,2</sup>*

*<sup>1</sup> Neuropsychology and Cognition group, Department of Psychology and Research Institute on Health Sciences, University of the Balearic Islands, Palma de Mallorca, Spain*

*<sup>2</sup> Instituto de Investigación Sanitaria de Palma, Palma de Mallorca, Spain*

#### *Edited by:*

*Lynne Ann Barker, Sheffield Hallam University, UK*

#### *Reviewed by:*

*Irini Skaliora, Biomedical Research Foundation of the Academy of Athens, Greece; Stefano Sensi, University of California, Irvine, USA*

#### *\*Correspondence:*

*Concepción Padilla, Neuropsychology and Cognition group, Department of Psychology, University of the Balearic Islands, Ctra. de Valldemossa, Km 7.5, Palma de Mallorca 07122, Balearic Islands, Spain*

*e-mail: c.padilla@uib.es*

Padilla et al. (2013) recently showed that chronic aerobic exercise in young adults is associated with better inhibitory control as measured by the strategic Stop Signal Task (SST). The aim of the current study was to explore whether better inhibitory abilities, associated with high levels of physical fitness, were also associated with higher working memory capacity (WMC) in young healthy adults. Participants aged between 18 and 30 years and showing different levels of fitness confirmed by the Rockport 1-mile walking fitness test took part in this study. Active and passive participants were administered the SST to measure inhibitory control, and the Automatic Operation Span (AOSPAN) to measure verbal WMC. We first replicated Padilla et al.'s results showing that exercise specifically modulates strategic inhibitory processes. Our results also showed that active participants presented with better WMC than sedentary ones, showing a better capacity to manage simultaneously two verbal tasks and to inhibit interference. The results point to an association between chronic exercise, inhibitory abilities, and WMC. The theoretical relationship between these variables will be discussed.

**Keywords: working memory, inhibition control, aerobic exercise, young adults**

# **INTRODUCTION**

Executive functions can be described as an umbrella term covering a family of controlled (as opposed to automatic) processes, which can be separated into three core functions: working memory, inhibition and cognitive flexibility (Diamond, 2013). Together, these functions make it possible to carry out more complex functions such as reasoning, problem solving and planning.

Executive functions and one of its subcomponents – working memory capacity (WMC) – have been shown to be relevant in the efficient cognitive functioning and in the progression of several developmental and neuropsychological disorders, which has resulted in a pursuit of therapeutic ways to decelerate the deterioration of such capacities. This is the case of cognitive training (e.g., Klingberg, 2010) and cardiovascular activity (e.g., Colcombe et al., 2004; Weinstein et al., 2012), both based on the principle of neuroplasticity across the lifespan. In the case of cognitive training through computer programs, transfer to other tasks that are not directly trained has not yet been clearly demonstrated in any age group (Owen et al., 2010; Shipstead et al., 2012). However, cardiovascular exercise has shown its involvement in the improvement of a wide range of executive functions in children (Hillman et al., 2011), young (Padilla et al., 2013; Pérez et al., submitted), and older populations (Erickson and Kramer, 2009), which is believed to be mediated by the release of neurotrophic factors, such as brain-derived neurotrophic factor (BDNF), insulin-like growth factor type 1 (IGF-1) and vascular endothelial growth factor (VEGF). In turn these factors are associated with the increase of the temporal (Voss et al., 2013) and prefrontal lobes' (Colcombe et al., 2003, 2004, 2006) volume and connectivity. Furthermore, aerobic exercise has been related to an increment in brain vascularity in cortical areas and the hippocampus (Lopez-Lopez et al., 2004). Nevertheless, the effects obtained with both interventions, cognitive training and cardiovascular exercise, have not yet been demonstrated to be maintained in the long-term (Lustig et al., 2009).

Prefrontal areas and associated executive functions (Colcombe et al., 2004, 2006) seem more sensitive to the beneficial effect of exercise than other areas, as several studies with seniors, children, or clinical populations have revealed (Tomporowski et al., 2008a; Hertzog et al., 2009; Davis et al., 2011; Chang et al., 2012). Even short-term aerobic exercise programs performed over a 6-month period by older populations have been shown to improve executive functions and increase the volume of some areas of the brain (Colcombe et al., 2006). However, in a recent review Guiney and Machado (2013) revealed that there is a lack of studies investigating the effects of aerobic exercise in young cohorts, and the few that have been published have yielded mixed results. Differences between active and sedentary groups are found using evoked potentials, but not in behavioral data (Hillman et al., 2006; Themanson et al., 2006; Guiney and Machado, 2013).

Previous studies with young participants have mainly concentrated on the effects of acute exercise, that is, the immediate effect of intensive exercise such as cycling or running carried out before or while the participant is doing the cognitive task (e.g., Themanson and Hillman, 2006; Huertas et al., 2011; see Guiney and Machado, 2013 for a review). The effects of this kind of exercise are temporary and not representative of the brain changes produced by long-term exercise. Moreover, these studies do not control for the level of exercise that participants carried out throughout childhood, which has been shown to exert an important influence on brain development (Chaddock, 2012). Etnier and Chang (2009) have also noted that previous studies have focused on a broad range of executive tasks, resulting in mixed results.

Arguably, theoretically well-grounded tasks would make it possible to isolate the specific processes affected by exercise. In addition, it is necessary to focus on the effects of chronic exercise, rather than short-term exercise, on cognition, because chronic exercise is more likely to produce permanent changes in the brain: it is undertaken following a long-lasting routine that will generate a protective cognitive reserve (Stern, 2009). Finally, it is important to note that the selection criteria for participants, and the difference between active and passive participants in amounts of exercise and levels of fitness, may also contribute to an effect of exercise on executive control. The greater the difference between active and passive participants in these variables, the more likely it is that differences in cognition will be observed.

To address these problems, Padilla et al. (2013) investigated for the first time the effects of chronic exercise on executive functions in young adults using a highly reliable executive control task: the stop signal task (SST; Logan and Cowan, 1984; Verbruggen and Logan, 2008), which assesses motor inhibitory control associated with frontal lobe functions (Weinstein et al., 2012). In this case, participants were assessed using standard and strategic versions of the SST, and the results revealed better inhibitory abilities in active participants when the task was more executively demanding (strategic version). Pérez et al. (submitted) obtained similar results using the Attention Network test (ANT, Fan et al., 2002), with physically active participants revealing better performance in the executive network.

To understand the cause of these results, in the current study we asked to what extent these differences in inhibitory control were related to a better WMC in physically active participants. WM is a system for temporarily storing and managing the information required to carry out complex cognitive tasks such as learning, reasoning, and comprehension. According to Kane and Engle (2002), WMC is a hierarchical system that consists of two components: executive-attention and short-term memory. These authors equate WMC with executive functions, blurring the distinction between them. They maintain that this system allows for the proper allocation of attentional resources and the active maintenance of the information needed to accomplish goal-directed behavior or reasoning, while avoiding interference from other external stimuli or thoughts. In this online processing, several processes come into play: the storage and rehearsal of domain-specific information, as well as other executive functions (Conway et al., 2005) and controlled attention to sustain, divide and switch the focus of attention (Engle et al., 1999). A crucial point in this model is the relationship between WMC and inhibition. Engle and collaborators argue that inhibition and WMC correlate with each other, and Redick et al. (2007) suggest that WMC affects the ability to inhibit at any of the following stages: access, deletion or restraint (see Hasher et al., 1999, for this distinction between inhibitory functions). In this vein, several studies using the extreme groups method (i.e., selecting the participants whose scores in a working memory task are below the 25th percentile or above the 75th percentile of the normal distribution) have shown that high WMC participants present with better inhibitory abilities than low WMC participants on tasks such as the flanker task (Redick and Engle, 2006), antisaccades (Unsworth et al., 2004) and proactive interference (Redick et al., 2007, 2011). 
Moreover, WM and inhibition have been associated with dorsal prefrontal cortex activation (Kane and Engle, 2002; Andrés, 2003).

Few studies have focused on the role that WMC could be playing in the associations between physical exercise and executive functions. In the case of young populations, Hansen et al. (2004) demonstrated that fitter young adults showed better accuracy in a 2-back task, and Lambourne (2006) observed that active participants showed a higher WMC than passive participants in a reading span task. However, Kamijo et al. (2010) did not find better performance in a Sternberg task in fitter young adults compared to sedentary ones. Finally, it is important to note that none of these studies measured WMC and inhibition concurrently.

To this end, we used Engle and colleagues' WM tasks, which involve the performance of two tasks at the same time. They require maintaining a variable number of items in mind while solving complex problems. They are good predictors of performance on other higher-level cognitive tasks, such as the Stroop task or fluid intelligence tests, as well as of conditions and behaviors such as Alzheimer's disease, alcohol consumption or stress management (Engle et al., 1999; Unsworth et al., 2005). It has been shown that WM tasks have high reliability and validity (Conway et al., 2002). The Automatic Operation Span Task (AOSPAN; Unsworth et al., 2005) measures the phonological loop and the central executive component of WM, which is highly associated with controlled processing and attention (Baddeley, 1986, 1996).

In the present study we investigated three hypotheses. First, we wanted to replicate the results observed in our previous study (Padilla et al., 2013) showing better inhibitory abilities in physically active participants using the strategic version of the SST, which makes greater demands on executive resources, as will be explained in the Methods section. Second, we predicted that aerobic exercise would enhance WMC. Third, we evaluated to what extent the active group's better inhibitory control could be linked to a greater WMC, as suggested by Kane and Engle (2002). Thus, we expected a strong relationship between the inhibitory control shown under the strategic instructions and WMC. A greater WMC should be associated with a better ability to inhibit interference.

The results confirmed the advantage in inhibition of active participants previously observed (Padilla et al., 2013), i.e., the physically active group showed a speeded inhibitory response when strategic instructions were applied, but most importantly, the whole active group exhibited a greater WMC. The possible relationship between these variables will be discussed.

# **METHODS**

#### **PARTICIPANTS**

Fifty-eight participants aged between 18 and 30 years (*M* = 22.26, *SD* = 3.26) were assigned to the active or passive group according to their fitness levels (see **Table 1** for demographic details). The active group comprised 29 participants with an average age of 22.21 (*SD* = 3.28), while the passive group consisted of 29 individuals with an average age of 22.31 (*SD* = 3.29). Each of these groups was further subdivided into standard (*n* = 14) and strategic (*n* = 15) subgroups according to the version of the SST that they performed. Participants were allocated to the active group if they had been doing aerobic exercise for at least 10 years, following a minimum routine of 6 h per week, distributed across at least 3 days a week. Participants were allocated to the passive group if they had not exercised for more than 1 h per week during the last 4 years. Any exercise practiced by the passive group during the last 4 years could not have been cardiovascular (e.g., yoga and stretching were allowed). In addition, they should not have done more than 6 h per week of aerobic exercise during their childhood (from 0 to 12 years old). This criterion was applied taking into account the fact that children in Spanish schools have at least 3 h per week of physical education.

#### **Table 1 | Demographic variables.**

*Average and SDs (in brackets) for Age (age of participants at the moment of testing), Education (number of completed years of formal education), Vocabulary (WAIS' Vocabulary subtest score) and Rockport test (Rockport Fitness Walking Test score). \*Effect at p < 0.001.*
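The allocation criteria described for the two groups can be expressed as a simple screening rule. The following is a minimal sketch in Python, not the procedure actually used by the authors: the `ExerciseHistory` fields and the `allocate_group` function are hypothetical names introduced for illustration, and the thresholds are taken directly from the criteria stated in the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExerciseHistory:
    """Hypothetical summary of one telephone-interview record."""
    years_aerobic: float              # years of continuous aerobic exercise
    weekly_aerobic_hours: float       # current aerobic routine, hours per week
    weekly_aerobic_days: int          # days per week over which it is spread
    recent_weekly_hours: float        # any exercise, last 4 years, hours per week
    recent_cardiovascular: bool       # was the recent exercise cardiovascular?
    childhood_weekly_hours: float     # aerobic exercise from 0 to 12 years old


def allocate_group(h: ExerciseHistory) -> Optional[str]:
    """Return 'active', 'passive', or None if neither set of criteria is met."""
    if (h.years_aerobic >= 10
            and h.weekly_aerobic_hours >= 6
            and h.weekly_aerobic_days >= 3):
        return "active"
    if (h.recent_weekly_hours <= 1
            and not h.recent_cardiovascular
            and h.childhood_weekly_hours <= 6):
        return "passive"
    return None  # volunteer does not qualify for either group
```

Under this sketch, a volunteer who exercises moderately (for example, 3 h per week of running) falls into neither group, which reflects the extreme-groups logic of the selection criteria.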

Before the participants started the testing, they were interviewed by telephone following a questionnaire about demographic data, lifelong exercise routines, medical history and education. If they fulfilled the requirements to participate, they were invited to come to our university facilities to perform the testing in a 2 h session. All participants gave their informed consent and were paid or given course credits if they were students. The experiment was performed in accordance with the ethical standards stated in the 1964 Declaration of Helsinki. Each activity group was subdivided in two other groups depending on the SST instructions they received: strategic or standard.

#### **CARDIORESPIRATORY CAPACITY**

As in Padilla et al.'s (2013) study, maximal oxygen uptake was estimated with the Rockport 1-mile Fitness Walking Test (Kline et al., 1987). This test was chosen due to its high correlation coefficient (0.88) with a direct index of VO<sub>2</sub>max obtained using a treadmill (Kline et al., 1987; Weiglein et al., 2011). VO<sub>2</sub>max is the maximal rate of oxygen uptake that the organism can achieve; in this test it is estimated from performance during sub-maximal exercise. The higher the score, the higher the aerobic capacity and oxygen uptake.
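For readers unfamiliar with the test, the estimate is obtained from a regression on body weight, age, sex, walking time and final heart rate. The article does not state the regression itself; the sketch below uses the coefficients commonly attributed to Kline et al. (1987), which should be treated as an assumption here and checked against the original source. The function name and argument names are hypothetical.

```python
def rockport_vo2max(weight_lb: float, age_years: float, is_male: bool,
                    time_min: float, heart_rate_bpm: float) -> float:
    """Estimate VO2max (ml/kg/min) from a 1-mile walk.

    Coefficients are the commonly cited Rockport regression values
    (assumed, not taken from this article).
    """
    sex = 1.0 if is_male else 0.0
    return (132.853
            - 0.0769 * weight_lb        # body weight in pounds
            - 0.3877 * age_years        # age in years
            + 6.315 * sex               # 1 for male, 0 for female
            - 3.2649 * time_min         # time to walk 1 mile, in minutes
            - 0.1565 * heart_rate_bpm)  # heart rate at the end of the walk


# Illustrative values only: a 22-year-old man weighing 154 lb who walks
# the mile in 13 min and finishes with a heart rate of 140 bpm.
example = rockport_vo2max(154, 22, True, 13.0, 140)
```

Because walking time and final heart rate both enter with negative coefficients, a faster walk completed at a lower heart rate yields a higher estimated VO<sub>2</sub>max.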

#### **DESIGN AND PROCEDURE**

First, a telephone interview was carried out to gather demographic information about each participant. They were asked about their level of education and medical history in order to exclude participants who suffered, or had suffered in the past, from any mental disorder or physical illness that could affect the results. In addition, they were asked about the frequency of exercise they had done throughout their lives. If they met the criteria to participate, they were invited to come to our facilities to take part in our study.

In a 2 h session, participants completed a more detailed health questionnaire with an experienced clinical psychologist, where they had to specify whether they were having any mental or physical problem and/or taking any medication at that time. After that, they carried out the SST and AOSPAN tasks in a quiet room. Later, they completed the Wechsler Adult Intelligence Scale Vocabulary subtest and finally they completed the Rockport 1-mile Fitness Walking Test on the University campus, where they had to walk 1 mile as fast as possible to measure their initial and final pulse and the time they took to complete the distance.

#### *SST task*

The SST (Verbruggen et al., 2008; Padilla et al., 2013) was presented on an LG computer with a 19-inch Philips monitor with a resolution of 1024 × 768 pixels. The task was programmed using E-Prime software (Schneider et al., 2002). Participants were seated approximately 50 cm from the screen and wore headphones.

As described in Verbruggen et al. (2008), the SST begins with the appearance of a fixation sign (+), followed by a white stimulus presented in the center of a black screen. Two types of trials were presented at random: GO trials (75%) and STOP trials (25%). In the GO trials, participants had to decide as fast as possible whether a geometric figure displayed on the screen was a square or a circle. They responded by pressing "Z" or "−" on the keyboard with their index fingers. In the STOP trials, the procedure was the same except that a tone was presented shortly after the geometric figure, and participants then had to inhibit their response. The interval between the geometric figure and the STOP signal followed a tracking procedure: when participants successfully withheld a response in a STOP trial, the interval between the figure and the stop signal was increased by 50 ms; when participants failed to withhold their response, it was decreased by 50 ms. In this way, the procedure converges on an interval at which the probability of correctly withholding a response is approximately 50% (see **Figure 1**).
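The tracking procedure can be illustrated with a short simulation. This is a minimal sketch rather than the E-Prime implementation used in the study: the starting delay of 250 ms, the fixed go RT of 500 ms and the fixed stop latency of 220 ms are assumptions chosen for illustration, and the simulated participant is a deliberately simple deterministic model in which stopping succeeds whenever the stop process finishes before the go response.

```python
def update_ssd(ssd_ms: int, inhibited: bool, step_ms: int = 50,
               min_ssd_ms: int = 0) -> int:
    """One step of the staircase: +50 ms after a successful stop,
    -50 ms after a failed stop (never below min_ssd_ms)."""
    new_ssd = ssd_ms + step_ms if inhibited else ssd_ms - step_ms
    return max(min_ssd_ms, new_ssd)


def run_staircase(n_stop_trials: int, start_ssd_ms: int = 250,
                  go_rt_ms: int = 500, ssrt_ms: int = 220) -> float:
    """Return the proportion of stop trials that were successfully inhibited.

    Assumed toy participant: the response is withheld when
    stop-signal delay + stop latency is shorter than the go RT.
    """
    ssd = start_ssd_ms
    successes = 0
    for _ in range(n_stop_trials):
        inhibited = ssd + ssrt_ms < go_rt_ms
        successes += inhibited
        ssd = update_ssd(ssd, inhibited)
    return successes / n_stop_trials


p_inhibit = run_staircase(100)  # alternates between 250 and 300 ms
```

With these illustrative values the delay oscillates around the point at which stopping and responding are equally likely, so half of the stop trials are inhibited. Real participants have variable RTs, so the convergence on 50% is approximate rather than exact.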

The assessment procedure was the same as in Padilla et al. (2013), maintaining the same task conditions: standard and strategic. The two versions of the task were identical except for the instructions. In the standard condition (Verbruggen et al., 2008), participants were told that a tone would be presented on 25% of trials. On half of these trials, it would appear very early and it would be relatively easy to withhold a response. On the other half of the STOP trials, the tone would come late, making it more difficult to inhibit the response. Importantly, participants were also warned that they should not postpone their responses while waiting for the potential occurrence of the stop signal. In the strategic condition, however, participants were just told how to respond to the stimulus that would appear on the screen, and asked to withhold their response when a sound appeared. They were asked not to wait to find out whether the sound would appear, but were otherwise free to apply whatever strategies they chose.

The variables measured by this test were: (a) the go RT: time to respond in the go trials; (b) the stop signal delay (SSD): mean delay between the visual and auditory stimuli across all stop signal trials; (c) the stop signal RT (SSRT): latency of the inhibition process, calculated by subtracting the mean SSD from the mean RT in go trials; (d) the signal respond RT (SRRT): mean time to respond incorrectly in the stop trials; (e) the percentage of correct responses in the go trials; and (f) the percentage of missed responses in the go trials.
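The SSRT calculation described in (c) amounts to a difference of two means. A minimal sketch follows, assuming RTs and delays are recorded in milliseconds; the function names and the sample values are hypothetical and are not data from the study.

```python
from statistics import mean


def ssrt_mean_method(go_rts_ms, ssds_ms):
    """SSRT = mean go RT minus mean stop-signal delay."""
    return mean(go_rts_ms) - mean(ssds_ms)


def percent(count, total):
    """Percentage helper for measures (e) and (f)."""
    return 100 * count / total


# Illustrative values only.
go_rts = [480, 520, 500]        # RTs on go trials
ssds = [250, 300]               # delays on stop trials
ssrt = ssrt_mean_method(go_rts, ssds)   # 500 - 275 = 225 ms
pct_correct = percent(114, 120)         # 95.0
```

A shorter SSRT indicates a faster inhibition process, which is why group differences on this measure are read as differences in inhibitory control.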

#### *Automated operation span*

AOSPAN (Unsworth et al., 2005) began with three practice blocks. The first, a letter practice block, trained participants to remember sets of two to five letters. After each set, 12 letters were displayed in a matrix and participants were asked to mark, in order, the letters that had been shown. A feedback message then informed participants of the number of letters correctly remembered. The second block provided practice in solving one-digit additions, subtractions, multiplications, or divisions as fast as possible. Participants were presented with an operation and clicked the left mouse button once they had the solution in mind; a screen with a number then appeared, and they had to decide whether that number was the correct solution to the problem ("true" or "false"). During this block, the software calculated the average time taken to solve these operations in order to set the maximum exposure time for the operations in the experimental block. In the third practice block, both tasks were combined as in the experimental block. Participants first had to remember, in order, the letters presented and then solve the arithmetic problems as fast as possible. This sequence continued for a variable number of trials, from two to five. Finally, participants had to report, in order, the letters that they remembered. The experimental block was similar to the last practice block, but the time to solve each arithmetic problem was limited to the average time calculated in the second practice block. The dependent variable was the total number of letters correctly recalled across all sets. This measure reflects WM *capacity* relatively uncontaminated by the processes involved in serial recall, which tend to be executive.
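The dependent variable can be illustrated with a short scoring sketch. It assumes that "correctly recalled" means a letter reported in its correct serial position, and that partially recalled sets still contribute; the text does not spell out either point, and the function names and sample sets are hypothetical.

```python
def letters_in_position(presented, recalled):
    """Count letters recalled in the same serial position as presented."""
    return sum(1 for p, r in zip(presented, recalled) if p == r)


def aospan_total(sets):
    """Sum correctly positioned letters over all (presented, recalled) sets."""
    return sum(letters_in_position(p, r) for p, r in sets)


# Three hypothetical sets of sizes 2, 3 and 4.
example_sets = [
    ("FK", "FK"),      # both letters correct
    ("JLP", "JPL"),    # only the first letter is in position
    ("QRST", "QRS"),   # last letter omitted
]
print(aospan_total(example_sets))
```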

#### **RESULTS**

The resulting four groups of participants did not differ in terms of age [$F(1, 54) = 0.003$, $\mathit{MSE} = 0.033$, $p = 0.955$, $\eta_p^2 = 0.000$] or years of formal education [$F(1, 54) = 1.046$, $\mathit{MSE} = 11.546$, $p = 0.311$, $\eta_p^2 = 0.019$] (see **Table 1**). They showed similar vocabulary levels, measured with the Vocabulary subtest of the Wechsler Adult Intelligence Scale (Wechsler, 1999) [$F(1, 54) = 2.469$, $\mathit{MSE} = 106.887$, $p = 0.122$, $\eta_p^2 = 0.044$]. Active participants had practiced cardiovascular exercise for an average of 204.621 months (17.052 years) and a total of 8488.107 h ($SD = 5008.938$) during their lives. The passive participants had practiced aerobic exercise earlier in their lives, that is, before the last 4 years and mostly during childhood (which we consider to span from 0 to 12 years of age), for an average of 67.414 months (5.618 years), with a mean of 1559.079 h ($SD = 1646.452$) across the lifespan. The average number of months [$F(1, 54) = 70.575$, $\mathit{MSE} = 269507.589$, $p < 0.001$, $\eta_p^2 = 0.567$] and hours of sport [$F(1, 53) = 46.882$, $\mathit{MSE} = 680356813.058$, $p < 0.001$, $\eta_p^2 = 0.469$] differed significantly between passive and active participants.

As expected, physically active participants showed higher scores than passive participants in the Rockport test: 57.517 ($SD = 8.381$) against 45.795 ($SD = 8.100$) [$F(1, 54) = 28.515$, $\mathit{MSE} = 1976.772$, $p < 0.001$, $\eta_p^2 = 0.346$], which means that active participants presented higher cardiovascular capacity as a consequence of their exercise routines.

The most important measures of the SST, which were used to test our hypotheses, were GO RTs (the time taken to respond to the primary or go task) and SSRT (the time required to inhibit an already initiated response, that is, inhibition control). These measures are presented in **Figures 2**, **3**, while all other measures (SSD, SRRT, percentages of correct and missed responses) are reported in **Table 2**.

A univariate 2 (group) × 2 (instructions) ANOVA carried out on the GO RTs revealed a significant effect of instructions [$F(1, 54) = 22.407$, $\mathit{MSE} = 546152.009$, $p < 0.001$, $\eta_p^2 = 0.293$], whereby the strategic version yielded longer RTs ($M = 838.678$, $SD = 140.506$) than the standard one ($M = 644.487$, $SD = 164.513$). No significant effect of group [$F(1, 54) = 0.004$, $\mathit{MSE} = 107.863$, $p = 0.947$, $\eta_p^2 = 0.000$] or instructions × group interaction [$F(1, 54) = 0.066$, $\mathit{MSE} = 1609.640$, $p = 0.798$, $\eta_p^2 = 0.001$] was found.
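The partial eta squared values reported with each $F$ test can be recovered from the $F$ statistic and its degrees of freedom through the standard identity $\eta_p^2 = F \cdot df_1 / (F \cdot df_1 + df_2)$. The sketch below applies it to the GO RT analysis as a consistency check; the function name is an assumption, and this is not a description of the software the authors used.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F values reported for the GO RT analysis, each tested on (1, 54) df.
reported = {
    "instructions": 22.407,
    "group": 0.004,
    "instructions x group": 0.066,
}
for effect, f_value in reported.items():
    print(effect, round(partial_eta_squared(f_value, 1, 54), 3))
```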

A univariate 2 (group) × 2 (instructions) ANOVA on the SSRT data revealed a trend for the effect of instruction [$F(1, 54) = 3.190$, $\mathit{MSE} = 10824.943$, $p = 0.080$, $\eta_p^2 = 0.056$] and, most importantly, a significant instruction × group interaction [$F(1, 54) = 4.227$, $\mathit{MSE} = 14344.243$, $p = 0.045$, $\eta_p^2 = 0.073$]. As in the study by Padilla et al. (2013), *t*-tests revealed that active participants exhibited faster SSRTs than passive participants in the strategic condition [$t(26) = -2.460$, $p = 0.021$, $d = -0.965$], but not in the standard condition [$t(28) = 0.488$, $p = 0.630$, $d = 0.184$]. Furthermore, active participants inhibited responses faster under strategic instructions than under standard instructions [$t(27) = -2.489$, $p = 0.019$, $d = -0.958$], while passive participants, in contrast, showed similar SSRTs regardless of instructions [$t(27) = 0.212$, $p = 0.834$, $d = 0.082$]. Finally, the comparison between SSRTs from the active participants in the strategic condition and the passive participants in the standard condition was just significant [$t(27) = -2.057$, $p = 0.050$, $d = -0.79$]. In sum, active participants in the strategic condition showed better inhibitory responses than the remaining groups, as can be seen in **Figure 3**.
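The effect sizes reported with these *t*-tests appear to be consistent with the common conversion $d = 2t / \sqrt{df}$. That is an inference from the reported numbers rather than something the text states, so the sketch below should be read as a plausibility check; the function name is hypothetical.

```python
import math


def cohens_d_from_t(t_value, df):
    """Approximate Cohen's d from an independent-samples t and its df."""
    return 2 * t_value / math.sqrt(df)


# (t, df, reported d) for the SSRT comparisons listed in the text.
comparisons = [
    (-2.460, 26, -0.965),
    (0.488, 28, 0.184),
    (-2.489, 27, -0.958),
    (0.212, 27, 0.082),
]
for t_value, df, reported_d in comparisons:
    print(round(cohens_d_from_t(t_value, df), 3), reported_d)
```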

Further univariate 2 (group) × 2 (instructions) ANOVAs were carried out on the remaining measures (SSD, SRRT, percentage of correct responses, and missed responses). They all revealed main effects of instructions (all $p$s $< 0.005$), but no significant effect of group or group × instructions interaction (all $p$s $> 0.341$). It is noteworthy that the number of responded trials decreased in the strategic version, but the number of errors remained the same between standard and strategic conditions (see **Table 2**).

Regarding WMC (**Figure 4**), a univariate 2 (group) × 2 (SST condition: strategic vs. standard) ANOVA showed a significant group effect [$F(1, 54) = 4.309$, $\mathit{MSE} = 539.745$, $p = 0.043$, $\eta_p^2 = 0.074$], revealing greater WMC for the active ($M = 54.414$, $SD = 8.471$) than for the passive participants ($M = 48.379$, $SD = 13.116$) [$t(56) = 2.081$, $p = 0.042$, $d = 0.556$]. There was no SST condition × group interaction [$F(1, 54) = 0.480$, $\mathit{MSE} = 60.159$, $p = 0.491$, $\eta_p^2 = 0.009$]. There were no differences between SST conditions: participants in the strategic group showed similar WMC to those in the standard group [$F(1, 54) = 0.013$, $\mathit{MSE} = 1.656$, $p = 0.909$, $\eta_p^2 = 0.000$]. A separate ANOVA was performed with only the participants who carried out the strategic version, which also revealed a significant group difference [$F(1, 26) = 4.254$, $\mathit{MSE} = 464.143$, $p = 0.049$, $\eta_p^2 = 0.141$].
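The WMC group comparison can be reproduced from the reported means and standard deviations with a pooled-variance *t*-test. The sketch assumes 29 participants per group, which is inferred from the degrees of freedom ($t(56)$ for the pooled groups, $t(27)$ within each group) and is not stated in this section; the function name is hypothetical.

```python
import math


def t_from_summary(m1, sd1, n1, m2, sd2, n2):
    """Independent-samples t (pooled variance) from summary statistics."""
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / standard_error


# Active vs. passive WMC; n = 29 per group is an assumption.
t_value = t_from_summary(54.414, 8.471, 29, 48.379, 13.116, 29)
print(round(t_value, 3))
```

Under that assumption, the result agrees with the reported $t(56) = 2.081$ to three decimal places.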

Having observed greater WMC in the physically active group, we ran an ANCOVA to evaluate the extent to which WMC could explain the differences between active and passive participants observed in the SSRT scores. The results showed that the instruction × group interaction [$F(1, 53) = 3.754$, $\mathit{MSE} = 12578.461$, $p = 0.058$, $\eta_p^2 = 0.066$] no longer reached statistical significance after controlling for WMC, indicating that WM capacity and SSRT scores were related to some extent. When the ANCOVA was applied only to the sample from the strategic version, the group effect did not reach statistical significance either [$F(1, 25) = 4.162$, $\mathit{MSE} = 13485.547$, $p = 0.052$, $\eta_p^2 = 0.143$].

Finally, the relation between exercise, WMC, and inhibition (SSRT in the strategic condition) was evaluated with a hierarchical multiple regression analysis that entered group (active vs. passive) as the predictor variable in step 1 and WM capacity in step 2, to evaluate its additional contribution. Simple correlation values for all pairs of variables are shown in **Table 3**. The $R^2$ in step 1 was 0.189, which was significant [$F(1, 26) = 6.054$, $\mathit{MS}_{\text{residual}} = 3155.189$, $p = 0.021$], indicating a relationship between exercise and inhibition. However, the $R^2$ change in step 2 was 0.010, which was not significant [$F(2, 25) = 3.105$, $\mathit{MS}_{\text{residual}} = 3240.495$, $p = 0.062$], indicating no significant relationship between inhibition and WM.
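The relation between $R^2$ and $F$ in a hierarchical regression can be made explicit with a small worked example. It assumes $n = 28$ (inferred from the degrees of freedom) and a step 2 $R^2$ of $0.189 + 0.010 = 0.199$; the function names are hypothetical. Under these assumptions, the reported $F(2, 25)$ matches the test of the full step 2 model, whereas a test of the increment alone would use $(1, 25)$ degrees of freedom.

```python
def f_overall(r2, n_predictors, n_obs):
    """F statistic for a regression model with the given R squared."""
    df_error = n_obs - n_predictors - 1
    return (r2 / n_predictors) / ((1 - r2) / df_error)


def f_change(r2_full, r2_reduced, n_added, n_predictors_full, n_obs):
    """F statistic for the R squared increment when predictors are added."""
    df_error = n_obs - n_predictors_full - 1
    return ((r2_full - r2_reduced) / n_added) / ((1 - r2_full) / df_error)


n = 28                       # assumed sample size in the strategic condition
r2_step1 = 0.189             # group only
r2_step2 = 0.189 + 0.010     # group + WM capacity (assumed sum)

print(round(f_overall(r2_step1, 1, n), 3))               # step 1 model, df (1, 26)
print(round(f_overall(r2_step2, 2, n), 3))               # step 2 model, df (2, 25)
print(round(f_change(r2_step2, r2_step1, 1, 2, n), 3))   # increment, df (1, 25)
```

The small gap between the step 1 value computed here and the reported 6.054 is expected, since the $R^2$ entered is rounded to three decimals.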

#### **DISCUSSION**

The aim of this study was to investigate inhibitory/executive control and WMC in physically active compared to passive participants. To this end, we used the SST (Verbruggen et al., 2008; Padilla et al., 2013) and the AOSPAN task (Unsworth et al., 2005) to evaluate inhibition and WMC, respectively. We also investigated to what extent the group differences observed in inhibition/executive control could be related to WMC.

**Table 2 | Stop signal task variables.**

| Measure | Active, strategic | Active, standard | Passive, strategic | Passive, standard |
|---|---|---|---|---|
| SSD\* (ms) | 611.91 (175.30) | 348.98 (181.84) | 547.28 (157.70) | 368.43 (150.03) |
| SRRT\* (ms) | 752.49 (165.76) | 569.67 (167.42) | 743.96 (136.37) | 545.43 (112.29) |
| Go accuracy\* (%) | 90.94 (11.34) | 97.55 (3.24) | 90.06 (12.89) | 98.32 (1.67) |
| Go miss\* (%) | 8.46 (11.48) | 1.79 (3.44) | 9.34 (13.12) | 0.89 (1.20) |
| Go error (%) | 0.60 (1.02) | 0.65 (1.11) | 0.60 (0.91) | 0.79 (0.87) |
| Stop accuracy (%) | 48.36 (6.63) | 52.10 (5.19) | 50.01 (8.36) | 50.05 (5.55) |

*Averages and SDs (in brackets) for SSD, delay between visual and auditory stimulus; SSRT, inhibition process latency; SRRT, RT of incorrect responses in the stop trials; Go accuracy, % correct responses in go trials; Go miss, missed responses in go trials; Go error, response errors in go trials; and Stop accuracy, correctly inhibited responses in stop trials; \*Effect of instructions at p < 0.005.*

**Table 3 | Correlations found between inhibition and multiple variables.**

*R is the Pearson Correlation and p is the p-value indicating the level of significance.*

Our results were in line with expectations in regard to the effect of task manipulations on performance: the strategic version of the SST gave rise to longer GO RTs than the standard version, replicating Padilla et al.'s (2013) findings. These results confirm that the strategic and standard versions of the SST are measuring different ways to deal with the task, with the strategic one allowing for the implementation of the "goal priority strategy" (Leotti and Wager, 2010; Sella et al., 2013), which consists of the lengthening of the GO RTs in order to improve the performance in the stop signal trials. When instructions allowed participants to apply a trade-off as in the strategic version, SST was analogous to a dual task, where each task must be carried out at the same level of accuracy and speed. The cognitive resources must be divided to control the performance in both tasks. The consequences of "the goal priority trade-off" are that participants wait longer to produce their response in the go trials to make sure the stop signal is not going to appear. This gives rise to an increased number of omissions, as participants produce their response after the maximal interval they are allowed to respond (see **Table 2**). This pattern of results reflects a different (more executive) way to deal with the task in its strategic version (although the different instructions did not affect the stop accuracy or the number of errors in the go trials).

Second, the results replicated those of Padilla et al. (2013) in showing that the physically active participants achieved better inhibitory control (shorter SSRTs) than the passive ones, but only under the strategic condition of the SST. It is important to note that both groups of participants in the strategic condition made the same number of errors as the groups in the standard condition, but only the active participants in the strategic condition were faster at withholding their responses in the stop signal trials compared to the remaining groups. Also, active participants in the strategic version of the task inhibited their responses faster than active participants in the standard condition, but this difference was not observed in passive participants. This is consistent with the findings by Pérez et al. (submitted) showing a relatively specific relationship between exercise and executive attention when using the ANT task (Fan et al., 2002). Since active and passive participants showed similar go RTs within each version of the SST (strategic and standard), the two groups did not differ in terms of general speed of processing (Salthouse, 1996), which means that the benefit induced by chronic exercise is specific to the inhibition process.

Third, the results also revealed higher WMC in active compared to passive participants, as we expected. A recent review of the effects of acute aerobic exercise on working memory in young adults (Verburgh et al., 2013) revealed a very small effect size ($d = 0.05$); we, however, obtained a medium effect size ($d = 0.556$). Importantly, the results showed that controlling for WMC as a covariate reduced the group differences in inhibition to the point that the group effect on inhibition (active group in the strategic condition) no longer reached statistical significance.

However, although WMC explains a percentage of the inhibition variance (strategic SSRT), this did not amount to a strong relationship, since the regression between WMC and SSRT did not reach significance. It could be argued that this might be linked to the size of our sample. Nonetheless, using a considerably larger sample ($n = 262$), Wilhelm et al. (2013) did not find correlations either between tasks that assess inhibition and interference control, such as Flanker and Simon tasks, and those that evaluate WM, although they did find a high correlation between updating, complex-span, and binding tasks. One of the possible explanations that Wilhelm et al. (2013) raised for the findings of Engle's group showing correlations between WMC and inhibition is the use of the extreme-groups method, which removes most of the variability from the group and in turn increases the likelihood of finding correlations between WM and inhibition (Preacher et al., 2005). Here, it is worth mentioning that other studies that did not apply the extreme-groups technique did not find correlations between WM and inhibition (Friedman and Miyake, 2004; Hofmann et al., 2009). It is therefore likely that inhibition and WMC show some processing overlap and support each other, but they are also independent, and neither seems to be the sole cause of the other.

Our results are consistent with the suggestion by Davidson et al. (2006) and Zanto et al. (2011) that WMC and inhibition, although independent, are interrelated and work together. These authors suggest that WM supports inhibitory control by holding the task goal in mind. Focusing on the task decreases the probability of interference from irrelevant stimuli. At the same time, inhibitory control supports WM in different ways, for example, by preventing the retrieval of related but unwanted memories or by avoiding the emergence of distractors. If this kind of information is not inhibited, it may result in mind-wandering (Diamond, 2013).

Roberts and Pennington (1996; also see Nyberg et al., 2009) attempted to understand the interaction between WM and inhibition attending to the premise that they are independent processes sharing limited resources. They suggest that inhibitory performance results from a dynamic interaction among one's WMC, the strength of the competing prepotencies or inhibitory task demands, and the WM task demands. It is only when demands on inhibition and/or WM reach high enough levels that this competition for common limited resources, or resource sharing, takes place. The individual differences in capability will therefore only be observed when the task demands are high (when they exceed a certain threshold). The implication of this is that "tasks that require both (WM and inhibition) are more likely to tax the PFC, although tasks that have a very high demand for either are also hypothesized to be prefrontal tasks" (Pennington and Ozonoff, 1996, p. 338).

The pattern of results observed in our studies is well explained under this model. We did not find any difference between active and passive participants in the standard condition of the SST (nor did we in the Padilla et al., 2013 study), given that the attentional and WM demands are relatively low. However, the groups differed in the more executive version of the SST, where the physically active group showed better inhibitory control. That the standard version of the SST does not require a great deal of attentional or WM resources is supported by the finding by Yamaguchi et al. (2012) that response inhibition does not suffer from dual-task interference. Since active participants present with higher WMC, this enables them to deal with the higher WM demands of the strategic version of the SST. However, when little WMC is required by the task, as is the case in the standard version of the SST, the higher WMC observed in the active participants confers no advantage for the inhibitory process, resulting in no differences in inhibition between active and passive participants.

We emphasize that our study focused on the effect of chronic aerobic exercise, as opposed to the acute aerobic exercise examined in past studies with young adults (e.g., Huertas et al., 2011). Few studies have examined the effects of chronic exercise in this age group (Verburgh et al., 2013), given that cognitive functions are at their maximum level at this period of life (Salthouse and Davis, 2006) and are therefore less likely to be affected by exercise interventions, as a ceiling effect may be observed. Most of the studies with young adults have failed to demonstrate differences between active and passive participants with behavioral data. Those that have found an effect of exercise have used psychophysiological measures, demonstrating different patterns of brain activation. For instance, Hillman et al. (2006) and Themanson and Hillman (2006) revealed differences in the P3 component between young active and passive adults using the task-switching paradigm. Themanson and Hillman (2006) and Themanson et al. (2006) showed a lower error-related negativity amplitude (ERN or Ne) and a higher error positivity (Pe) in the active group. Nevertheless, further studies using neuroimaging techniques are necessary to elucidate the positive effects of aerobic exercise on brain structure and connectivity in this age group. For this reason, we used a strict selection criterion for the recruitment of participants, with active participants having practiced aerobic exercise for at least 10 years with a frequency of at least 6 h a week. We also decided to apply a more strategic task to deal with the likelihood of a ceiling effect.

Acute and chronic exercise have different effects on the brain. Acute exercise lasts from 10 to 40 min, and the cognitive tasks may be applied during or after the aerobic exercise. Chronic exercise, instead, ranges from training periods of 3–6 weeks (Griffin et al., 2011), 6 months, or 1 year, up to 10 years in our case. Acute exercise is related to an increase in brain blood flow, as well as in the plasma levels of vasopressin, β-endorphin, catecholamines, and adrenocorticotropic hormone (Chmura et al., 1994; McMorris et al., 2008), which are thought to reflect neurotransmitter levels in the brain and to lead to an elevated arousal that would enhance cognitive performance. A recent meta-analysis (Verburgh et al., 2013) found a moderate positive effect of acute exercise ($d = 0.52$) on executive functions in children, adolescents, and young adults, which was more pronounced in inhibition/control processes than in working memory tasks. It is worth noting that these effects are temporary, since the tasks are applied during or just after the exercise. Chronic exercise, by contrast, is accompanied by more permanent physiological changes in the brain, such as the formation and extension of new vessels, which improve brain perfusion. Neurogenesis and the release of neurotrophic factors also take place, increasing the chances of neural growth and survival, which affects learning and memory capabilities (Voss et al., 2011). Larger brain volumes have been shown in active children (Chaddock et al., 2011) and old adults (Colcombe et al., 2006), while there is a lack of neuroimaging research in young adults. Chronic exercise is therefore more likely than acute exercise to build cognitive reserve (Ahlskog et al., 2011; Smith et al., 2013), as several studies with healthy children or old adults have suggested (Tomporowski et al., 2008b; Howie and Pate, 2012; Voss et al., 2013). These interventions have shown promising results for combating mental disorders such as attention deficit hyperactivity disorder (ADHD) (Gapin et al., 2011) or dementia (Ahlskog et al., 2011; Smith et al., 2013).

Regarding the cognitive processes measured in our study, previous results (see Hillman et al., 2008 for a review) have shown that long-term high cardiovascular fitness gives rise to significant volumetric and functional improvements particularly in prefrontal areas, which underpin inhibition and executive processes. Recent research has revealed for example that gray matter volume of the right inferior frontal gyrus mediates the relationship between higher cardiovascular fitness and interference control in the Stroop task in older adults (Weinstein et al., 2012). It is therefore possible that the functional network that supports inhibitory mechanisms is preferentially boosted by the cardiovascular effects of exercise, which is consistent with the pattern of results observed in our studies revealing a relatively specific effect of chronic exercise on tasks requiring inhibitory control (Padilla et al., 2013; Pérez et al., submitted).

The results from the current study show that inhibition and WM can be potentiated by the chronic practice of physical exercise, which can be defined as a kind of exercise performed under a high frequency and long-term routine; in comparison with individuals who have a very sedentary lifestyle.

Our results also suggest that inhibition and WM are independent processes, but dependent on a limited shared capacity. This capacity is the quantity of information that can be held active, and that makes us self-aware. WM and inhibition processes are necessary to carry out a goal-directed task. However, in most cases, inhibition processes depend on WM, since it is crucial to keep in mind what must be inhibited (executive processes are "superordinate" in relation to inhibition, Nyberg et al., 2009).

Concerning our experimental design, further cross-sectional studies that explore the influence of long-term aerobic exercise on cognition, brain function, and structure, along with cognitive reserve, will be necessary. Most studies carried out under the label of chronic exercise do not span more than 1 year and do not explore the effects once the intervention has finished. The present design offers a better way to evaluate how exercise gives rise to cognitive reserve, since it captures the truly chronic exercise that is performed throughout life, although not all variables can be controlled for.

Future research should establish how different ranges of physical activity, in terms of frequency and years of aerobic exercise, affect cognitive performance, brain volume, or connectivity, rather than choosing these ranges arbitrarily. For example, studies could differentiate between acute, short-term, and long-term interventions.

Moreover, more executive tasks are recommended to challenge executive functions in a way that inhibition and WMC demands are high enough to see the benefits of exercise in young populations, as we have demonstrated in our study. Neuroimaging studies would also be required to establish the functional and structural brain changes produced by chronic aerobic exercise in young populations.

To conclude, the present study has demonstrated that chronic aerobic exercise benefits not only physical, but also cognitive functions across the lifespan.

# **AUTHOR CONTRIBUTIONS**

Pilar Andrés and Concepción Padilla designed and planned the study, analyzed, and interpreted the data and wrote the manuscript. Laura Pérez collaborated in the design and planning of the study and helped in data collection.

#### **ACKNOWLEDGMENTS**

We would like to thank Fabrice B. R. Parmentier and Gillian Cooke for their help and comments on our manuscript. This study was part of Concepción Padilla's doctoral thesis. Concepción Padilla, Laura Pérez, and Pilar Andrés were supported by a research grant from the Spanish Ministry of Science and Innovation (PSI2010-21609-C02-02). Concepción Padilla was granted an FPI predoctoral studentship from the Spanish Ministry of Economy and Competitiveness (BES-2011-043565).

### **REFERENCES**


event-related brain potentials in a task switching paradigm. *Int. J. Psychophysiol.* 59, 30–39. doi: 10.1016/j.ijpsycho.2005.04.009


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 07 November 2013; accepted: 31 January 2014; published online: 11 March 2014.*

*Citation: Padilla C, Pérez L and Andrés P (2014) Chronic exercise keeps working memory and inhibitory capacities fit. Front. Behav. Neurosci. 8:49. doi: 10.3389/fnbeh.2014.00049*

*This article was submitted to the journal Frontiers in Behavioral Neuroscience. Copyright © 2014 Padilla, Pérez and Andrés. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# The effects of long-term resistance exercise on the relationship between neurocognitive performance and GH, IGF-1, and homocysteine levels in the elderly

#### **Chia-Liang Tsai <sup>1</sup>\*, Chun-Hao Wang<sup>1</sup> , Chien-Yu Pan<sup>2</sup> and Fu-Chen Chen<sup>3</sup>**

<sup>1</sup> Lab of Cognitive Neurophysiology, Institute of Physical Education, Health and Leisure Studies, National Cheng Kung University, Tainan, Taiwan

<sup>2</sup> Department of Physical Education, National Kaohsiung Normal University, Tainan, Taiwan

<sup>3</sup> Department of Recreation Sport and Health Promotion, National Pingtung University of Science and Technology, Pingtung, Taiwan

#### **Edited by:**

Lynne Ann Barker, Sheffield Hallam University, UK

#### **Reviewed by:**

Anna Maria Colangelo, University of Milano-Bicocca, Italy

Sue McHale, Sheffield Hallam University, UK

#### **\*Correspondence:**

Chia-Liang Tsai, Lab of Cognitive Neurophysiology, Institute of Physical Education, Health and Leisure Studies, National Cheng Kung University, NO. 1, University Road, Tainan City 701, Taiwan e-mail: andytsai@mail.ncku.edu.tw

This study aimed to investigate the effects of a long-term resistance exercise intervention on executive functions in healthy elderly males, and to further understand the potential neurophysiological mechanisms mediating the changes. The study assessed forty-eight healthy elderly males randomly assigned to exercise (n = 24) or control (n = 24) groups. The assessment included neuropsychological and neuroelectric measures during a variant of the oddball task paradigm, as well as growth hormone (GH), insulin-like growth factor-1 (IGF-1), and homocysteine levels at baseline and after either a 12 month intervention of resistance exercise training or control period. The results showed that the control group had a significantly lower accuracy rate and smaller P3a and P3b amplitudes in the oddball condition after 12 months. The exercise group exhibited improved reaction times (RTs), sustained P3a and P3b amplitudes, increased levels of serum IGF-1, and decreased levels of serum homocysteine. The changes in IGF-1 levels were significantly correlated with the changes in RT and P3b amplitude of the oddball condition in the exercise group. In conclusion, significantly enhanced serum IGF-1 levels after 12 months of resistance exercise were inversely correlated with neurocognitive decline in the elderly. These findings suggest that regular resistance exercise might be a promising strategy to attenuate the trajectory of cognitive aging in healthy elderly individuals, possibly mediated by IGF-1.

**Keywords: resistance exercise, aging, cognition, IGF-1, GH, homocysteine, neuroelectric**

# **INTRODUCTION**

While life expectancy has been increasing in developed countries, one risk associated with a rapid growth in aged populations is that the number of people suffering from cognitive impairments and dementia could increase, due to age-related deteriorations in an array of cognitive processes involving central executive functions, attention, and short- and long-term memory (Anderson and McConnell, 2007). This can be attributed to gradual declines in physical activity levels (Kimura et al., 2013), the structural integrity of the brain (e.g., frontal, parietal, and temporal lobes) (Jernigan et al., 2001; Raz et al., 2004), and the secretion of growth factors (e.g., growth hormone (GH) and insulin-like growth factor-1 (IGF-1)) from the neurochemical system (Sonntag et al., 2005), which inevitably occur with aging. Cognitive impairment is closely related to decreases in both living-independence and general health. Therefore, determining how to counteract neurocognitive decline in order to reduce the costs associated with geriatric care is becoming an important issue for many public health systems around the world.

Recently, researchers have focused on the role of the GH/IGF-1 axis in attenuating age-related neurocognitive decline. Elderly individuals experience a fall in the ability to secrete GH, and a subsequent decrease in its secondary mediator (i.e., IGF-1), which is primarily produced in the liver (90%) but also by other cell types in the brain and vasculature (Sonntag et al., 2005). Both substances cross the blood-brain barrier and bind to receptors in the central nervous system to stimulate the growth of glial cells, myelination, and neurons: therefore, age-related decreases in serum GH and IGF-1 levels might contribute to cognitive decline in the elderly (Sonntag et al., 2005). A number of studies have attempted to understand the relationship between neuropsychological test performance and serum GH and IGF-1 levels in the elderly. Although some argue that there is no relationship between serum IGF-1 and GH concentrations and specific neurocognitive functions (e.g., GH vs. event-related potential (ERP) N2b component; IGF-1 vs. delayed recall and Brus reading) (Papadakis et al., 1995; Aleman et al., 1999; Quik et al., 2012), there is growing evidence for a significant association between neuropsychological measures (e.g., GH vs. reaction times (RTs) in selective attention and short-term memory; IGF-1 vs. speed of information processing) and serum IGF-1 and GH levels in elderly populations (Rollero et al., 1998; Aleman et al., 1999; Kalmijn et al., 2000; Dik et al., 2003; Quik et al., 2012). These cross-sectional studies, however, do not provide sufficient evidence for effective interventions to prevent or reduce the risk of cognitive decline in the elderly.
The inconclusive findings of such studies regarding the relationship between serum GH and IGF-1 levels and age-related neurocognitive decline in healthy elderly individuals mean that more research needs to be carried out that uses long-term exercise interventions to clarify the role of growth factors in this phenomenon.

Exercise is a lifestyle factor crucial to the prevention or delayed onset of mild cognitive impairment in later life (Smith et al., 2013). The potential mechanisms for this effect include exercise-induced increases in the levels of nerve growth factors, such as GH and IGF-1, as these play central roles in the health of neurons in the brain. Previous studies investigating the effect of resistance training on cognition mostly discussed the potential mechanisms using GH, IGF-1, and homocysteine, possibly due to the fact that these biomarker secretions are exercise-sensitive, since they are characterized by different physiological and metabolic demands (Cassilhas et al., 2007, 2010; Seo et al., 2010). However, several experimental studies suggest that basal levels of serum IGF-1 (Singh et al., 1999; Cassilhas et al., 2007, 2010) and GH (Seo et al., 2010) in the elderly are only affected by long-term resistance training at moderate and high intensities.

A growing number of studies strongly support the beneficial effects of resistance exercise on various aspects of cognitive performance, such as on executive control, memory, attention, and mini mental state examination (MMSE) score (Perrig-Chiello et al., 1998; Ozkaya et al., 2005; Cassilhas et al., 2007; Liu-Ambrose et al., 2010). In contrast, Kimura et al. (2013) and Venturelli et al. (2010) failed to replicate these results, reporting that 3 months of resistance training did not significantly improve cognitive performance (e.g., on MMSE and task switching) in healthy elderly populations. These inconsistent results can be attributed to differences in the forms of resistance exercise that were prescribed, and the neuropsychological measures that were assessed. The kinds of resistance exercise that successfully enhanced cognition in elderly subjects included moderate to high intensity training performed two or three times per week. Additionally, different cognitive functions require different training periods: for example, short-term (e.g., 8 weeks) resistance training can only benefit short-term memory (Perrig-Chiello et al., 1998), whereas long-term (e.g., 12 months) resistance training is better with regard to enhancing long-term memory and executive functions (Liu-Ambrose et al., 2010).

Executive functions are susceptible to senescence (Alvarez and Emory, 2006), as the neural networks involved in these are subject to age-related atrophy (West et al., 2010; O'Connell et al., 2012). The oddball task is a cognitive test extensively used by previous studies to examine the effects of aging on executive functions involving information processing. The performance of ERPs during the oddball task is a sensitive index of the changes in neural activity related to cognition that are associated with aging, with the P300s (e.g., P3a and P3b) being the most sensitive biomarkers for normal aging and age-related pathology among the neuroelectric markers elicited during this task (Pontifex et al., 2009; West et al., 2010; O'Connell et al., 2012). Since neuronal loss in the cerebral cortex and the cerebral white matter commences during the third decade of life (Jernigan et al., 2001), this causes a linear decrease in the amplitude of the potentials, coupled with a marked anterior shift in the topographical orientation of the P300 components observed in the elderly when performing the oddball task (West et al., 2010; Richardson et al., 2011; O'Connell et al., 2012). Fortunately, executive functions are more strongly affected by physical activity or exercise than other aspects of cognitive functioning (Colcombe and Kramer, 2003). A number of studies have demonstrated that active lifestyles in the elderly can enhance performance on the oddball task. For example, fitter elderly subjects showed shorter RTs and larger P3b amplitudes compared with less-fit age-matched counterparts (Pontifex et al., 2009). Similarly, physically active elderly subjects with habitual moderate exercise demonstrated better performance on the oddball task, such as faster RTs and larger P3 amplitudes (McDowell et al., 2003; Hatta et al., 2005). Exercise intervention could thus elicit training effects on the performance of such cognitive tasks in the elderly.

Previous studies on exercise and brain aging have intensively examined neuropsychological (i.e., behavioral and psychomotor) measures. However, thus far, research has not yet considered the effects of long-term resistance exercise on neuroelectric performance, nor has it explored the potential neurophysiological mechanisms impacting performance of the visual oddball task in healthy elderly individuals. Therefore, the aims of this study were as follows: (1) to investigate the effects of a 12 month resistance exercise intervention on executive functions, with respect to age-related effects on neuropsychological and neuroelectric performance elicited during the visual oddball task; and (2) to further explore whether changes in basal levels of serum GH, IGF-1, and homocysteine relate to the effects of long-term resistance exercise on executive functions. Based on the above review of research on cognition and neurobiology, the current study examined the following hypotheses: (1) long-term resistance exercise can attenuate or minimize certain aspects of the progression of cognitive degeneration in healthy elderly individuals when performing an oddball task designed to elicit P3a and P3b components; and (2) exercise-induced changes in growth factors can act as potential mechanisms for this effect. We believe the combination of neuropsychological and neuroelectric measures, with biochemical measures obtained during long-term exercise intervention, can provide more compelling evidence and deeper insights into the nature of the effects of exercise on attenuation or prevention of the cognitive decline associated with aging.

# **METHODS**

#### **SUBJECTS**

This study recruited forty-eight elderly males (aged 65–79 years, mean 71.40 ± 3.79) from senior community centers in rural areas of Taiwan, with these subjects being more sedentary and less educated than their city-dwelling counterparts, and thus presumably more responsive to exercise training. We only recruited male subjects because the effects of exercise on endocrinological responses may be gender-dependent (Baker et al., 2010). Subjects were recruited with the use of print advertisements and underwent screening by a standardized telephone interview. They then underwent a medical examination, including heart rate and blood pressure measurements, electrocardiography, routine laboratory testing, a standardized neurological examination, and a structured interview on their previous medical histories. The medical exam ascertained whether subjects were free of a history of brain injury, liver and kidney dysfunction, severe medical conditions affecting the dopaminergic system, neurological disorders, and psychiatric illnesses, such as depressive symptoms [defined by scores above 13 on the Beck Depression Inventory, second edition (BDI-II)], dementia, and mild cognitive impairment (defined by scores below 26 on the MMSE) (Ruscheweyh et al., 2011). The Edinburgh Handedness Inventory assessed all subjects as right-handed (Oldfield, 1971). The study obtained written informed consent from all participants, and was approved by the Institutional Ethics Committee of National Cheng Kung University.

#### **STUDY PROCEDURE**

**Figure 1** presents the procedure of this study. The original cohort consisted of 60 subjects. Based on the assessment of two physicians specializing in geriatric care and physiotherapy, three subjects were excluded due to incomplete individual data, three due to high blood pressure, and six due to musculoskeletal problems, neurological disorders, or psychiatric illness (e.g., scores above 13 on the BDI-II or below 26 on the MMSE), leaving a total of 48 subjects for the present study. Participants also completed the 7-day physical activity recall questionnaire (7-day PAR; Sallis et al., 1985) to ascertain their previous levels of physical activity, in order to provide sufficient pre-activity screening to lower potential risk factors prior to long-term resistance training. The 48 participants were randomized to the exercise group (i.e., resistance exercise intervention) or control group after matching for age and baseline level of physical activity. The two groups did not significantly differ at baseline in any of the demographic characteristics, including years of formal education, body mass index, years of smoking, MMSE, BDI-II, or systolic and diastolic pressure (see **Table 1**).

Two certified fitness instructors completed all assessments of one repetition maximum (1-RM) and peak muscle power for each participant within 1 week of the completion of the baseline evaluation. All the participants in the exercise group were familiarized with the use of free weights and bodybuilding machines before the formal resistance exercise program. On a separate day during the week after the baseline evaluation, the subjects had blood withdrawn between 8:30 and 9:30 AM following overnight fasting, and then performed a cognitive task test with concomitant neuroelectric recording (i.e., ERPs).

After 12 months, the participants completed the same questionnaires, had blood withdrawn, and received measures of neurocognitive parameters over a period of 1 week.

#### **COGNITIVE TASK AND EXPERIMENTAL PROCEDURE**

**Figure 1 | Flow chart of the study procedure.**

All stimuli (oddball, standard, and novel) were displayed in white against a black background on a laptop computer monitor. Pontifex et al. (2009) suggested that the three-stimulus oddball task has a higher level of cognitive difficulty with regard to stimulus discrimination among the elderly. The oddball stimulus was the geometric figure "◦", and the standard stimulus was the geometric figure "". The novel stimulus category consisted of the following 10 figures: >, <sup>∗</sup>, }, <sup>3</sup>, <sup>4</sup>, <sup>5</sup>, , <sup>⊕</sup>, ♀, ♂, . The center of the computer screen (width = 43 cm), located directly in front of the participant at face level at a distance of approximately 75 cm, displayed the stimulus (4.09° × 4.09°).

The cognitive test (oddball task) paradigm presented the three types of stimuli in different proportions, with 20% oddball, 60% standard, and 20% novel stimuli. The monitor presented each stimulus for 500 ms, followed by a blank screen for 1500 ms. The stimulus would disappear immediately after the participants responded. If the participants did not respond within 2000 ms, the stimulus would disappear, and the program would advance to the next trial. Participants pressed the "M" key in response to the oddball stimulus, and the "B" key in response to the standard or a novel stimulus. Stimuli were presented in a different, random order for each participant. The entire experiment consisted of three blocks of 100 trials, with the order of the stimulus blocks counterbalanced across participants. Participants were instructed to respond as quickly and accurately as possible. The present study adapted the oddball task from West et al. (2010), wherein its capacity to effectively differentiate lower executive functions of older adults from younger adults was demonstrated.

**Table 1** abbreviations: MMSE, mini mental state examination; BDI, Beck Depression Inventory; 7-day PAR, 7-day physical activity recall; 1-RM, one-repetition maximum.
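As a rough illustration of the trial structure of the oddball paradigm described in this section, the following Python sketch builds three randomized blocks of 100 trials with the stated 20/60/20 proportions and scores a key press. It is not the stimulation software used in the study; the function names are hypothetical, and fixing the proportions within each block is an assumption, since the text does not specify how stimuli were distributed across blocks.

```python
import random
from collections import Counter

# Proportions, timings, and response keys are taken from the task description.
PROPORTIONS = {"oddball": 0.20, "standard": 0.60, "novel": 0.20}
STIMULUS_MS = 500   # stimulus duration
BLANK_MS = 1500     # blank screen following the stimulus
RESPONSE_KEYS = {"oddball": "M", "standard": "B", "novel": "B"}


def make_block(n_trials=100, rng=None):
    """Return a shuffled list of stimulus types for one block.

    Assumption: the 20/60/20 split holds within every block.
    """
    rng = rng or random.Random()
    trials = []
    for stimulus, proportion in PROPORTIONS.items():
        trials.extend([stimulus] * round(n_trials * proportion))
    rng.shuffle(trials)
    return trials


def make_session(n_blocks=3, n_trials=100, seed=None):
    """Return a list of blocks, each with its own random trial order."""
    rng = random.Random(seed)
    return [make_block(n_trials, rng) for _ in range(n_blocks)]


def is_correct(stimulus, key_pressed):
    """A response is correct when the key matches the stimulus category."""
    return key_pressed == RESPONSE_KEYS[stimulus]


session = make_session(seed=1)
counts = Counter(stimulus for block in session for stimulus in block)
```

Under these assumptions a full session contains 60 oddball, 180 standard, and 60 novel trials, and each trial lasts 2000 ms when no response is given.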

A trained experimenter blind to group assignment performed the cognitive testing. The experiment was administered in an acoustically shielded room with dimmed lights. On arrival at the laboratory, the experimenter explained the procedure and made sure that the participants were familiar with it. Participants completed 30 practice trials prior to the formal test to ensure they understood the whole process. The electrocap and electrooculographic (EOG) electrodes were attached to the head and face of the participants before the formal test. Each participant was asked to sit comfortably in an adjustable chair in front of a laptop computer display driven by an IBM-compatible personal computer with a stimulation system (Neuroscan Ltd., El Paso, USA). During the test, the experimenter sat next to the participant to monitor visual fixation. The experimenter gave verbal encouragement to look at the screen if they detected the participant's eyes moving away from the central stimulus during the execution of a response. All participants had normal or corrected-to-normal visual acuity and performed the oddball task with simultaneous recording of ERPs.

#### **ELECTROPHYSIOLOGICAL RECORDING AND ANALYSIS**

Electroencephalographic (EEG) activity was recorded from 18 electrode sites (F7, F8, F3, F4, Fz, T3, T4, C3, C4, Cz, T5, T6, P3, P4, Pz, O1, O2, and Oz), using an elastic electrode cap (Quik-Cap, Compumedics Neuroscan, Inc., El Paso, TX) designed for the International 10–20 System. Additional ocular electrodes placed on the supero-lateral right canthus and infero-lateral to the left eye monitored horizontal and vertical EOG (i.e., HEOG and VEOG) activity for eye movements. Scalp locations were referred to linked mastoid electrodes, while a ground electrode was placed on the mid-forehead on the Quik-Cap. All electrode impedances were below 5 kΩ. EEG data acquisition employed an A/D rate of 500 Hz/channel, a band-pass filter of 0.1–50 Hz, and a 60 Hz notch filter, with continuous writing to hard disk for off-line analysis using SCAN4.3 analysis software (Compumedics Neuroscan, Inc., El Paso, USA).

The ERP analysis epochs extracted off-line consisted of segments from −100 ms of pre-stimulus activity to 1000 ms of post-stimulus activity. Trials with a response error or with EEG artifacts (e.g., VEOG, HEOG, and electromyogram) producing peak-to-peak deflections over 100 µV were rejected before averaging. The remaining valid data were assembled according to the three different conditions (i.e., oddball, standard, and novel). Measures of peak amplitude were calculated for two components to quantify the effects of long-term exercise intervention and stimulus type on the ERPs. Since P3a has a more anterior distribution than P3b (O'Connell et al., 2012), West et al. (2010) outlined the following definitions for P3 amplitudes: the novelty P3 amplitude (i.e., P3a) is the major positive deflection over the anterior scalp (F3 and F4), and the most positive point between 300 ms and 400 ms after the stimulus. In contrast, the P3b amplitude is the major positive deflection over the central and posterior scalp (Cz, Pz, and Oz), and the most positive point between 300 ms and 800 ms.
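The epoching, rejection, and peak-picking steps described above can be sketched in a few lines of Python. This is a simplified illustration on synthetic single-channel data, not the SCAN 4.3 pipeline used in the study: the function names are hypothetical, the rejection threshold is applied to the channel itself rather than to separate EOG and electromyogram channels, and rejection of error trials is not modeled.

```python
SFREQ_HZ = 500          # sampling rate stated in the recording section
EPOCH_START_MS = -100   # epochs begin 100 ms before stimulus onset
REJECT_UV = 100.0       # peak-to-peak rejection threshold in microvolts

P3A_WINDOW_MS = (300, 400)   # measured at F3 and F4 in the text
P3B_WINDOW_MS = (300, 800)   # measured at Cz, Pz, and Oz in the text


def ms_to_index(t_ms):
    """Convert a latency relative to stimulus onset into a sample index."""
    return round((t_ms - EPOCH_START_MS) * SFREQ_HZ / 1000)


def is_artifact(epoch_uv):
    """Flag an epoch whose peak-to-peak deflection exceeds the threshold."""
    return max(epoch_uv) - min(epoch_uv) > REJECT_UV


def average_epochs(epochs):
    """Average the artifact-free epochs sample by sample."""
    kept = [epoch for epoch in epochs if not is_artifact(epoch)]
    if not kept:
        raise ValueError("no artifact-free epochs to average")
    return [sum(samples) / len(kept) for samples in zip(*kept)]


def peak_amplitude(erp_uv, window_ms):
    """Return (amplitude, latency_ms) of the most positive point in a window."""
    start, stop = ms_to_index(window_ms[0]), ms_to_index(window_ms[1])
    segment = erp_uv[start:stop + 1]
    amplitude = max(segment)
    peak_index = start + segment.index(amplitude)
    latency_ms = peak_index * 1000 / SFREQ_HZ + EPOCH_START_MS
    return amplitude, latency_ms


# Synthetic data: two flat epochs with a single 8 µV sample at 350 ms,
# and one epoch with a 150 µV swing that should be rejected.
n_samples = ms_to_index(1000) + 1
clean = [0.0] * n_samples
clean[ms_to_index(350)] = 8.0
noisy = [0.0] * n_samples
noisy[10] = 150.0

erp = average_epochs([clean, clean, noisy])
p3a = peak_amplitude(erp, P3A_WINDOW_MS)
```

With a 500 Hz sampling rate each sample spans 2 ms, so an epoch from −100 ms to 1000 ms holds 551 samples, and the 300–400 ms window corresponds to sample indices 200 to 250.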

#### **RESISTANCE TRAINING PRESCRIPTION**

Resistance training classes for the exercise group began within 1 week after initial 1-RM testing. Two certified fitness instructors formally trained and educated by professional physical fitness courses led the classes at a university fitness center. The exercise group was divided into small subgroups of three to six participants. Each training class lasted approximately 60 min, with 10 min of warm-up, 40 min of core content, and 10 min of cool-down. The warm-up included slow-paced walking and active mobility exercises for the joints of the four limbs. The core resistance exercise content implemented a circuit-training schedule with a progressive, high-intensity protocol (Liu-Ambrose et al., 2010). The training circuit consisted of the following exercises in the order stated: biceps curls, leg presses, triceps extensions, hamstring curls, latissimus dorsi pull-downs, calf raises, and seated rowing. The training equipment included bodybuilding machines and free weights. The participants performed the resistance training at an intensity of 75–80% of 1-RM for three sets of 10 repetitions, at an average speed, with a 90-second rest between sets, and a 3 min interval between each apparatus. The load pressed or lifted for each exercise was recorded in each participant's exercise log at every class. As each individual's muscle strength increased, their prescribed training load was also raised to ensure they performed the training at intensity levels corresponding to 75–80% of 1-RM. Such an intensity protocol led to increases in serum IGF-1 levels in elderly subjects in previous studies (Singh et al., 1999; Cassilhas et al., 2007, 2010). During the cool-down period, participants used a variety of relaxation techniques, such as controlled breathing and static stretching exercises (i.e., maintaining maximal muscle elongation for 30 s to increase range of motion).
The fitness instructors recorded adherence, expressed as the percentage of classes attended, in the participants' exercise logs. The attendance rate was above 90% for all participants, with no subjects dropping out of the study.
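The load prescription above amounts to simple arithmetic, which the following Python sketch makes explicit. The function names and the example 1-RM value are hypothetical, and the rule for deciding when to raise the load is an assumption, since the text states only that loads were raised as strength increased.

```python
SETS, REPS = 3, 10                    # three sets of 10 repetitions
INTENSITY_RANGE = (0.75, 0.80)        # 75-80% of 1-RM, as prescribed


def target_load_range(one_rm):
    """Return the (lower, upper) training load, in the same unit as the 1-RM."""
    if one_rm <= 0:
        raise ValueError("1-RM must be positive")
    low, high = INTENSITY_RANGE
    return one_rm * low, one_rm * high


def needs_progression(current_load, one_rm):
    """Assumed rule: raise the load once it falls below 75% of the latest 1-RM."""
    lower, _ = target_load_range(one_rm)
    return current_load < lower


# Hypothetical participant with a 1-RM of 40 kg on one exercise.
lower, upper = target_load_range(40.0)
```

For this hypothetical participant the prescribed load would lie between 30 and 32 kg; if the 1-RM later rose to 44 kg, a logged load of 30 kg would fall below the new lower bound of 33 kg and would be increased.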

The participants in the exercise group were required to participate in 60 min resistance training classes three times a week for a period of 12 months. Participants in the control group received baseline and post-intervention evaluations, but did not receive any specific intervention or group activity, so as to preclude any potential cognitive benefits from social interaction.

#### **SERUM ANALYSIS**

Prior to the two cognitive task tests, a trained phlebotomist withdrew blood from the antecubital vein using an aseptic technique for analysis of serum IGF-1, GH, and homocysteine. The blood was allowed to clot (BD Vacutainer Plus), and then centrifuged at 3000 rpm for 15 min at 4°C (Hettich Mikro 22R, C1110). Each sample was frozen and stored at −80°C for further serum marker assays. Serum values of GH, IGF-1, and homocysteine were determined by a chemiluminescence immunoassay method using an Access Ultrasensitive hGH reagent pack (Beckman Coulter Inc, USA), Liaison IGF-1 reagent (DiaSorin S.P.A., Italy), and Siemens reagents for homocysteine assay (Siemens Healthcare Diagnostics Inc., USA), respectively. The detection limit for GH was 0.002 ng/mL, while that for IGF-1 was 3 ng/mL, and that for homocysteine was 0.50 mmol/L. All the procedures to assess GH, IGF-1, and homocysteine were performed by the same person to avoid inter-operator bias.

#### **STATISTICAL ANALYSIS**

Independent *t*-tests were used to examine the homogeneity of the demographic backgrounds of the subjects in the exercise and control groups. The accuracy rates and correct-trial RTs were submitted separately to a 2 (*Group*: exercise vs. control) × 2 (*Time*: pre-exercise vs. post-exercise) × 3 (*Condition*: oddball vs. standard vs. novel) mixed repeated measures analysis of variance (RM–ANOVA). P3 amplitudes from ERP recordings were submitted separately to a 2 (*Group*: exercise vs. control) × 2 (*Time*: pre-exercise vs. post-exercise) × 3 (*Condition*: oddball vs. standard vs. novel) × 2 or 3 (*Electrode*: F3 vs. F4 for the P3a component; Cz vs. Pz vs. Oz for the P3b component) RM–ANOVA. All biochemical markers were submitted separately to a 2 (*Group*: exercise vs. control) × 2 (*Time*: pre-exercise vs. post-exercise) RM–ANOVA. Appropriate multiple comparisons were performed following any simple main effects. When a significant difference occurred, Bonferroni *post hoc* analyses were performed. The Greenhouse–Geisser (G–G) correction adjusted the significance levels of the *F* ratios whenever RM–ANOVA detected a major violation of the sphericity assumption. Partial eta squared ($\eta_P^2$) was used to calculate effect sizes for significant main effects and interactions, with the following standards used to determine the magnitude of mean effect size: 0.01–0.059 represented a small effect size; 0.06–0.139, a medium effect size; and ≥0.14, a large effect size. Pearson product–moment correlations were used to examine the relationships between changes in the biochemical markers and changes in the cognitive variables. Significance was set at *p* < 0.05 for all analyses.
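For readers who wish to relate the reported *F* ratios to the effect sizes, partial eta squared can be recovered from an *F* value and its degrees of freedom through the standard identity $\eta_P^2 = \dfrac{F \cdot df_{\text{effect}}}{F \cdot df_{\text{effect}} + df_{\text{error}}}$. The Python sketch below applies this identity together with the thresholds given above; the function names are hypothetical, and the "negligible" label for values below 0.01 is an assumption, as the text does not name that range.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Recover partial eta squared from an F ratio and its degrees of freedom."""
    if f_value < 0 or df_effect <= 0 or df_error <= 0:
        raise ValueError("F must be non-negative and both df must be positive")
    return (f_value * df_effect) / (f_value * df_effect + df_error)


def effect_size_label(eta_p2):
    """Classify an effect size using the thresholds stated in the text."""
    if eta_p2 >= 0.14:
        return "large"
    if eta_p2 >= 0.06:
        return "medium"
    if eta_p2 >= 0.01:
        return "small"
    return "negligible"  # assumed label; this range is not named in the text


# Example: an effect with F(2, 92) = 102.72.
example = partial_eta_squared(102.72, 2, 92)
example_label = effect_size_label(example)
```

An effect with *F*(2,92) = 102.72 yields a value of about 0.69, which falls in the large range.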

# **RESULTS**

#### **ACCURACY RATE**

RM–ANOVA performed on the accuracy rates (see **Figure 2A**) highlighted a main effect of *Condition* [*F*(2,92) = 102.72, *p* < 0.001, $\eta_P^2$ = 0.69], with a lower accuracy rate for oddball (88.4%) than standard (97.3%) and novel (96.6%) conditions. The interactions between *Time* × *Group* [*F*(1,46) = 5.34, *p* = 0.025, $\eta_P^2$ = 0.10], *Condition* × *Group* [*F*(2,92) = 3.42, *p* = 0.037, $\eta_P^2$ = 0.07], and *Time* × *Condition* × *Group* [*F*(2,92) = 5.39, *p* = 0.006, $\eta_P^2$ = 0.11] were also significant. *Post hoc* analysis showed a lower accuracy rate for the oddball condition in the control group 12 months after baseline [*t*(23) = 3.14, *p* = 0.005]. The exercise group exhibited a significantly higher accuracy rate in the oddball condition [*t*(46) = 2.77, *p* = 0.008] compared to the control group after 12 months.

#### **REACTION TIME**

As shown in **Figure 2B**, RM–ANOVA conducted on mean RTs revealed a main effect of *Time* [*F*(1,46) = 26.47, *p* < 0.001, $\eta_P^2$ = 0.37], and a main effect of *Condition* [*F*(2,92) = 210.94, *p* < 0.001, $\eta_P^2$ = 0.82], suggesting that RTs were faster after the exercise intervention (490.51 ms) than before it (514.52 ms), and that RTs were faster in the standard condition (448.35 ms) than in both the oddball (529.67 ms) and novel (529.52 ms) conditions. The interactions of *Time* × *Group* [*F*(1,46) = 10.91, *p* = 0.002, $\eta_P^2$ = 0.19], *Time* × *Condition* [*F*(2,92) = 8.35, *p* < 0.001, $\eta_P^2$ = 0.15], and *Time* × *Condition* × *Group* [*F*(2,92) = 4.22, *p* = 0.018, $\eta_P^2$ = 0.08] were also significant. *Post hoc* analysis showed the exercise group responded faster in the oddball [*t*(23) = 5.10, *p* < 0.001] and novel [*t*(23) = 5.27, *p* < 0.001] conditions after exercise intervention compared to baseline. The exercise group only showed significantly faster responses than the control group in the oddball condition [*t*(46) = −3.97, *p* < 0.001] after 12 months.

#### **P3a AMPLITUDE**

As illustrated in **Figure 3**, RM–ANOVA performed on the P3a amplitudes showed a main effect of *Condition* [*F*(2,92) = 7.66, *p* = 0.001, $\eta_P^2$ = 0.14], and a main effect of *Electrode* [*F*(1,46) = 14.64, *p* < 0.001, $\eta_P^2$ = 0.24], suggesting that the P3a amplitude was significantly smaller in the standard condition than in the novel condition, and significantly larger for the F4 electrode than for the F3 electrode. The interactions of *Time* × *Condition* [*F*(2,92) = 3.29, *p* = 0.042, $\eta_P^2$ = 0.07], *Condition* × *Electrode* [*F*(2,92) = 7.54, *p* = 0.001, $\eta_P^2$ = 0.14], *Time* × *Condition* × *Electrode* [*F*(2,92) = 3.90, *p* = 0.024, $\eta_P^2$ = 0.08], and *Time* × *Condition* × *Group* [*F*(2,92) = 3.49, *p* = 0.034, $\eta_P^2$ = 0.07] were also significant. *Post hoc* analysis showed that only the P3a amplitude in the oddball condition [*t*(23) = 3.19, *p* = 0.004] was significantly smaller across all electrodes in the control group after 12 months. The exercise group exhibited significantly larger P3a amplitudes in the oddball condition [*t*(46) = 2.40, *p* = 0.001] across all electrodes compared to the control group after 12 months.

#### **P3b AMPLITUDE**

RM–ANOVA performed on the P3b amplitudes showed a main effect of *Condition* [*F*(2,92) = 4.07, *p* = 0.020, $\eta_P^2$ = 0.08], and a main effect of *Electrode* [*F*(2,92) = 153.23, *p* < 0.001, $\eta_P^2$ = 0.77], suggesting that the P3b amplitude was significantly larger in the oddball condition than in the standard condition, and significantly smaller for the Oz electrode than for the Cz and Pz electrodes. The interactions of *Time* × *Condition* [*F*(2,92) = 4.03, *p* = 0.021, $\eta_P^2$ = 0.08], *Condition* × *Electrode* [*F*(4,184) = 15.61, *p* < 0.001, $\eta_P^2$ = 0.25], *Condition* × *Electrode* × *Group* [*F*(4,184) = 2.98, *p* = 0.021, $\eta_P^2$ = 0.06], and *Time* × *Condition* × *Group* [*F*(2,92) = 3.35, *p* = 0.039, $\eta_P^2$ = 0.07] were also significant. *Post hoc* analysis showed that only P3b amplitude in the oddball condition [*t*(23) = 3.61, *p* = 0.001] was significantly smaller across all electrodes in the control group after 12 months. The difference in P3b amplitude between the two groups in the oddball condition approached significance [*t*(46) = 2.02, *p* = 0.050] across all electrodes after 12 months.

#### **GH**

**Figure 4A** shows the levels of all biochemical markers before the intervention and after 12 months in the exercise and control groups. RM–ANOVA performed on serum GH levels showed no significant main effect of *Group* or *Time* and no significant *Time* × *Group* interaction, indicating that the change in serum GH levels over the 12 months did not differ between the two groups.

#### **IGF-1**

RM–ANOVA performed on serum IGF-1 levels (see **Figure 4B**) showed a significant main effect of *Time* [*F*(1,46) = 5.33, *p* = 0.025, $\eta_P^2$ = 0.10] and a significant *Time* × *Group* interaction [*F*(1,46) = 11.24, *p* = 0.002, $\eta_P^2$ = 0.20]. *Post hoc* analysis showed that serum IGF-1 levels significantly increased in the exercise group after 12 months of resistance exercise. The changes in IGF-1 levels in the exercise group were significantly correlated with the changes in RTs (*r* = −0.47, *p* = 0.020) and P3b amplitude (*r* = 0.52, *p* = 0.009) in the oddball condition. However, no such correlation was found for the accuracy rates (*r* = 0.17, *p* = 0.939) or P3a amplitudes (*r* = 0.32, *p* = 0.123).
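The change-score correlations reported here can be illustrated with a short Python sketch that computes post-minus-pre change scores, a Pearson coefficient, and the *t* statistic on which its significance test is based, $t = r\sqrt{n-2}/\sqrt{1-r^2}$ with $n - 2$ degrees of freedom. The function names and the toy data are hypothetical and are not taken from the study.

```python
import math


def change_scores(pre, post):
    """Return post-minus-pre differences for paired measurements."""
    if len(pre) != len(post):
        raise ValueError("pre and post must have the same length")
    return [after - before for before, after in zip(pre, post)]


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two lists of equal length with at least 3 values")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def r_to_t(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


# Toy data (hypothetical): perfectly linear change scores give r = 1.
toy_r = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])

# A correlation of -0.47 in a group of 24 participants.
t_value = r_to_t(-0.47, 24)
```

For *r* = −0.47 in a group of 24 participants, the statistic is approximately −2.50 with 22 degrees of freedom.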

#### **HOMOCYSTEINE**

As can be seen from **Figure 4C**, RM–ANOVA performed on serum homocysteine levels showed a significant main effect of *Time* [*F*(1,46) = 9.49, *p* = 0.003, $\eta_P^2$ = 0.71] and a significant *Time* × *Group* interaction [*F*(1,46) = 4.85, *p* = 0.033, $\eta_P^2$ = 0.10]. *Post hoc* analysis showed that serum homocysteine levels were only significantly reduced in the exercise group after 12 months of resistance exercise. However, there were no significant correlations between changes in homocysteine levels and changes in behavioral and ERP performance after long-term intervention in the exercise group.

# **DISCUSSION**

This study aimed to investigate whether a 12 month high-intensity resistance exercise intervention could effectively retard a decline in executive functions in healthy elderly males, and to determine the relationship between changes in IGF-1, GH and homocysteine levels and neurocognitive performance (e.g., neuropsychological and neuroelectric components) during an oddball task. The control group displayed a lower accuracy rate and smaller P3a and P3b amplitudes in the oddball condition when performing the oddball task after 12 months. The results for the exercise group showed that long-term high-intensity resistance exercise can decrease RTs and attenuate decreases in P3a and P3b amplitudes during a stimulus discrimination task, as well as increase serum IGF-1 levels and decrease serum homocysteine levels, and that changes in IGF-1 levels were significantly correlated with changes in RTs and P3b amplitudes in the oddball condition. These findings suggest that long-term resistance exercise could be an effective mechanism for attenuating the age-related decreases in neural efficiency in healthy elderly individuals manifested during the oddball task, possibly modulated by increased IGF-1 levels.

#### **NEUROPSYCHOLOGICAL INDEX**

Participants in the control group had significantly lower accuracy rates in the oddball condition after 12 months than at baseline, indicating that older adults exhibit a reduced ability with aging to differentiate between standard and target stimuli during the three-stimulus oddball task. The results of the present study support those of previous research, as healthy individuals aged 55–80 years showed a decrease of 0.21 in mean MMSE score per year, with 22% of individuals showing a decrease of more than 1 point per year (Kalmijn et al., 2000), and elderly individuals demonstrated lower accuracy for novel stimuli (Fabiani and Friedman, 1995). These findings demonstrate that aging is marked by a progressive decline in cognitive functioning, and that information processing is vulnerable to aging. However, the participants in the exercise group did not exhibit a similar trend of decreased performance during the study intervention period, suggesting that these negative effects of aging may be attenuated by regular participation in resistance exercise.

In addition, the exercise group displayed faster RTs in the oddball and novel conditions after the exercise intervention compared with at baseline, and, in particular, this change led to a group difference in the oddball condition after 12 months. These findings suggest that the neuromotor and central processing of cognitive functions used to distinguish the oddball stimulus from a frequent stimulus could be significantly enhanced in elderly males by 12 months of resistance exercise. Indeed, Hatta et al. (2005) found that regular participation in moderate exercise could promote response processing when performing a somatosensory oddball task in elderly adults. In addition, a previous study demonstrated that elderly individuals who are physically active could retain their reaction capacity (Spirduso, 1980). Based on findings from both the present study and previous research, we postulate that high-intensity resistance exercise could facilitate greater temporal efficiency in the central processing of cognitive functions in healthy elderly individuals.

Interestingly, the group differences seen in task accuracy and RT were driven by different patterns of behavioral changes during a one-year period. That is, the accuracy in the oddball condition declined significantly in the control group, while the RT decreased significantly in the exercise group. Previous studies examined the accuracy rates and RTs during completion of an oddball task among subjects of various ages, and found that elderly adults are less accurate relative to middle-aged and younger adults, whereas no statistically significant differences in RTs were observed, showing that aging has different effects on these two outcomes, with the authors suggesting that this may be due to age-related changes in response strategies (Ford and Pfefferbaum, 1985; Iragui et al., 1993). In this study we observed that, relative to baseline performance, the control group exhibited decreased accuracy with maintenance of RT performance. This is probably because the elderly adults in the control group adopted a strategy that favors processing speed over accuracy when performing this type of cognitive task, which diminishes RT delays at the expense of decreased accuracy. Similarly, for the exercise group, the benefits of exercise training were easier to observe with regard to RTs, since these subjects also focused more on speed than accuracy, and this trade-off may thus reduce any benefits with regard to task accuracy.

# **NEUROELECTRIC INDEX**

P300 reflects attentional processes, indexed by two distinct yet related subcomponents of neural processes (Pontifex et al., 2009): the P3a component is elicited by a change in the stimulus environment (e.g., an infrequent or novel non-repeating distractor), with the P3a amplitude reflecting a stimulus-driven, or bottom-up, attentional orienting to a salient but irrelevant stimulus (Polich, 2007; Richardson et al., 2011); the P3b component is elicited by a rare stimulus within a series of frequent irrelevant stimuli (O'Connell et al., 2012), with the P3b amplitude serving as a proposed reflection of the top-down allocation of attentional resources to stimulus evaluation when working memory is updated (Polich, 2007; Verleger, 2008). In the present study, the control group exhibited significantly smaller P3a and P3b amplitudes in the oddball condition after 12 months, indicating age-related decreases in both attentional orienting/engagement of focal attention, and attentional resource allocation and subsequent memory processing (Polich, 2007; Verleger, 2008) in healthy elderly subjects. These findings echo those of a previous study that examined the extent of the decline in attention control efficiency during normal aging, and which suggested that reduced attention control in older adults relative to younger adults, observed in terms of cortical activity, causes greater inefficiency in the tendency to filter out irrelevant information over successive trials (Fabiani et al., 2006).

In contrast, the exercise group sustained P3 amplitudes in the oddball condition over 12 months of high-intensity resistance exercise, resulting in a significant difference in P3 amplitudes between the exercise and control groups. Similarly, Polich and Lardon (1997) demonstrated that very physically active young adults achieved larger P3 amplitudes compared to relatively inactive counterparts when performing the visual oddball task. However, Pontifex et al. (2009) examined P300 components separately, and found that older adults with high cardiorespiratory fitness only exhibited greater P3b amplitudes, with no significant differences in P3a amplitudes relative to controls, suggesting that fitness-related changes in cognitive aging appear specific to attentional processing. In the present study, the exercise group exhibited significantly larger P3a and P3b amplitudes during the stimulus discrimination task, indicating that 12 months of resistance training can simultaneously maintain the capacities for both orienting and allocating attention. The present neuroelectric findings seem to suggest that chronic resistance exercise might modulate, and in some cases potentially reverse, age-related neuronal tissue loss in the brain cortices.

# **NEUROPHYSIOLOGICAL INDEX**

Although GH stimulation can cause an increase in IGF-1 production (Sonntag et al., 2005), the healthy elderly males in the current study showed significant increases in basal serum IGF-1 levels after 12 months of high-intensity resistance exercise, without any accompanying elevation in GH levels from baseline. This result supports previous studies demonstrating that spontaneous GH secretion does not positively correlate with basal IGF-1 levels in older adults (Vermeulen, 1987; Benbassat et al., 1997). In this study, the slight increase in GH levels in the exercise group after long-term exercise does not indicate a significant relationship with changes in neuropsychological and neuroelectric performance. This substantiates previous studies that investigated neurocognitive performance in subjects with GH replacement. For example, Papadakis et al. (1996) found that 6 months of GH treatment did not result in a significant improvement in neuropsychological performance in healthy elderly men with low baseline IGF-1 levels. Golgeli et al. (2004) also reported that 6 months of GH replacement therapy in Sheehan syndrome patients with severe GH deficiency did not significantly affect P3 amplitudes. More recently, Quik et al. (2012) failed to find a relationship between GH levels and N2b amplitudes in healthy males aged 50–78 years during performance of a selection-potential go/no-go task. However, Rollero et al. (1998) observed that although the MMSE score was not associated with basal GH levels or GH peaks after GH-releasing hormone stimulation in elderly subjects, cognitive performance was positively related with total IGF-1 levels.

Despite a decrease in circulating serum IGF-1 levels paralleling a decline in GH pulses in later life (Corpas et al., 1993), the close relationship between them seems not to extend to identical effects on cognitive performance. Indeed, only changes in the IGF-1 levels of the exercise group subjects in the current study correlated significantly with changes in RTs and P3b amplitudes, with such an effect not exhibited by the GH parameter. Previous studies have demonstrated that age-related decreases in serum IGF-1 levels could be a potential mechanism for age-related decline in cognitive functions (e.g., processing speed) in the elderly (Sonntag et al., 2005). However, increased brain uptake of peripheral IGF-1 during exercise could lead to training-induced neuroprotective effects (Carro et al., 2000). That is, IGF-1 is essential for exercise-induced neurogenesis (Carro et al., 2000), and acts to mediate exercise-induced angiogenesis (Lopez-Lopez et al., 2004). In the current study, IGF-1 levels were significantly increased in the exercise group after 12 months of resistance exercise training. The changes in IGF-1 levels significantly correlated with RT performance, reflecting previous research which demonstrated that higher IGF-1 levels in adults with Prader–Willi syndrome are associated with faster temporal memory performance (van Nieuwpoort et al., 2011). The findings of the present study show that increases in IGF-1 levels after long-term resistance exercise can reduce the time needed for central processing of cognitive functions (e.g., RTs) in healthy elderly males. Similarly, Baker et al. (2010) found that aerobic fitness may improve executive control with an increase in IGF-1 in elderly males at risk of cognitive disorder. However, although Aleman et al. (1999) reported no correlation between IGF-1 levels and memory, attention, or fluid intelligence in healthy elderly males aged 65–76 years, they did observe significant associations for serum IGF-1 levels with both perceptual–motor and information processing speed, which is known to decline significantly with aging. Based on our findings and those of previous studies, serum IGF-1 levels could be the key neurophysiological indicator of improved response processing in healthy elderly males after participation in 12 months of high-intensity resistance exercise. However, such a positive effect did not appear to emerge in the neural system during the aging process, and thus increased serum IGF-1 levels do not increase P3a and P3b amplitudes in healthy elderly males.

The current study found a significant negative correlation between the changes in IGF-1 levels and P3b amplitudes for the exercise group subjects in the oddball condition, suggesting that enhanced IGF-1 levels might have decreased the neural system degeneration, and thus enabled them to better distinguish the oddball stimulus from the standard stimulus. However, the changes in IGF-1 levels were only associated with P3b amplitude, not P3a amplitude, indicating that changes in the growth factor were not related to a general change in the attentional system. Serum IGF-1 levels appear to selectively associate with a particular aspect of attention; specifically, IGF-1 might mediate the neural network involved in the top-down allocation of attentional resources, while not affecting the bottom-up allocation used during attentional orienting. Since P3a and P3b might relate to the dopaminergic and locus coeruleus-norepinephrine systems, respectively (Nieuwenhuis et al., 2005; Polich and Criado, 2006), further research is warranted in this area, with a possible focus on examining the potential interactive mechanisms between IGF-1 and these two biochemical systems.
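The change-score correlations discussed above (e.g., between 12-month changes in IGF-1 levels and changes in RT or P3b amplitude) can be illustrated with a minimal sketch. It assumes a simple Pearson correlation on follow-up-minus-baseline differences; this section does not state the exact statistical procedure used in the study. The function names `change_scores` and `pearson_r` are hypothetical, and all numbers are made-up placeholders, not data from the study.

```python
from math import sqrt


def change_scores(baseline, follow_up):
    """Return follow-up minus baseline for each participant."""
    if len(baseline) != len(follow_up):
        raise ValueError("baseline and follow-up must have the same length")
    return [post - pre for pre, post in zip(baseline, follow_up)]


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


# Placeholder values for five hypothetical participants (NOT study data).
igf1_pre = [100.0, 120.0, 90.0, 110.0, 105.0]   # baseline serum IGF-1
igf1_post = [115.0, 125.0, 110.0, 118.0, 130.0]  # IGF-1 after 12 months
rt_pre = [520.0, 540.0, 500.0, 530.0, 515.0]     # baseline RT (ms)
rt_post = [500.0, 535.0, 470.0, 520.0, 480.0]    # RT after 12 months (ms)

d_igf1 = change_scores(igf1_pre, igf1_post)
d_rt = change_scores(rt_pre, rt_post)
r = pearson_r(d_igf1, d_rt)

# A negative r here means larger IGF-1 increases go with larger RT decreases.
print(f"r(delta IGF-1, delta RT) = {r:.3f}")
```

Correlating change scores rather than raw follow-up values keeps each participant as their own baseline, which matches the way the associations are described in the text. A correlation of this kind describes an association only and does not by itself establish that IGF-1 mediates the behavioral change.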

Overall, the evidence presented above suggests that IGF-1 might be an intermediary for the effects of resistance exercise at central levels, despite previous studies reporting that the upregulation of the GH/IGF-1 axis seemed to produce positive effects on cognitive functions. Indeed, Kalmijn et al. (2000) found that higher total serum IGF-1 concentrations at baseline significantly correlated with less cognitive decline in terms of MMSE score over a two-year period in healthy individuals aged 55–88 years. Serum IGF-1 concentrations might reflect an underlying biological process influencing cognitive decline based on the statistically significant correlations obtained between changes in IGF-1 levels and cognitive performance. Collectively, these results indicate that this biochemical agent can attenuate or minimize the progress of certain aspects of cognitive degeneration in healthy elderly individuals. This might be achieved through various central mechanisms, including actions on neurons, cerebrovasculature, and the number of cells expressing c-fos in neurons and glia, possibly because of its transportation to the central nervous system via the hematoencephalic barrier (Sonntag et al., 2005). Additionally, these findings indicate that factors other than GH secretion are involved in the relationship between serum IGF-1 levels and cognitive function decline in the elderly.

High homocysteine levels are a risk factor for cognitive impairment in older adults (Ford et al., 2012). In the present study, serum homocysteine levels were significantly reduced in the exercise group after 12 months of resistance exercise, echoing the findings of a previous study in which serum homocysteine decreased after 6 months of high- or low-intensity resistance exercise (Vincent et al., 2003). Although the potential mechanisms by which resistance training might prevent cognitive decline in the elderly involve homocysteine (Liu-Ambrose and Donaldson, 2009), the present study did not show an association between changes in homocysteine levels and changes in neuropsychological and neuroelectric measures. A possible explanation for this is that the cognitive task adopted in this study related to executive functions, with new research suggesting that high homocysteine levels in elderly adults only decrease performance in tests of immediate and delayed memory, not executive functions (Ford et al., 2013). Although a few studies have demonstrated prolonged potential latencies in P3 amplitude associated with elevated homocysteine levels (Evers et al., 1997; Díaz-Leines et al., 2013), as in all experimental studies with a cross-sectional design, it is difficult to infer causal relationships between serum homocysteine levels and neurocognitive performance in healthy elderly males.

### **STUDY LIMITATIONS**

While the present findings shed light on the beneficial effects of 12 months of high-intensity resistance training on neuropsychological, neuroelectric, and neurophysiological outcomes, there are some limitations that indicate they should be applied with caution. First, we only recruited male elderly adults in the present study to exclude the influences of gender differences in executive control (Rubia et al., 2010), myofiber hypertrophy (Bamman et al., 2003) and endocrine indices (Staron et al., 1994) in response to resistance training. The results may thus not be generalized to female subjects without more work being done. Second, the results of the present experimental design (i.e., a single laboratory-based cognitive task) might be difficult to apply to daily living activities. An additional virtual reality task (e.g., Chaddock et al., 2012) may thus help to assess the potential behavioral benefits of resistance exercise in future investigations.

# **CONCLUSIONS**

In conclusion, increasing the level of physical activity via high-intensity resistance exercise could assist in lowering the rate of age-related neurocognitive decreases in healthy elderly males. In addition, increases in basal IGF-1 levels achieved via such an exercise protocol could have positive effects on both neuropsychological (i.e., RT) and neuroelectric (i.e., P3b amplitude) performance in the elderly. This study's findings imply that healthy elderly individuals who regularly engage in resistance exercise might delay the onset of age-related decline in executive functions, and that this protective effect may be modulated by the growth factor IGF-1.

# **ACKNOWLEDGMENTS**

This research was supported by a grant from the National Science Council in Taiwan (NSC 100-2410-H-006-074-MY2).

# **REFERENCES**


Vermeulen, A. (1987). Nyctohemeral growth hormone profiles in young and aged men: correlation with somatomedin-C levels. *J. Clin. Endocrinol. Metab.* 64, 884–888. doi: 10.1210/jcem-64-5-884

Vincent, K. R., Braith, R. W., Bottiglieri, T., Vincent, H. K., and Lowenthal, D. T. (2003). Homocysteine and lipoprotein levels following resistance training in older adults. *Prev. Cardiol.* 6, 197–203. doi: 10.1111/j.1520-037x.2003.01723.x

West, R., Schwarb, H., and Johnson, B. N. (2010). The influence of age and individual differences in executive function on stimulus processing in the oddball task. *Cortex* 46, 550–563. doi: 10.1016/j.cortex.2009.08.001

**Conflict of Interest Statement**: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 20 November 2014; accepted: 21 January 2015; published online: 10 February 2015*.

*Citation: Tsai C-L, Wang C-H, Pan C-Y and Chen F-C (2015) The effects of long-term resistance exercise on the relationship between neurocognitive performance and GH, IGF-1, and homocysteine levels in the elderly. Front. Behav. Neurosci. 9:23. doi: 10.3389/fnbeh.2015.00023*

*This article was submitted to the journal Frontiers in Behavioral Neuroscience*.

*Copyright © 2015 Tsai, Wang, Pan and Chen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution and reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms*.
